clokep / django-querysetsequence

Chain multiple (disparate) QuerySets in Django
https://django-querysetsequence.readthedocs.io/
ISC License
108 stars 25 forks source link

DRF breakage #107

Open optiz0r opened 1 month ago

optiz0r commented 1 month ago

I'm just going through an exercise of upgrading an internal app with very old versions to slightly less old and ran into some breakage relating to #88 which I think requires a code change here.

class NotificationGroupSerializer(serializers.ModelSerializer):
    class Meta:
        model = models.NotificationGroup
        exclude = ()

    members = GM2MSerializerField(queryset=autocomplete.QuerySetSequence(User.objects.all(), Group.objects.all()))

This worked fine in django 2.2 and drf 3.9. Having bumped upto django 3.2 and drf 3.15.2 (I'm still working on getting current), openapi schema generation started failing with infinite recursion errors.

Re-reading the QSS docs, I discover I need to specify model when creating the QuerySetSequence for some functionality to work. Changing the above to

    members = GM2MSerializerField(queryset=autocomplete.QuerySetSequence(User.objects.all(), Group.objects.all(), model=User))

Then starts failing with

  File "/home/brobert/.virtualenvs/myapp/lib64/python3.9/site-packages/rest_framework/fields.py", line 652, in <dictcomp>
    key: (copy.deepcopy(value, memo) if (key not in ('validators', 'regex')) else value)
  File "/usr/lib64/python3.9/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib64/python3.9/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib64/python3.9/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python3.9/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python3.9/copy.py", line 161, in deepcopy
    rv = reductor(4)
TypeError: __getstate__() missing 1 required positional argument: 'self'

reductor is a reference to __reduce_ex__. After a bit of digging, I followed the definition of User.__reduce__, which lead me to the the inherited Model.__reduce__ from django's base Model class:

    def __reduce__(self):
        data = self.__getstate__()
        data[DJANGO_VERSION_PICKLE_KEY] = django.__version__
        class_id = self._meta.app_label, self._meta.object_name
        return model_unpickle, (class_id,), data

Meanwhile in #88, QSS' ProxyModel had been changed to:

class ProxyModel:
    """
    Wrapper for generating DoesNotExist exceptions without modifying
    the provided model. This is needed by DRF as the views only handle
    the specific exception by the model.
    """

    def __init__(self, model=None):
        self._model = model

        # Define DoesNotExist for abstract models.
        if self._model and not self._model._meta.abstract:
            self.DoesNotExist = model.DoesNotExist
        else:
            self.DoesNotExist = ObjectDoesNotExist

    def __getattr__(self, name):
        return getattr(self._model, name)

Long story short, this code assumes getattr will only ever be called to get DoesNotExist, and that it's safe to proxy calls for getattr on an instance to the class. I don't fully understand what DRF is doing such that I end up with getattr invoked on the ProxyModel instance, but it appears to have gone wrong because getattr was expecting to be called on an instance of User and actually got invoked on the User class itself.

From the context of #88 I think this __getattr__ override exists solely to return a valid DoesNotExist exception object. In which case a quick hack of the following resolves the crashes:

    def __getattr__(self, name):
        if name == 'DoesNotExist':
            return getattr(self._model, name)
        return super().__getattr__(self, name)

This proxies the request for the DoesNotExist object to the model, and relies on the default behaviour for anything else.

I'm not sure if this is the right thing to do in all cases; can someone more familiar with this codebase than me sanity check?

clokep commented 1 month ago

I can't promise I fully remember that PR from 2 years ago. 😄

This proxies the request for the DoesNotExist object to the model, and relies on the default behaviour for anything else.

I'm not sure if this is the right thing to do in all cases; can someone more familiar with this codebase than me sanity check?

That sounds like it would be reasonable, yes. Would an alternative be to only use the ProxyModel (i.e. a fake model) when the model parameter isn't used? I guess that doesn't really help for the "abstract model" problem.

Anyway -- if you have a unit test for this it would be greatly appreciated.

optiz0r commented 1 month ago

On second look, I'm not sure that __getattr__ is doing anything useful at all? The if statement above unconditionally sets self.DoesNotExist and I think the getattr method was only added to retrieve this from the model if it were not set, but it always is the method is unnecessary. The DRF schema generation also works if the getattr method is just commented out and I've not yet found anything else that's broken from doing so (also now tested on Django 4.2)

optiz0r commented 1 month ago

Scratch my last comment, the existing unit tests make calls to proxy._meta, so this does need to be proxied across to the model. But not all attributes should be, so __getattr__ needs to do something more nuanced here.

I've made a naive assumption that any special methods should be handled by the ProxyModel instance, and anything else can be proxied to the model. This allows the existing unit tests to pass, plus allows DRF/drf-spectacular schema generation to succeed.

I've added some new tests which verify the behaviour of ProxyModel's forwarding of calls to the model type class for special and non-special attribute lookups.

optiz0r commented 2 weeks ago

I've been running the linked PR in production for the last three weeks, and it seems stable.