netzkolchose / django-computedfields

Provides autogenerated autoupdated database fields for model methods.
MIT License
94 stars 14 forks source link

Django 4.1.6 and computed fields failing on accessing M2Ms for new instances (no ID?) #124

Closed striveforbest closed 1 year ago

striveforbest commented 1 year ago

[UPDATE] I did find this in your documentation: https://django-computedfields.readthedocs.io/en/latest/examples.html#many-to-many-fields

Which mostly answers the question. Just want to double check if it is the best way to handle this, update all our codebase to handle if not self.pk case. Is there no shortcut for this, like an argument, passed to @computed? It could be cool to have default_if_no_pk param or something.

-- Initial question -- After upgrading Django from 4.0.8 to 4.1.6, I cannot create model instances that have computed fields relying on an M2M relationship. I understand that the underlying issue is Django not being able to access M2M before the instance got the ID but I would think django-computedfields should handle it maybe?

Here is a place in Django's release notes explaining the new behavior: https://docs.djangoproject.com/en/4.1/releases/4.1/#reverse-foreign-key-changes-for-unsaved-model-instances

Here are my models:

class WorkLocation(TimeStampedModel):
    work = models.ForeignKey('Work', models.CASCADE, verbose_name='Work', related_name='locations')

class Work(TimeStampedModel, ComputedFieldsModel):
    #  ...  irrelevant fields, just using the reverse lookup to `WorkLocation`

    @computed(
        models.ForeignKey(to=WorkLocation, null=True, on_delete=models.SET_NULL, related_name='+', verbose_name='Computed Current Location'),
        depends=[
            ['locations', ['location_from', 'location_to', 'is_in_transit']],
        ],
    )
    def computed_current_location(self):
        """
        Computed field that stores most recent location object's id.
        """
        return self.locations.first()

Attempting to create a new Work instance fails with the following exception:

ValueError: 'Work' instance needs to have a primary key value before this relationship can be used.

I can easily replicate it via Django shell:

In [3]: artist = Artist.objects.first()
In [4]: work = Work.objects.create(artist=artist)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 work = Work.objects.create(artist=artist)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/manager.py:85, in BaseManager._get_queryset_methods.<locals>.create_method.<locals>.manager_method(self, *args, **kwargs)
     84 def manager_method(self, *args, **kwargs):
---> 85     return getattr(self.get_queryset(), name)(*args, **kwargs)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/query.py:671, in QuerySet.create(self, **kwargs)
    669 obj = self.model(**kwargs)
    670 self._for_write = True
--> 671 obj.save(force_insert=True, using=self.db)
    672 return obj

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/computedfields/resolver.py:841, in Resolver.precomputed.<locals>.wrap.<locals>._save(instance, *args, **kwargs)
    840 def _save(instance, *args, **kwargs):
--> 841     new_fields = self.update_computedfields(instance, kwargs.get('update_fields'))
    842     if new_fields:
    843         kwargs['update_fields'] = new_fields

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/computedfields/resolver.py:874, in Resolver.update_computedfields(self, instance, update_fields)
    872     update_fields.update(set(cf_mro))
    873 for fieldname in cf_mro:
--> 874     setattr(instance, fieldname, self._compute(instance, model, fieldname))
    875 if update_fields:
    876     return update_fields

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/computedfields/resolver.py:581, in Resolver._compute(self, instance, model, fieldname)
    572 """
    573 Returns the computed field value for ``fieldname``.
    574 Note that this is just a shorthand method for calling the underlying computed
   (...)
    578 to the database, always use ``compute(fieldname)`` instead.
    579 """
    580 field = self._computed_models[model][fieldname]
--> 581 return field._computed['func'](instance)

File ~/projects/gestalt/noya/gagosian/works/models.py:1610, in Work.computed_current_location(self)
   1600 @computed(
   1601     models.ForeignKey(to=WorkLocation, null=True, on_delete=models.SET_NULL, related_name='+', verbose_name='Computed Current Location'),
   1602     depends=[
   (...)
   1605 )
   1606 def computed_current_location(self):
   1607     """
   1608     Computed field that stores most recent location object's id.
   1609     """
-> 1610     return self.locations.first()

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/manager.py:85, in BaseManager._get_queryset_methods.<locals>.create_method.<locals>.manager_method(self, *args, **kwargs)
     84 def manager_method(self, *args, **kwargs):
---> 85     return getattr(self.get_queryset(), name)(*args, **kwargs)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/fields/related_descriptors.py:689, in create_reverse_many_to_one_manager.<locals>.RelatedManager.get_queryset(self)
    684 def get_queryset(self):
    685     # Even if this relation is not to pk, we require still pk value.
    686     # The wish is that the instance has been already saved to DB,
    687     # although having a pk value isn't a guarantee of that.
    688     if self.instance.pk is None:
--> 689         raise ValueError(
    690             f"{self.instance.__class__.__name__!r} instance needs to have a "
    691             f"primary key value before this relationship can be used."
    692         )
    693     try:
    694         return self.instance._prefetched_objects_cache[
    695             self.field.remote_field.get_cache_name()
    696         ]

ValueError: 'Work' instance needs to have a primary key value before this relationship can be used.

Here is a quick link to the Django source code where it now breaks: https://github.com/django/django/blob/4.1.5/django/db/models/fields/related_descriptors.py#L685

striveforbest commented 1 year ago

[UPDATE] Switching logic to return .pk in @computed field instead of the model instance solves the problem, but once again, wanted to double-check that's the correct approach:

    @computed(
        models.ForeignKey(to=WorkStatus, null=True, on_delete=models.SET_NULL, related_name='+', verbose_name='Computed Current Status'),
        depends=[
            ['_statuses', ['status_type']],
        ],
    )
    def computed_current_status(self):
        if self.pk and self._statuses.exists():
            return self._statuses.first().pk

-- Initial question And, while I have you, I can get an answer to another question. I updated all the @computed fields using m2m lookups to handle the no self.pk case, but now I am getting this exception:

TypeError: Field 'id' expected a number but got <WorkStatus: Sold>.

I've been using @computed(models.ForeignKey...) for a very long time, but it stopped working with an upgrade.

Similarly to the example above I have a computed field like this:

class WorkStatus(UpdateRelatedWorkModifiedTimeMixin, TimeStampedModel):
    work = models.ForeignKey('Work', models.CASCADE, verbose_name='Work', related_name='_statuses')

class Work(TimeStampedModel, ComputedFieldsModel):
    #  fields are irrelevant, we're just relying on a reverse lookup to `WorkStatus`

    @computed(
        models.ForeignKey(to=WorkStatus, null=True, on_delete=models.SET_NULL, related_name='+', verbose_name='Computed Current Status'),
        depends=[
            ['_statuses', ['status_type']],
        ],
    )
    def computed_current_status(self):
        if not self.pk:
            return None
        return self._statuses.first()

Here is a full traceback:

In [25]: status = WorkStatus.objects.last()

In [26]: work = Work.objects.last()

In [27]: status.work = work

In [28]: status.save()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/fields/__init__.py:2018, in IntegerField.get_prep_value(self, value)
   2017 try:
-> 2018     return int(value)
   2019 except (TypeError, ValueError) as e:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'WorkStatus'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
Cell In[28], line 1
----> 1 status.save()

File ~/projects/gestalt/noya/gagosian/works/models.py:312, in WorkStatus.save(self, **kwargs)
    310 def save(self, **kwargs):
    311     self.clean()
--> 312     super().save(**kwargs)

File ~/projects/gestalt/noya/gagosian/works/models.py:60, in UpdateRelatedWorkModifiedTimeMixin.save(self, **kwargs)
     59 def save(self, **kwargs):
---> 60     super().save(**kwargs)
     61     if getattr(self, 'work', None):
     62         self.work.save(update_fields=['modified'])

File ~/projects/gestalt/noya/gagosian/common/models.py:36, in TimeStampedModel.save(self, **kwargs)
     34 def save(self, **kwargs):
     35     self.update_modified = kwargs.pop('update_modified', getattr(self, 'update_modified', True))
---> 36     super(TimeStampedModel, self).save(**kwargs)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/base.py:812, in Model.save(self, force_insert, force_update, using, update_fields)
    809     if loaded_fields:
    810         update_fields = frozenset(loaded_fields)
--> 812 self.save_base(
    813     using=using,
    814     force_insert=force_insert,
    815     force_update=force_update,
    816     update_fields=update_fields,
    817 )

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/base.py:878, in Model.save_base(self, raw, force_insert, force_update, using, update_fields)
    876 # Signal that the save is complete
    877 if not meta.auto_created:
--> 878     post_save.send(
    879         sender=origin,
    880         instance=self,
    881         created=(not updated),
    882         update_fields=update_fields,
    883         raw=raw,
    884         using=using,
    885     )

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/dispatch/dispatcher.py:176, in Signal.send(self, sender, **named)
    170 if (
    171     not self.receivers
    172     or self.sender_receivers_cache.get(sender) is NO_RECEIVERS
    173 ):
    174     return []
--> 176 return [
    177     (receiver, receiver(signal=self, sender=sender, **named))
    178     for receiver in self._live_receivers(sender)
    179 ]

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/dispatch/dispatcher.py:177, in <listcomp>(.0)
    170 if (
    171     not self.receivers
    172     or self.sender_receivers_cache.get(sender) is NO_RECEIVERS
    173 ):
    174     return []
    176 return [
--> 177     (receiver, receiver(signal=self, sender=sender, **named))
    178     for receiver in self._live_receivers(sender)
    179 ]

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/computedfields/handlers.py:83, in postsave_handler(sender, instance, **kwargs)
     81 # do not update for fixtures
     82 if not kwargs.get('raw'):
---> 83     active_resolver.update_dependent(
     84         instance, sender, kwargs.get('update_fields'),
     85         old=UPDATE_OLD.pop(instance, None),
     86         update_local=False,
     87         querysize=settings.COMPUTEDFIELDS_QUERYSIZE
     88     )

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/computedfields/resolver.py:475, in Resolver.update_dependent(self, instance, model, update_fields, old, update_local, querysize)
    473 pks_updated: Dict[Type[Model], Set[Any]] = {}
    474 for queryset, fields in updates:
--> 475     _pks = self.bulk_updater(queryset, fields, return_pks=True, querysize=querysize)
    476     if _pks:
    477         pks_updated[queryset.model] = _pks

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/computedfields/resolver.py:556, in Resolver.bulk_updater(self, queryset, update_fields, return_pks, local_only, querysize)
    554             change = []
    555     if change:
--> 556         self._update(queryset, change, fields)
    558 # trigger dependent comp field updates from changed records
    559 # other than before we exit the update tree early, if we have no changes at all
    560 # also cuts the update tree for recursive deps (tree-like)
    561 if not local_only and pks:

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/computedfields/resolver.py:569, in Resolver._update(self, queryset, change, fields)
    567 if self.use_fastupdate:
    568     return fast_update(queryset, change, fields, None)
--> 569 return queryset.model.objects.bulk_update(change, fields)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/manager.py:85, in BaseManager._get_queryset_methods.<locals>.create_method.<locals>.manager_method(self, *args, **kwargs)
     84 def manager_method(self, *args, **kwargs):
---> 85     return getattr(self.get_queryset(), name)(*args, **kwargs)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/query.py:905, in QuerySet.bulk_update(self, objs, fields, batch_size)
    903 with transaction.atomic(using=self.db, savepoint=False):
    904     for pks, update_kwargs in updates:
--> 905         rows_updated += queryset.filter(pk__in=pks).update(**update_kwargs)
    906 return rows_updated

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/query.py:1191, in QuerySet.update(self, **kwargs)
   1189 query.annotations = {}
   1190 with transaction.mark_for_rollback_on_error(using=self.db):
-> 1191     rows = query.get_compiler(self.db).execute_sql(CURSOR)
   1192 self._result_cache = None
   1193 return rows

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/sql/compiler.py:1822, in SQLUpdateCompiler.execute_sql(self, result_type)
   1815 def execute_sql(self, result_type):
   1816     """
   1817     Execute the specified update. Return the number of rows affected by
   1818     the primary update query. The "primary update query" is the first
   1819     non-empty query that is executed. Row counts for any subsequent,
   1820     related queries are not available.
   1821     """
-> 1822     cursor = super().execute_sql(result_type)
   1823     try:
   1824         rows = cursor.rowcount if cursor else 0

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/sql/compiler.py:1385, in SQLCompiler.execute_sql(self, result_type, chunked_fetch, chunk_size)
   1383 result_type = result_type or NO_RESULTS
   1384 try:
-> 1385     sql, params = self.as_sql()
   1386     if not sql:
   1387         raise EmptyResultSet

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/sql/compiler.py:1797, in SQLUpdateCompiler.as_sql(self)
   1795 name = field.column
   1796 if hasattr(val, "as_sql"):
-> 1797     sql, params = self.compile(val)
   1798     values.append("%s = %s" % (qn(name), placeholder % sql))
   1799     update_params.extend(params)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/sql/compiler.py:504, in SQLCompiler.compile(self, node)
    502 vendor_impl = getattr(node, "as_" + self.connection.vendor, None)
    503 if vendor_impl:
--> 504     sql, params = vendor_impl(self, self.connection)
    505 else:
    506     sql, params = node.as_sql(self, self.connection)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/functions/comparison.py:54, in Cast.as_postgresql(self, compiler, connection, **extra_context)
     50 def as_postgresql(self, compiler, connection, **extra_context):
     51     # CAST would be valid too, but the :: shortcut syntax is more readable.
     52     # 'expressions' is wrapped in parentheses in case it's a complex
     53     # expression.
---> 54     return self.as_sql(
     55         compiler,
     56         connection,
     57         template="(%(expressions)s)::%(db_type)s",
     58         **extra_context,
     59     )

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/functions/comparison.py:19, in Cast.as_sql(self, compiler, connection, **extra_context)
     17 def as_sql(self, compiler, connection, **extra_context):
     18     extra_context["db_type"] = self.output_field.cast_db_type(connection)
---> 19     return super().as_sql(compiler, connection, **extra_context)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/expressions.py:939, in Func.as_sql(self, compiler, connection, function, template, arg_joiner, **extra_context)
    937 for arg in self.source_expressions:
    938     try:
--> 939         arg_sql, arg_params = compiler.compile(arg)
    940     except EmptyResultSet:
    941         empty_result_set_value = getattr(
    942             arg, "empty_result_set_value", NotImplemented
    943         )

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/sql/compiler.py:506, in SQLCompiler.compile(self, node)
    504     sql, params = vendor_impl(self, self.connection)
    505 else:
--> 506     sql, params = node.as_sql(self, self.connection)
    507 return sql, params

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/expressions.py:1388, in Case.as_sql(self, compiler, connection, template, case_joiner, **extra_context)
   1386 for case in self.cases:
   1387     try:
-> 1388         case_sql, case_params = compiler.compile(case)
   1389     except EmptyResultSet:
   1390         continue

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/sql/compiler.py:506, in SQLCompiler.compile(self, node)
    504     sql, params = vendor_impl(self, self.connection)
    505 else:
--> 506     sql, params = node.as_sql(self, self.connection)
    507 return sql, params

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/expressions.py:1304, in When.as_sql(self, compiler, connection, template, **extra_context)
   1302 template_params["condition"] = condition_sql
   1303 sql_params.extend(condition_params)
-> 1304 result_sql, result_params = compiler.compile(self.result)
   1305 template_params["result"] = result_sql
   1306 sql_params.extend(result_params)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/sql/compiler.py:506, in SQLCompiler.compile(self, node)
    504     sql, params = vendor_impl(self, self.connection)
    505 else:
--> 506     sql, params = node.as_sql(self, self.connection)
    507 return sql, params

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/expressions.py:998, in Value.as_sql(self, compiler, connection)
    996 if output_field is not None:
    997     if self.for_save:
--> 998         val = output_field.get_db_prep_save(val, connection=connection)
    999     else:
   1000         val = output_field.get_db_prep_value(val, connection=connection)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/fields/related.py:1146, in ForeignKey.get_db_prep_save(self, value, connection)
   1144     return None
   1145 else:
-> 1146     return self.target_field.get_db_prep_save(value, connection=connection)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/fields/__init__.py:925, in Field.get_db_prep_save(self, value, connection)
    923 def get_db_prep_save(self, value, connection):
    924     """Return field's value prepared for saving into a database."""
--> 925     return self.get_db_prep_value(value, connection=connection, prepared=False)

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/fields/__init__.py:2703, in AutoFieldMixin.get_db_prep_value(self, value, connection, prepared)
   2701 def get_db_prep_value(self, value, connection, prepared=False):
   2702     if not prepared:
-> 2703         value = self.get_prep_value(value)
   2704         value = connection.ops.validate_autopk_value(value)
   2705     return value

File ~/venv/noya-4UEHQ2KQ/lib/python3.8/site-packages/django/db/models/fields/__init__.py:2020, in IntegerField.get_prep_value(self, value)
   2018     return int(value)
   2019 except (TypeError, ValueError) as e:
-> 2020     raise e.__class__(
   2021         "Field '%s' expected a number but got %r." % (self.name, value),
   2022     ) from e

TypeError: Field 'id' expected a number but got <WorkStatus: Sold>.
jerch commented 1 year ago

@striveforbest Yes I also stumbled over this change in 4.1, hence my 4.1 docs patch. While I can follow the reasoning behind djangos change, it complicates things with cfs a bit. The idea behind the change boils down to "The instance is not yet in db, thus cannot have a related manager obj yet", while before django would have created an empty reverse manager.

Note that I cannot blueprint this case beforehand in outer cf logic, as cf does not know anything about the concrete function logic (I dont do function inspection, this is really beyond scope here). This means that you have to deal with the edge cases yourself:

Now on how to fix things from old code to 4.1 changes - I pretty much test all edge cases in test code, the only thing I had to patch, was adding a if self.pk condition to all occurrences (plus some default return value instead, see https://github.com/netzkolchose/django-computedfields/commit/b24a1e375e56d85a58d7ff5a8e9a5f4ca0dbd263)

Now to your concrete issues (untested, thus I might have overlooked something):

Hope this helps you to work around the issues, sadly I cannot do much about that new related manager behavior.

striveforbest commented 1 year ago

@jerch thanks for the comprehensive response, as always.

Those were my findings as well, it pretty much all boils down to 2 fixes needed:

Thanks for keeping up the great work!