Open sjoerdjob opened 1 week ago
Hi @sjoerdjob Thanks for the bug report. This looks like an interesting issue 😃
I think that django ticket you linked to is the right one relevant to this. It looks like the maintainers there are considering two options
In the case of (1) I imagine that would fix this issue. For (2) we may have to adjust our code to disconnect the signal at the right time.
Your first suggested improvement is a good idea, but I don't think it will fix this issue.
In the code that is calling post_save.connect, also add 'something'
Do you have any suggestions on what that 'something' should be?
I was going to say we would need to implement __del__
on the metaclass,
but then I found weakref.finalize which might be the right thing to use.
Using weakref.finalize
indeed does seem like the 'something' we might be looking for.
Do note the following comment though:
Note
It is important to ensure that func, args and kwargs do not own any references to obj, either directly or indirectly, since otherwise obj will never be garbage collected. In particular, func should not be a bound method of obj.
In this case, obj
would be the class for which the signal should be disconnected. However, looking at Django's 'disconnect', it seems clear that Django requires the caller to pass the same value for sender
as which was used to register the connect
call.
So the most pressing question here is: During the callback phase of weakref.finalize
, can we still get the class? First of all, we shouldn't keep an actual reference alive, as that would make the class never garbage collected. But we could keep a weakref alive, and hope that when the callback of weakref.finalize
gets called, that weakref could be used to resurrect the model class temporarily, just long enough to call post_save.disconnect
.
The weakref.finalize
is a bit of synthetic sugar around weakref.ref(obj, callback)
, which has the following comment:
If callback is provided and not None, and the returned weakref object is still alive, the callback will be called when the object is about to be finalized; the weak reference object will be passed as the only parameter to the callback; the referent will no longer be available.
Which makes me think that at the time the callback of weakref.finalize
is called, the model class is no longer obtainable, and as such it's too late to call disconnect()
. Initial experimentation shows that the weakref pointing to the class would be 'dead' by the time it's called.
As such, without any changes in Django itself, I think the only way forward would be to create a metaclass that registers a __del__
, which at least should be called when the class is still 'alive'. Just in time.
However, I myself personally don't have fun experiences with extending metaclasses for Django Model classes. There are quite some constraints there. In particular, Python will raise a TypeError
if multiple bases are used which define metaclasses, and there is not a 'most derived' metaclass:
https://docs.python.org/3/reference/datamodel.html#determining-the-appropriate-metaclass
Which makes using metaclasses also a less-than-ideal solution.
Describe the bug While running Django migrations, it can happen that subclasses of
DirtyFieldsMixin
get created (and later unreferenced, and thus destroyed). When a class gets unreferenced, the location that class holds in memory gets cleared up again, and a new Python object could be instantiated in that location. That new Python object might be another model class.To Reproduce
See the attached project for a minimal example. Running
python manage.py migrate
(after having installed the latest version of Django and django-dirtyfields) triggers the issue most of the time.minimal.tar.gz
Expected behavior Migrations always run completely pass.
Actual Behavior
Environment (please complete the following information):
django-dirtyfields version: 1.9.5 and newer.
Additional context
A related Django issue is https://code.djangoproject.com/ticket/35801#comment:6 . I'm not sure if either of the codebases can fix this bug separately though, or whether it requires changes in both projects.
Suggested improvements:
registered_post_save_signal
(or something) to make surepost_save.connect
is only called once per model class.post_save.connect
, also add 'something' that makes sure thatpost_save.disconnect(reset_state, sender=self.__class__, dispatch_uid=...)
gets called whenever the model class (not instance) gets unreferenced.