@davecoates let me revisit the problem here to shed some light on it (apologies if you already knew all of what I'm about to explain):
Django's migration system serializes things at a point in time so they can be reconstructed. If you pass a model as an argument to a trigger, it will be serialized like this in the migration:

```python
operations = [migrations.AddTrigger(name="something", model="myapp.MyModel")]
```

The problem with this is that when `MyModel` changes, the serialization of the trigger doesn't change, so the migration system doesn't detect that anything in the trigger has changed.
The best approach is to make the trigger take arguments that will cause the serialization to change. Your approach should work if the kwargs have all of the fields of the model serialized in them. The automatic trigger serialization code doesn't handle classes with `**kwargs` in the `__init__`, which is why you have to fill out this custom `get_init_vals`.
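For illustration, here is a rough sketch of the kind of custom trigger being discussed (the class name, the audit SQL, and the `**kwargs` pass-through are assumptions for the example, not code from this issue):

```python
import pgtrigger


class AuditTrigger(pgtrigger.Trigger):
    """Hypothetical audit trigger whose function body depends on the model's fields."""

    def __init__(self, **kwargs):
        # Because the real arguments are hidden behind **kwargs, the automatic
        # serialization can't record them - which is why the original code has
        # to provide a custom get_init_vals.
        super().__init__(**kwargs)

    def get_func(self, model):
        # The rendered SQL changes whenever the model's fields change, even
        # though nothing about the trigger's __init__ arguments changes.
        cols = ", ".join(f.column for f in model._meta.fields)
        return f"""
            INSERT INTO myapp_audit (table_name, row_data)
            VALUES ('{model._meta.db_table}', to_jsonb(NEW));
            -- tracked columns: {cols}
            RETURN NEW;
        """
```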
I believe there's actually a subtle bug in pghistory that could be affecting you, though, even with your current updated code. The `Snapshot` objects in pghistory don't serialize all the fields by default, so it will lead to this problem unless you explicitly list them.
I don't know the full context of what you're tracking with pghistory, so it's hard to say for sure whether this affects you. If you aren't explicitly listing model fields in the history tracker, I'd recommend setting `PGTRIGGER_INSTALL_ON_MIGRATE` to `True` in your settings for now (see the snippet below) to make sure triggers are properly updated after migrations. My plans to address this are:

1. Serialize `*args` and `**kwargs` for custom triggers automatically. It should be straightforward and would prevent people from having to do serialization work in custom triggers.

Will keep this open for now until I create the proper issues or address it.
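For reference, the stopgap setting mentioned above is a single line in the Django settings module:

```python
# settings.py
# Reinstall/update triggers as part of `manage.py migrate`, so an outdated
# trigger definition is refreshed even when no migration picked up the change.
PGTRIGGER_INSTALL_ON_MIGRATE = True
```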
Just confirmed that there isn't a bug in the snapshot stuff in pghistory. The condition is tracked in the migrations even when no fields are specified, so the trigger continues to be migrated properly.
I still plan to follow up on 1) so that you can use `**kwargs`. In the meantime, an immediate approach you can take is to explicitly list the arguments of the trigger:
```python
def __init__(
    self,
    *,
    name=None,
    operation=None,
    condition=None,
    when=None,
    label=None,
    snapshot=None,
    event_model=None,
    func=None
):
```
I think the above would be a better way of handling it for now, assuming it works for your use case.
Thanks for the explanation - that helps. I'm a bit confused about pghistory `Snapshot` though - won't this suffer from a stale 'insert' trigger if you add a new field to a model? The 'update' trigger will be fine because the `condition` changes, but for insert nothing would change in the arguments passed to `pghistory.trigger.Event`.
I tried with an example from the docs by decorating a model with `@pghistory.track(pghistory.Snapshot("test_model.snapshot"))`. The initial migration was created fine, and after adding a new field the 'update' trigger was dropped and recreated, but the insert trigger was left untouched. Have I missed something here?
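To make the repro concrete, this is roughly what that docs-style example looks like (the model and field names are placeholders, not the actual project code):

```python
import pghistory
from django.db import models


@pghistory.track(pghistory.Snapshot("test_model.snapshot"))
class TestModel(models.Model):
    # Field present in the initial migration.
    name = models.CharField(max_length=64)

    # Adding a field like the one below regenerated the 'update' trigger
    # (its condition depends on the columns) but left the 'insert' trigger
    # untouched, since none of its serialized arguments change.
    # extra = models.IntegerField(default=0)
```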
Apologies if this has progressed into pghistory territory - when I opened this I was thinking about the general case with pgtrigger. As a hypothetical, say there was a bug in the trigger code for `pgtrigger.SoftDelete` that required a change in the string returned from `get_func` - what would be the best way to handle that with regard to generating migrations in projects that were already using the buggy version?
> The initial migration was created fine, and after adding a new field the 'update' trigger was dropped and recreated, but the insert trigger was left untouched. Have I missed something here?

No, you are absolutely correct! Thanks for bringing this to my attention. I'm dedicating next week to pghistory and will plan on addressing this.
> As a hypothetical, say there was a bug in the trigger code for `pgtrigger.SoftDelete` that required a change in the string returned from `get_func` - what would be the best way to handle that with regard to generating migrations in projects that were already using the buggy version?
If I understand, you're effectively asking about the same type of issue uncovered with pghistory, right? That is, if the underlying SQL changes but the class instantiation doesn't change, how does the migration system know to update it?
I think one way for me to solve this is to introduce a `hash` argument that's supplied to the `__init__` of the base trigger class. It's a totally useless argument that users shouldn't supply, but it's there for migration serialization and for the equality operator of those migration objects.

Triggers already have a unique hash that's computed based on the SQL and stored alongside the trigger object (it's how `pgtrigger ls` knows what is outdated), so it's a pretty easy addition that can help make the migration integration more robust. It should also solve the pghistory issue without any substantial changes.
I'm going to go ahead and integrate this into django-pgtrigger and mark this as a bug.
Basically what I'm going to do is design a test case around your use case - a custom trigger that implements `get_func` with no real changes to the `__init__`. When the function changes, it should result in migrations being regenerated using the hash approach I just mentioned.

I'm going to try to do this in a way that doesn't break the API, probably by adjusting the migration operation so that it takes the hash instead of the trigger class taking the hash as an argument. Will see what I'm able to come up with and will let you know!
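As a rough sketch of the idea (not pgtrigger's actual implementation), the hash can simply be derived from the rendered SQL, so any change in `get_func` output changes the value that ends up in the migration:

```python
import hashlib


def compute_trigger_hash(rendered_sql: str) -> str:
    """Illustrative only: derive a stable hash from the trigger's rendered SQL."""
    return hashlib.sha1(rendered_sql.encode("utf-8")).hexdigest()


# If a value like this participates in migration serialization and equality,
# makemigrations detects a change whenever the SQL changes, even when the
# trigger class's __init__ arguments stay exactly the same.
old_sql = "CREATE FUNCTION ... $$ ... old body ... $$"
new_sql = "CREATE FUNCTION ... $$ ... new body ... $$"
assert compute_trigger_hash(old_sql) != compute_trigger_hash(new_sql)
```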
> If I understand, you're effectively asking about the same type of issue uncovered with pghistory, right? That is, if the underlying SQL changes but the class instantiation doesn't change, how does the migration system know to update it?

Yep, exactly that.
> I think one way for me to solve this is to introduce a `hash` argument that's supplied to the `__init__` of the base trigger class. It's a totally useless argument that users shouldn't supply, but it's there for migration serialization and for the equality operator of those migration objects.

That sounds good, and it sounds like it would resolve my issue without needing to implement `get_init_vals`.
> I'm going to try to do this in a way that doesn't break the API, probably by adjusting the migration operation so that it takes the hash instead of the trigger class taking the hash as an argument. Will see what I'm able to come up with and will let you know!

Excellent, thank you
@davecoates this has been fixed in version 4.5. When triggers are serialized to migrations, a different trigger compilation object is serialized instead of the main trigger class. This object is basically just a representation of the raw SQL.

The nice thing about this approach is that the migration changes whenever the underlying SQL changes, regardless of how the trigger class is implemented or instantiated.
One caveat with this approach is that it re-creates all triggers when you make migrations. I didn't have too many ways to get around this, but thankfully I don't see a reason this would need to happen again. You should be able to fake the migration, although it's usually harmless to run it and update all the triggers.
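If you do want to skip re-running the trigger DDL on existing databases, the migration can be faked; a minimal sketch using Django's `call_command`, with a placeholder app label and migration number:

```python
from django.core.management import call_command

# Equivalent to `python manage.py migrate myapp 0042 --fake`: marks the
# migration as applied without executing its operations ("myapp" and "0042"
# are placeholders for the app and migration generated in your project).
call_command("migrate", "myapp", "0042", fake=True)
```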
The pgtrigger test suite dynamically makes models/migrations and updates them to verify things work. Along with this, I tested out this change with a few projects of mine. I serialized triggers the old way, switched versions, and then made migrations again. Things look good, so I'm going to mark this as fixed.
I've briefly tried this and it seems to work great, thanks 👍
With the changes in version 3 that create migrations, I've encountered an issue with a custom trigger class that implements `get_func` to generate the trigger (it's for auditing, so the trigger uses the fields currently defined on the model). If the fields on the model change, then `get_func` will be different, but no migrations will currently be created for it (`pgtrigger ls` will show it as outdated, though). What's the best way to handle this? I've tried the following, which seems to work, but I'm wondering if I should approach it differently or whether it's something pgtrigger should handle internally: