PaulGilmartin / django-pgpubsub

A distributed task processing framework for Django built on top of the Postgres NOTIFY/LISTEN protocol.

Ability to process notifications with backward incompatible changes in django models #38

Closed romank0 closed 1 year ago

romank0 commented 1 year ago

The Issue

Consider this scenario:

  1. Listener is not running.
  2. Model A has a TriggerChannel defined.
  3. An instance of A is updated and a Notification is stored.
  4. A field F is added to model A, together with a migration that populates it for every existing row. Once the migration is applied, the invariant holds that every entity has a value in F.
  5. The listener is started.
  6. The listener deserializes the entity stored in the notification, and F is empty.
  7. The listener invokes code that assumes the field is populated, but it is empty, which may cause all sorts of problems.
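
The scenario above can be reproduced in miniature without Django. This is only a sketch: the dataclass `A`, the `deserialize` helper and the `listener` function are hypothetical stand-ins for the model, the notification deserialization and the listener code, not the actual django-pgpubsub implementation.

```python
import json
from dataclasses import dataclass


# Hypothetical stand-in for model A *after* the migration that added field F.
@dataclass
class A:
    id: int
    name: str
    f: str  # new field; the migration guarantees it is non-empty in the DB


# Payload serialized before the migration ran, so it has no "f" key.
stale_payload = json.dumps({"id": 1, "name": "first"})


def deserialize(payload: str) -> A:
    data = json.loads(payload)
    # A missing key falls back to an empty value, mimicking step 6.
    return A(id=data["id"], name=data["name"], f=data.get("f", ""))


def listener(entity: A) -> str:
    # Listener code written against the post-migration invariant (step 7).
    if not entity.f:
        raise ValueError("F is empty although the migration populated it")
    return entity.f.upper()


entity = deserialize(stale_payload)
try:
    listener(entity)
except ValueError as exc:
    print(f"listener failed: {exc}")
```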

There are other possible cases where a migration breaks the listener, for example a change of a field's type.

This becomes a particular problem if the listener is down, or for some reason cannot process notifications, for a prolonged time.

Possible solutions

Read the latest entity if the DB version has changed

When the entity is serialized, the DB version (essentially the number of the latest Django migration applied) is stored in the notification. During deserialization, if the stored DB version differs from the current one, we read the latest entity from the DB by id instead of deserializing the JSON. It might make sense to pass some metadata to the listeners indicating that the latest version was retrieved. The entity may also have been deleted in the meantime; it might make sense to add this to the metadata as well.
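
A rough sketch of this option, using plain Python instead of Django. Everything here is an assumption made for illustration: `CURRENT_DB_VERSION` stands for the number of the latest applied migration, the `DB` dict plays the role of the model's table, and the metadata keys `reloaded` and `deleted` are invented names.

```python
import json
from typing import Any, Dict, Optional, Tuple

# Hypothetical: number of the latest applied migration.
CURRENT_DB_VERSION = 2
# Hypothetical: in-memory stand-in for the table of model A, keyed by id.
DB: Dict[int, Dict[str, Any]] = {1: {"id": 1, "name": "first", "f": "filled"}}


def make_notification(entity: Dict[str, Any], db_version: int) -> str:
    # Store the DB version next to the serialized entity.
    return json.dumps({"db_version": db_version, "entity": entity})


def load_entity(raw: str) -> Tuple[Optional[Dict[str, Any]], Dict[str, bool]]:
    note = json.loads(raw)
    if note["db_version"] == CURRENT_DB_VERSION:
        # Same schema: the stored JSON is safe to use as is.
        return note["entity"], {"reloaded": False, "deleted": False}
    # Schema changed: ignore the stored JSON and read the latest row by id.
    latest = DB.get(note["entity"]["id"])
    return latest, {"reloaded": True, "deleted": latest is None}


# Notification written under DB version 1, processed under version 2.
stale = make_notification({"id": 1, "name": "first"}, db_version=1)
entity, meta = load_entity(stale)
print(entity, meta)
```

The listener would receive `meta` alongside the entity, so it can tell a snapshot from a reloaded row and handle the deleted case explicitly.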

Store just entity id

This is a modification of the previous option: the notification stores only the entity id. In this case the listener will always get the current version of the entity.
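
A minimal sketch of the id-only variant, again in plain Python. The `DB` dict and both helper functions are hypothetical stand-ins, not part of django-pgpubsub.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical in-memory stand-in for the table of model A, keyed by id.
DB: Dict[int, Dict[str, Any]] = {1: {"id": 1, "name": "first", "f": "filled"}}


def make_notification(entity_id: int) -> str:
    # Only the id is stored, so there is no serialized state that can go stale.
    return json.dumps({"id": entity_id})


def load_entity(raw: str) -> Optional[Dict[str, Any]]:
    # Always read the current row; None means the entity has been deleted.
    return DB.get(json.loads(raw)["id"])


note = make_notification(1)
DB[1]["name"] = "renamed"  # the row changes after the notification was stored
print(load_entity(note))
```

The listener sees the row as it is when the notification is processed, not as it was when the notification was created.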

Store entity history and pass id and version in the Notification

In this solution, instead of storing JSON in the notifications, we store a reference to an entity snapshot in a history table that mirrors the structure of the main entity table, similar to how this is done in django-simple-history. Maybe https://github.com/Opus10/django-pghistory can be used, but I'm not familiar enough with it; we could maybe even integrate with one of these. The idea is that the historical model should be automatically tracked and migrated with the main model (django-simple-history does this), so migrations will never break stored notifications.
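
The idea can be sketched with two in-memory tables. This is an illustration under assumptions only: `MAIN`, `HISTORY`, `add_field_f` and `migrate` are invented names, and the sketch does not use the API of django-simple-history or django-pghistory.

```python
import json
from typing import Any, Dict, Optional, Tuple

Row = Dict[str, Any]

# Hypothetical stand-ins: the main table and a history table with the same
# structure, the latter keyed by (entity id, snapshot version).
MAIN: Dict[int, Row] = {1: {"id": 1, "name": "first"}}
HISTORY: Dict[Tuple[int, int], Row] = {(1, 1): {"id": 1, "name": "first"}}


def make_notification(entity_id: int, version: int) -> str:
    # The notification references a snapshot instead of embedding JSON.
    return json.dumps({"id": entity_id, "version": version})


def add_field_f(row: Row) -> Row:
    # Hypothetical data migration that adds and populates field F.
    return {**row, "f": "filled"}


def migrate() -> None:
    # The history table is migrated together with the main table.
    for key in list(MAIN):
        MAIN[key] = add_field_f(MAIN[key])
    for hkey in list(HISTORY):
        HISTORY[hkey] = add_field_f(HISTORY[hkey])


def load_snapshot(raw: str) -> Optional[Row]:
    ref = json.loads(raw)
    return HISTORY.get((ref["id"], ref["version"]))


note = make_notification(1, 1)  # stored before the migration
migrate()                       # schema changes while the listener is down
print(load_snapshot(note))      # the snapshot already has the new field
```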

Carefully plan migrations so that serialized form in JSON is backward compatible

This approach puts quite a burden on developers, but it is possible if there is a way to see that the older notifications have been processed completely. One way to achieve this is to store the DB version in the notification so that it can be queried when deciding whether the next step in the migration chain can be applied.
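
Such a check could look roughly like the sketch below. It assumes that each unprocessed notification carries a numeric `db_version`; the function name `can_apply_next_step` and the list of pending notifications are hypothetical.

```python
from typing import Dict, Iterable


def can_apply_next_step(pending: Iterable[Dict[str, int]], min_db_version: int) -> bool:
    # The next migration step is safe only once no unprocessed notification
    # was written under a DB version older than min_db_version.
    return all(n["db_version"] >= min_db_version for n in pending)


# Hypothetical unprocessed notifications and the DB version each was stored under.
pending = [
    {"id": 10, "db_version": 1},
    {"id": 11, "db_version": 2},
]

print(can_apply_next_step(pending, min_db_version=2))      # False
print(can_apply_next_step(pending[1:], min_db_version=2))  # True
```

In a real deployment the list would come from a query against the stored notifications, run before applying the next migration in the chain.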