While following the migration guide to update the django-auditlog package to version 3, I encountered an issue where the migration command did not fully migrate all LogEntry records.
Upon executing the migration command:
$ python manage.py auditlogmigratejson
Updated 101,000 records using Django operations.
There are 100,906 records that need migration.
I noticed that a significant number of LogEntry objects still had null values in the changes field, even though their changes_text fields were populated. This indicated that the migration did not affect these records.
Investigation:
While debugging, I examined the migrate_using_django method in auditlog.management.commands.auditlogmigratejson.Command:
def migrate_using_django(self, batch_size=None):
logs = self.get_logs()
if not batch_size:
return _apply_django_migration(logs)
total_updated = 0
for _ in range(ceil(logs.count() / batch_size)):
total_updated += _apply_django_migration(self.get_logs()[:batch_size])
return total_updated
When there are more than batch_sizeDELETE and CREATELogEntry objects that precede any UPDATE entries, self.get_logs() consistently returns the same initial set of records in each iteration. Since DELETE and CREATE entries do not require changes, the migration process becomes stuck on these records, leaving subsequent UPDATE entries unmigrated.
Possible Solution:
To address this issue, the batching logic should ensure that each batch processes a new subset of LogEntry objects. This can be achieved by implementing an offset mechanism or utilizing Django's Paginator for proper batch iteration. Here is a suggested fix using Paginator:
from django.core.paginator import Paginator
def migrate_using_django(self, batch_size=None):
logs = self.get_logs()
if not batch_size:
return _apply_django_migration(logs)
paginator = Paginator(logs, batch_size)
total_updated = 0
for page_number in paginator.page_range:
page = paginator.page(page_number)
total_updated += _apply_django_migration(page.object_list)
return total_updated
This modification ensures that each batch processes a unique set of LogEntry records, allowing the migration to progress through all entries without repetition.
Environment:
django-auditlog version: 2.x upgrading to 3.x
Django version: 4.2.15
Python version: 3.11
Database: PostgreSQL 15
Steps to Reproduce:
Have a database with a large number of LogEntry records, including many DELETE and CREATE entries preceding UPDATE entries.
Follow the migration guide to upgrade to django-auditlog version 3.
Run python manage.py auditlogmigratejson with batching enabled.
Observe that the migration does not update all LogEntry records.
Expected Behavior:
All LogEntry records should have their changes field populated after the migration, with no unmigrated entries remaining.
Actual Behavior:
A significant number of LogEntry records remain unmigrated, with null values in the changes field.
Workaround:
It is possible to avoid the issue by setting batch size to the amount of all existing LogEntry records.
I hope this report helps in identifying and resolving the issue. Please let me know if additional information is required.
Description:
While following the migration guide to update the
django-auditlog
package to version 3, I encountered an issue where the migration command did not fully migrate allLogEntry
records.Upon executing the migration command:
I noticed that a significant number of
LogEntry
objects still hadnull
values in thechanges
field, even though theirchanges_text
fields were populated. This indicated that the migration did not affect these records.Investigation:
While debugging, I examined the
migrate_using_django
method inauditlog.management.commands.auditlogmigratejson.Command
:The issue appears to stem from this line:
When there are more than
batch_size
DELETE
andCREATE
LogEntry
objects that precede anyUPDATE
entries,self.get_logs()
consistently returns the same initial set of records in each iteration. SinceDELETE
andCREATE
entries do not require changes, the migration process becomes stuck on these records, leaving subsequentUPDATE
entries unmigrated.Possible Solution:
To address this issue, the batching logic should ensure that each batch processes a new subset of
LogEntry
objects. This can be achieved by implementing an offset mechanism or utilizing Django'sPaginator
for proper batch iteration. Here is a suggested fix usingPaginator
:This modification ensures that each batch processes a unique set of
LogEntry
records, allowing the migration to progress through all entries without repetition.Environment:
django-auditlog
version: 2.x upgrading to 3.xSteps to Reproduce:
LogEntry
records, including manyDELETE
andCREATE
entries precedingUPDATE
entries.django-auditlog
version 3.python manage.py auditlogmigratejson
with batching enabled.LogEntry
records.Expected Behavior:
All
LogEntry
records should have theirchanges
field populated after the migration, with no unmigrated entries remaining.Actual Behavior:
A significant number of
LogEntry
records remain unmigrated, withnull
values in thechanges
field.Workaround:
It is possible to avoid the issue by setting batch size to the amount of all existing
LogEntry
records.I hope this report helps in identifying and resolving the issue. Please let me know if additional information is required.