freelawproject / courtlistener

A fully-searchable and accessible archive of court data including growing repositories of opinions, oral arguments, judges, judicial financial records, and federal filings.
https://www.courtlistener.com
Other
544 stars 150 forks source link

Django not setting value of auto_now field — How? #3359

Open sentry-io[bot] opened 11 months ago

sentry-io[bot] commented 11 months ago

This issue is coming up a bit lately and I'm baffled how it's possible. Django is supposed to set the values of auto_now fields before they hit the DB, but these failures indicate that that isn't happening sometimes. @flooie and @quevon have failed to reproduce it, and there is nobody else reporting it.

My guess is that it has to do with async, transactions or their intersection. Perhaps @ttys0dev has thoughts. It's rare, but it happens and it could reflect a bug in Django or another library.

Sentry Issue: COURTLISTENER-5D5

IntegrityError: null value in column "date_modified" of relation "search_docket" violates not-null constraint

NotNullViolation: null value in column "date_modified" of relation "search_docket" violates not-null constraint
DETAIL:  Failing row contains (67983939, 2023-11-07 02:31:50.290272+00, null, null, null, null, , WeWork Canada GP ULC, , wework-canada-gp-ulc, 23-19866, 2023-11-06, t, njb, null, null, 8885, , 2023-11-06, null, null, , , , , , 1105231, null, 1, John K. Sherwood, , 0, null, null, , , , , null, , , 2023-11-07 02:31:50.279164+00, t, null, 23019866, null).
  File "django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "psycopg/cursor.py", line 737, in execute
    raise ex.with_traceback(None)

IntegrityError: null value in column "date_modified" of relation "search_docket" violates not-null constraint
DETAIL:  Failing row contains (67983939, 2023-11-07 02:31:50.290272+00, null, null, null, null, , WeWork Canada GP ULC, , wework-canada-gp-ulc, 23-19866, 2023-11-06, t, njb, null, null, 8885, , 2023-11-06, null, null, , , , , , 1105231, null, 1, John K. Sherwood, , 0, null, null, , , , , null, , , 2023-11-07 02:31:50.279164+00, t, null, 23019866, null).
(28 additional frame(s) were not displayed)
...
  File "cl/recap/views.py", line 59, in perform_create
    await asyncio.shield(recap_upload_task)
  File "cl/recap/tasks.py", line 114, in process_recap_upload
    docket = await process_recap_docket(pq.pk)
  File "cl/recap/tasks.py", line 555, in process_recap_docket
    await d.asave()
  File "cl/search/models.py", line 778, in save
    super(Docket, self).save(update_fields=update_fields, *args, **kwargs)

I'm not necessarily filing this so somebody can fix it, but I think it's a time-suck and it happens rarely, but perhaps somebody will know offhand or perhaps someday somebody will find solace in not being the only person to have this baffling bug.

ttys0dev commented 11 months ago

IntegrityError: null value in column "date_modified" of relation "search_docket" violates not-null constraint

Not sure where that not-null constraint is coming from, do the params look ok here? Maybe schema not matching what django expects?

mlissner commented 11 months ago

Yeah, that's the field. auto_now means that the Django code should be setting the value whenever the row is modified. I've never heard of it being null by the time it gets to the DB layer and I can't imagine how a query could get past Django without Django grabbing it and adding a date to that column. It's been non-nullable forever.

flooie commented 11 months ago

Just food for thought - it affects auto_now but not auto_now_add and really does seem like a bug in django.

sentry-io[bot] commented 9 months ago

This also seems related to this one:

Sentry issue: COURTLISTENER-5S2

Linked by: @albertisfu

sentry-io[bot] commented 9 months ago

Sentry issue: COURTLISTENER-5RY

Linked by: @albertisfu

grossir commented 7 months ago

@flooie asked to look into this, and there is a situation when auto_now and auto_now_add can fail, even when using Docket(...).save()

From the Django docs on .save()

Field.pre_save() and update_fields If update_fields is passed in, only the pre_save() methods of the update_fields are called. For example, this means that date/time fields with auto_now=True will not be updated unless they are included in the update_fields.

However, this is not our case. From the Sentry events' data, on the save frame cl/search/models.py in save at line 867, we can see that update_fields=None

As Bill also mentions, from the Sentry data we can see that date_created has been auto populated, but date_modified hasn't. Which is unexpected

About this being caused by async functions, it also happens on a court upload form that has some async things going on

https://github.com/freelawproject/courtlistener/blob/594aa7d8c4481e76e40f4b7ca341648f398ebfcb/cl/opinion_page/views.py#L153C1-L158C55

mlissner commented 7 months ago

Summarizing, then: You are also baffled what's going on here, right?

grossir commented 7 months ago

Since this is causing trouble to the tenn court users we could patch it while we request for help in the Django groups. I think this PR should be enough https://github.com/freelawproject/courtlistener/pull/3885

mlissner commented 7 months ago

Closing the loop, Bill posted about this on the Django dev's group: https://groups.google.com/g/django-developers/c/3gcW230jqGY