Closed lizadaly closed 1 year ago
Will want to test a bit on staging before deploying.
I haven't looked closely at any of this, but weighing in anyway. :) Confirming that decision_date
in the CAP API is the date a case was decided, not an update date, and wouldn't make sense as a date check.
The way I'd recommend a downstream database pull updates from CAP is to use last_updated
similar to how browsers use a cache request header: when you run the update script, you can query for something like cases?id__in=1,2,3&last_updated__gte=2023-04-01
to get cases on your list that have changed at all in the last month. (Have not checked API syntax, but something like that.) Then you take that list of potentially changed cases, and check each of them for whether the fields you care about have actually changed from your locally stored version, and if so do whatever it is you do with the updated data.
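To make that flow concrete, here is a rough sketch of the two steps. As noted, the API syntax is unchecked: the `id__in` and `last_updated__gte` parameter names are taken from the example above, and `build_update_query`, `changed_fields`, and the field names in the usage note are hypothetical, not existing H2O or CAP code.

```python
from datetime import date
from urllib.parse import urlencode


def build_update_query(case_ids, since):
    """Build the query string for cases on our list updated on or after `since`.

    Assumption: the parameter names mirror the unverified example above.
    """
    params = {
        "id__in": ",".join(str(i) for i in case_ids),
        "last_updated__gte": since.isoformat(),
    }
    # Keep commas unescaped so the id list stays readable.
    return "cases?" + urlencode(params, safe=",")


def changed_fields(local, remote, fields):
    """Return the fields we care about whose values differ between the copies."""
    return [f for f in fields if local.get(f) != remote.get(f)]
```

For example, `build_update_query([1, 2, 3], date(2023, 4, 1))` produces the query string from the comment above, and `changed_fields(local, remote, ["name", "decision_date"])` would then tell you whether a potentially changed case needs any local work at all.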
Thanks, that's helpful. The previously used code had checks like:
most_recent_doc.effective_date <= most_recent_saved_doc.effective_date
https://github.com/harvard-lil/h2o/blob/develop/web/main/views.py#L2680-L2683
where effective_date
for CAP content is definitely decision_date
, so it does seem there was a time in which effective_date
was a comparable timestamp, but it hasn't been for a while.
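A small illustration of why that comparison stops being useful once effective_date is a decision date. This is a sketch, not the code in views.py: the `Doc` class and `looks_up_to_date` are hypothetical, and I'm assuming a true result meant the saved copy was treated as current.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    # Hypothetical stand-in for a document; for CAP content this is decision_date.
    effective_date: date


def looks_up_to_date(most_recent_doc, most_recent_saved_doc):
    """Same comparison as the quoted check."""
    return most_recent_doc.effective_date <= most_recent_saved_doc.effective_date


# The decision date does not move when the case text is edited upstream,
# so the upstream and saved copies carry the same value and the check passes.
saved = Doc(effective_date=date(1954, 5, 17))
upstream_after_edit = Doc(effective_date=date(1954, 5, 17))
```

With equal dates, `looks_up_to_date(upstream_after_edit, saved)` is true no matter what else changed upstream, so the check could only ever fire if the decision date itself moved later.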
I made some erroneous assumptions about how the case ingestion code had been working prior to some recent updates and introduced a problem where H2O is now pulling new copies of cases it already has as existing
LegalDocuments
. When I went to patch the logic I realized that the status quo had been that cases were always being checked in CAP, but then rarely pulled, because the date field being checked was effective_date
, which maps to decision_date
, a value I suspect is meant to be immutable? (It's possible this was added to catch cases where the decision date was corrected, but the code wasn't commented either way and there weren't unit tests that touched that logic.)
Our conclusion is:
So this PR removes any check against the upstream provider when the legal document already exists; in that case we use our most recent saved version.
I'm open to restoring the check on
effective_date
, but I'd feel better about doing that if we could document why it might exist, and if we expect those dates to actually be changed upstream.
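For reviewers, the behavior this PR settles on can be sketched roughly as follows. This is an illustration only, not the actual change: `get_legal_document`, the dict-based documents, and the `source_ref` and `saved_at` keys are hypothetical stand-ins for the real models.

```python
from datetime import datetime


def get_legal_document(source_ref, saved_docs, fetch_upstream):
    """Return our most recent saved copy if one exists; otherwise pull from upstream.

    `saved_docs` is a list of dicts standing in for saved LegalDocuments, and
    `fetch_upstream` is a callable standing in for the upstream provider.
    """
    existing = [d for d in saved_docs if d["source_ref"] == source_ref]
    if existing:
        # Already ingested: no upstream check, just use the newest saved version.
        return max(existing, key=lambda d: d["saved_at"])
    # Not seen before: pull once from upstream and save it.
    doc = fetch_upstream(source_ref)
    saved_docs.append(doc)
    return doc
```

Restoring a date check would mean adding a comparison in the `existing` branch, which is the part I'd want documented before bringing it back.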