I have confirmed that the PDF does not exist in S3.
Therefore, there are two possibilities:
1. The file was somehow never stored in S3.
2. The recap.email task failed after the cleanup of the PQ in mark_pq_successful, possibly during the merging of attachments. In that case the main PDF was removed from S3, but the PQ was not updated because that update happens within a transaction.
I believe the second option is more likely. The fix seems simple: move mark_pq_successful to after the attachments are merged. That way, if the merge fails, the main PDF is retained and available for a retry.
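The ordering problem described above can be illustrated with a small, self-contained sketch. It is not the CourtListener code: the dict standing in for S3, the `process`, `failing_once`, and `run_with_retry` helpers, and the `MergeError` exception are all hypothetical names invented for this illustration. It only assumes, as described above, that cleanup deletes the stored PDF and that merging attachments is a step that can fail.

```python
class MergeError(Exception):
    """Hypothetical stand-in for any failure while merging attachments."""


def process(storage, key, merge_attachments, cleanup_first):
    """Process one queue item whose PDF is stored under `key`.

    `storage` is a plain dict standing in for S3. Deleting the key stands
    in for the file cleanup attributed to mark_pq_successful above.
    """
    if key not in storage:
        raise FileNotFoundError(f"File does not exist: {key}")
    pdf = storage[key]

    if cleanup_first:
        del storage[key]        # cleanup happens before the fallible step
        merge_attachments(pdf)  # a failure here leaves nothing to retry with
    else:
        merge_attachments(pdf)  # fallible step runs first
        del storage[key]        # cleanup only after everything succeeded
    return "ok"


def failing_once():
    """Return a merge function that fails on its first call only."""
    calls = {"n": 0}

    def merge(pdf):
        calls["n"] += 1
        if calls["n"] == 1:
            raise MergeError("simulated attachment failure")

    return merge


def run_with_retry(cleanup_first):
    """Run one attempt plus one retry and record the outcome of each."""
    storage = {"pq/doc.pdf": b"%PDF-"}
    merge = failing_once()
    outcomes = []
    for _ in range(2):
        try:
            outcomes.append(process(storage, "pq/doc.pdf", merge, cleanup_first))
        except Exception as exc:
            outcomes.append(type(exc).__name__)
    return outcomes


print(run_with_retry(cleanup_first=True))   # ['MergeError', 'FileNotFoundError']
print(run_with_retry(cleanup_first=False))  # ['MergeError', 'ok']
```

With cleanup first, the retry hits the same kind of FileNotFoundError seen in the trace below. With cleanup last, the retry still finds the file and can complete.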
ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
(1 additional frame(s) were not displayed)
...
File "storages/backends/s3.py", line 132, in __init__
self.obj.load(**params)
File "boto3/resources/factory.py", line 565, in do_action
response = action(self, *args, **kwargs)
File "boto3/resources/action.py", line 88, in __call__
response = getattr(parent.meta.client, operation_name)(*args, **params)
File "botocore/client.py", line 553, in _api_call
return self._make_api_call(operation_name, kwargs)
File "botocore/client.py", line 1009, in _make_api_call
raise error_class(parsed_response, operation_name)
FileNotFoundError: File does not exist: recap_processing_queue/2024/03/04/4a1150901df84f11b1823ed6639e01ee.pdf
(2 additional frame(s) were not displayed)
...
File "cl/recap/tasks.py", line 2347, in process_recap_email
save_pacer_doc_from_pq(self, rd, fq, pq, magic_number)
File "cl/recap/tasks.py", line 1923, in save_pacer_doc_from_pq
with pq.filepath_local.open(mode="rb") as local_path:
This has occurred once so far. The stack trace suggests that retrieving the temporarily stored PDF from the PQ failed due to a FileNotFoundError.
The related PQ is: https://www.courtlistener.com/admin/recap/processingqueue/12743192/change/
Sentry Issue: COURTLISTENER-6VD
Filed by: @albertisfu