freelawproject / recap

This repository is for filing issues on any RECAP-related effort.
https://free.law/recap/

PACER Appendixes getting uploaded as zips (but they're PDFs) #290

Closed sentry-io[bot] closed 4 years ago

sentry-io[bot] commented 4 years ago

So there's a feature in PACER that I've never seen before. When you're loading a docket report, you can ask for it to be an "Appendix" instead of the usual thing. That gives you an interface where you can select docket entries, and then download them all as part of a single PDF. It's kind of neat. The resulting PDF has the full docket report followed by the documents selected (including their attachments). Unfortunately, when you use this feature, it generates a big PDF, but our zip file code runs and gives it a .zip filename. That's confusing to various programs (the name says .zip, but the contents are a PDF), and beyond that, we also upload it as a zip.

The simple thing here is: Don't upload appendix files when they're generated.

You can see the Sentry issue that caught this below. I'm not sure which jurisdictions have this feature, but you can definitely see it for 1:18-cv-00427-JJM-LDA Citibank, N.A. as Trustee v. Caito in RI.

Be careful, these can be extremely expensive.

Sentry Issue: COURTLISTENER-8P

BadZipfile: File is not a zip file
  File "cl/recap/tasks.py", line 356, in process_recap_zip
    with ZipFile(pq.filepath_local.path, "r") as archive:
  File "zipfile.py", line 793, in __init__
    self._RealGetContents()
  File "zipfile.py", line 834, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
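
For illustration only, here is a minimal sketch of a server-side guard that checks what a file actually contains before handing it to ZipFile. The function name sniff_upload and its return labels are hypothetical and are not part of the CourtListener code; the only library call relied on is the standard library's zipfile.is_zipfile.

```python
import zipfile

# Every PDF starts with this marker, regardless of what the file is named.
PDF_MAGIC = b"%PDF-"


def sniff_upload(path):
    """Classify a file by its contents rather than by its extension.

    Hypothetical helper: returns "pdf", "zip", or "unknown".
    """
    with open(path, "rb") as f:
        head = f.read(len(PDF_MAGIC))
    # Check for a PDF first, so an appendix saved as "something.zip"
    # is recognized for what it really is.
    if head.startswith(PDF_MAGIC):
        return "pdf"
    # is_zipfile looks for the zip end-of-central-directory record.
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```

A task could call this before opening the archive and skip (or store unprocessed) anything that does not come back as "zip", rather than letting BadZipfile propagate.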

mlissner commented 4 years ago

Alternately, I could add another ~object type~ upload_type for these and we can upload them anyway. We don't have to process them and can just store them until we have enough that it matters. They've got a lot of information.

litewarp commented 4 years ago

Up to you. I'll write a check to make sure the zip functionality doesn't get loaded on those pages. We can always revert if you decide to store them.

mlissner commented 4 years ago

Yeah, I think that's best. Keep it simple for now. We've only got 1 of these appendixes since we turned on zip file uploads. (Sentry tells me how many times something has happened.)

mlissner commented 4 years ago

This was fixed in freelawproject/recap-chrome release 1.2.26. Thank you Nick!