freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

Include de_seq_num when available for downloading PACER documents #1088

Closed: albertisfu closed this issue 1 month ago

albertisfu commented 1 month ago

A recap.email user reported that a document added by recap.email was marked as sealed, although it is not sealed in PACER.

One condition for marking a document as sealed when adding it from recap.email is that the free-look document download fails even though the pacer_magic_num is available.

In this case, the download failed because the URL https://ecf.mdb.uscourts.gov/doc1/0920?caseid=783601 was missing the de_seq_num parameter. The request needs both de_seq_num and caseid to work correctly, so the required URL is: https://ecf.mdb.uscourts.gov/doc1/0920?de_seq_num=57&caseid=783601.

So the solution is to pass the de_seq_num to the FreeOpinionReport download_pdf method when downloading with the pacer_magic_num, and possibly for regular downloads (without a magic number) as well.
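
To illustrate the URL difference described above, here is a minimal sketch of how the query string could be assembled so that de_seq_num is included only when it is known. The helper name `build_doc1_url` and its parameters are hypothetical and are not part of juriscraper; the actual change would live wherever download_pdf builds its request.

```python
from typing import Optional
from urllib.parse import urlencode


def build_doc1_url(
    base_url: str,
    caseid: str,
    de_seq_num: Optional[str] = None,
) -> str:
    """Hypothetical helper: build a doc1 URL, adding de_seq_num when available."""
    params = {}
    # Only send de_seq_num when we actually have it, so callers that
    # lack the value keep producing the same URL as before.
    if de_seq_num:
        params["de_seq_num"] = de_seq_num
    params["caseid"] = caseid
    return f"{base_url}?{urlencode(params)}"


# The two URLs from the report above.
failing = build_doc1_url("https://ecf.mdb.uscourts.gov/doc1/0920", "783601")
working = build_doc1_url(
    "https://ecf.mdb.uscourts.gov/doc1/0920", "783601", de_seq_num="57"
)
```

Keeping de_seq_num optional means existing callers that never had the value are unaffected, while recap.email can pass it through whenever it is available.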

mlissner commented 1 month ago

Nice. Should be simple enough. Thanks for the analysis.