freelawproject / recap

This repository is for filing issues on any RECAP-related effort.
https://free.law/recap/

Trouble getting appellate documents into RECAP #340

Closed mlissner closed 1 year ago

mlissner commented 1 year ago

I talked to a user today who reported that they couldn't get documents from this case to go into RECAP:

https://www.courtlistener.com/docket/28735/daniel-mitchell-v-chuck-atkins/

He's using Safari, but I gave it a try in Firefox and failed too. My STR:

  1. Open the link above.
  2. Click the "Buy on PACER" button for doc 5.
  3. Buy it.
  4. Refresh the link above.

I think the reason is that we both had to log in before buying the document, which must make it lose some piece of metadata?

After logging in, I tried steps 2-4 above, and it worked.

Can we see if this is an issue or if there's a fix?

ERosendo commented 1 year ago

I followed the STR and realized that the URL of the document page changes when the user has to log in before buying the PDF, and the extension fails to get the document_id from this new URL.

The current implementation of the extension tries to extract the document_id from the servlet parameter in the URL, because the document links look like the following when the user is already logged in:

https://ecf.ca9.uscourts.gov/n/beam/servlet/TransportRoom?servlet=ShowDoc/009032127512&caseId=325867

but when the user has to log in the URL looks like this:

https://ecf.ca9.uscourts.gov/n/beam/servlet/TransportRoom?servlet=ShowDoc&pacer=i&caseId=325867&dls_id=009032127512

In this second URL, the doc_id is not part of the servlet parameter, and that's what causes the bug, but we can still extract it from the dls_id parameter.

We can fix this issue by checking whether the dls_id parameter is available in the URL or not before trying to extract the doc_id.
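A minimal sketch of that check, written in Python purely for illustration (it is not the extension's actual code). The function name `extract_doc_id` is hypothetical, and the sketch assumes document IDs are all digits, as in the two URLs above. It looks at dls_id first and only falls back to the ID appended to the servlet value when dls_id is absent:

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_doc_id(url: str) -> Optional[str]:
    """Return the document ID from an appellate ShowDoc URL, or None.

    Hypothetical helper: assumes IDs consist only of digits.
    """
    params = parse_qs(urlparse(url).query)

    # Check dls_id first: it is present when the user had to log in.
    dls_id = params.get("dls_id", [""])[0]
    if dls_id.isdigit():
        return dls_id

    # Otherwise fall back to the ID appended to the servlet value,
    # e.g. servlet=ShowDoc/009032127512
    servlet = params.get("servlet", [""])[0]
    name, _, doc_id = servlet.partition("/")
    if name == "ShowDoc" and doc_id.isdigit():
        return doc_id

    return None
```

With the two URLs above, both forms should yield 009032127512; a URL with neither form yields None.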

johnhawkinson commented 1 year ago

Just for the record, it's not clear you checked, but in the case where both dls_id is specified AND there is a pathname appended to the ShowDoc servlet, the dls_id takes precedence. E.g.

009032127512 is doc 2
009032292595 is doc 5

If you construct the URLs with both, e.g.:

https://ecf.ca9.uscourts.gov/n/beam/servlet/TransportRoom?servlet=ShowDoc/009032127512&dls_id=009032292595
https://ecf.ca9.uscourts.gov/n/beam/servlet/TransportRoom?servlet=ShowDoc/009032292595&dls_id=009032127512

it returns the doc specified by dls_id.

p.s.: why is one of these DLS ids one digit more than the other? huh.

ERosendo commented 1 year ago

@johnhawkinson Thanks for pointing that out.

mlissner commented 1 year ago

I really appreciate the conversation here, thank you guys. I don't have time to dig into it, but I trust that it's sussing out RECAP/PACER weirdness!