freelawproject / recap

This repository is for filing issues on any RECAP-related effort.
https://free.law/recap/

Attachment pages with lots of big files are not identified as attachment pages #238

Closed mlissner closed 1 year ago

mlissner commented 6 years ago

Here's an example:

https://ecf.nysd.uscourts.gov/doc1/127020983456

At the bottom of this page, it has the following instead of the usual buttons:

[Image: screenshot from 2018-02-14 10-11-30]

But we use the buttons to detect if it's an attachment page, and thus we don't identify it as an attachment page. Tweaks needed.

Pascal666 commented 6 years ago

Is there a reason to key off the buttons instead of the "Document Selection Menu" text at the top of the page?

johnhawkinson commented 6 years ago

I think it's clear that the current method is wrong, yes.

DC has

            <p>&nbsp;&nbsp;<B>Document Selection Menu</B></p>

and AC has

<tr><th align=center><b>3 Documents are attached to this filing</b><br><br></th></tr>

This doesn't suggest anything in particular that would be common to both, but that's ok.

I guess, in retrospect, I can see how the feature-test design pattern would argue for looking for the buttons, but I think the page heading is the much better choice.
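To make the heading-based check concrete, here is a minimal sketch in Python. It is not the extension's code (the extension is JavaScript); the function name and the regular expressions are assumptions for illustration, and they cover only the two heading strings quoted above.

```python
import re

# District-court (DC) heading text, as quoted in this thread.
DISTRICT_HEADING = re.compile(r"Document\s+Selection\s+Menu", re.IGNORECASE)

# Appellate-court (AC) heading text, as quoted in this thread. The leading
# count varies per filing, so it is matched as a run of digits.
APPELLATE_HEADING = re.compile(
    r"\d+\s+Documents\s+are\s+attached\s+to\s+this\s+filing", re.IGNORECASE
)


def has_attachment_heading(html: str) -> bool:
    """Return True if either quoted heading appears in the page HTML.

    Hypothetical helper: other wordings that courts may use are not covered.
    """
    return bool(DISTRICT_HEADING.search(html) or APPELLATE_HEADING.search(html))


if __name__ == "__main__":
    dc = "<p>&nbsp;&nbsp;<B>Document Selection Menu</B></p>"
    ac = "<tr><th align=center><b>3 Documents are attached to this filing</b><br><br></th></tr>"
    print(has_attachment_heading(dc), has_attachment_heading(ac))
```

Because the two court types share no common markup, the sketch simply treats a match on either heading as sufficient.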

mlissner commented 6 years ago

+1 from me. I think the buttons solved the problem at the time, but their time has run out.

johnhawkinson commented 6 years ago

Actually, why did I say that? Why should we look for headings at all? We should just look for table rows of doc1 links, and if they're there, then they're close enough to attachment pages to be worth shipping to the server for it to parse. (Because we don't have a unified parser in the client and server, and don't want to overparse in the client.)
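A rough sketch of that idea in Python, using only the standard library. This is an illustration, not the extension's implementation: the class and function names, the link pattern, and the `min_rows` threshold are all assumptions. The pattern accepts both `/doc1/` and `/docs1/` paths because both URL shapes appear in this thread.

```python
import re
from html.parser import HTMLParser

# Assumed link shape: /doc1/<digits> (district) or /docs1/<digits> (appellate).
DOC_LINK = re.compile(r"/docs?1/\d+")


class Doc1RowCounter(HTMLParser):
    """Count table rows that contain at least one doc1-style link."""

    def __init__(self):
        super().__init__()
        self.in_row = False
        self.row_has_link = False
        self.rows_with_links = 0

    def _close_row(self):
        # Tally the row that is ending, then reset the per-row state.
        if self.in_row and self.row_has_link:
            self.rows_with_links += 1
        self.in_row = False
        self.row_has_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # tolerate a missing </tr> before the next row
            self.in_row = True
        elif tag == "a" and self.in_row:
            href = dict(attrs).get("href") or ""
            if DOC_LINK.search(href):
                self.row_has_link = True

    def handle_endtag(self, tag):
        if tag in ("tr", "table"):
            self._close_row()

    def close(self):
        super().close()
        self._close_row()  # count a final row left open at end of input


def count_doc1_rows(html: str) -> int:
    parser = Doc1RowCounter()
    parser.feed(html)
    parser.close()
    return parser.rows_with_links


def looks_like_attachment_page(html: str, min_rows: int = 1) -> bool:
    # min_rows is a hypothetical knob; the thread does not specify a threshold.
    return count_doc1_rows(html) >= min_rows
```

The check deliberately stops at counting rows and extracts nothing else, which matches the goal of leaving the real parsing to the server.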

Pascal666 commented 6 years ago

I found where this is and can fix it easily. Can you provide a link to PACER showing the "3 Documents are attached to this filing" style?

johnhawkinson commented 6 years ago

That particular appellate court (AC) example was https://ecf.ca1.uscourts.gov/docs1/00107312789.

mlissner commented 4 years ago

Note that somebody recently identified another case where the buttons weren't there. I think a whole jurisdiction just has them turned off for some reason. That's another argument for not relying (solely) on the buttons to trigger an upload.

mlissner commented 1 year ago

I believe this is fixed via https://github.com/freelawproject/recap-chrome/pull/269