freelawproject / recap

This repository is for filing issues on any RECAP-related effort.
https://free.law/recap/

Possible UI changes in light of RSS scraper #250

Open johnhawkinson opened 6 years ago

johnhawkinson commented 6 years ago

The recently deployed RSS scraper has changed some things, and it's worth revisiting some of the UI choices because the assumptions behind them no longer hold.

For instance, previously, it tended to be the case that a docket was mostly up-to-date as of its last update date. That is, if RECAP users regularly ran docket reports, even limited ones, they would tend to run them where they last left off, and holes were not common. Nothing guaranteed this, of course, but it was generally the case.

Now holes are common, especially in districts where RSS feeds skip many kinds of entries. For example, in Ragbir v. United States (ecf.njd), the docket entries run 35, 36, 40, 47, and 52, skipping 37–39, 41–46, and 48–51:

[Screenshot: docket entries 35, 36, 40, 47, and 52 in Ragbir v. United States]

This suggests that CL should present holes in docket numbering more clearly, because they are much more common now than they used to be (cf. also the general design improvements issue, https://github.com/freelawproject/recap/issues/195).

That is, some kind of highlighting of discontinuities between number sequences.
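
As a rough illustration of what "discontinuities" means here, the holes can be computed from the known entry numbers alone. This is a minimal sketch, not CL code: `find_gaps` is a hypothetical helper, and it assumes every skipped integer is a real entry that is simply absent from RECAP, which may not hold where a court's own numbering skips.

```python
def find_gaps(entry_numbers):
    """Return (first_missing, last_missing) pairs between known entry numbers."""
    gaps = []
    numbers = sorted(set(entry_numbers))
    # Compare each known number with the next one; a difference
    # greater than 1 means at least one number is missing in between.
    for prev, curr in zip(numbers, numbers[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


# Entry numbers from the Ragbir example above.
print(find_gaps([35, 36, 40, 47, 52]))
# [(37, 39), (41, 46), (48, 51)]
```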


Another consequence of RSS scraping is that a certain category of RECAP users will be less likely to run docket reports than before. If you are privileged enough to receive NEFs in cases you care about (but don't get free looks) or you found out about case activity in another way (RSS monitoring; checking the PACER last update date, etc.), in the past, it was necessary to run a docket report to get such items visible to RECAP (because of #61). Now, at least where the RSS feed includes the document type in question and the user waits an hour (or whatever) to download the document, running the docket report is no longer necessary to get the item in RECAP.

And in the NEF case, the user already has the docket text from the NEF, so there's little incentive to run the docket report.

This is not actionable, but it's worth noting an expected user behavior change.


A related aspect is that, for those users who indeed want to keep the RECAP docket up to date, a reasonable technique was to go to the Docket Report page and check the last update:

[Screenshot: last update date shown on the Docket Report page]

…and then punch that date into the first Filed Date field and update the docket from that point forward. If every user followed the same (or a compatible) algorithm, the docket would stay up to date with no holes.

But now the last-update shown on the Docket Report query page is no longer the last time someone ran a docket report and uploaded it to RECAP. Instead it's the last time the RSS scraper found something.

This is … annoying.

One solution would be for CL to track the RSS scraper update date and the most-recent effective docket upload dates separately. Then legacy RECAP clients could only display the latter, and newer clients could display both.
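
A minimal sketch of that idea, assuming two separate timestamps per docket. The class, field names, and source labels (`"rss"`, `"docket_report"`) are all hypothetical and not taken from the CL schema or the RECAP API; the point is only that each update source advances its own date, and the response differs by client generation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def _latest(current: Optional[datetime], new: datetime) -> datetime:
    """Return the later of two timestamps, treating None as 'never'."""
    return new if current is None or new > current else current


@dataclass
class DocketUpdateTimes:
    # Hypothetical field names; not taken from the CL schema.
    last_rss_update: Optional[datetime] = None
    last_docket_upload: Optional[datetime] = None

    def record(self, when: datetime, source: str) -> None:
        """Advance only the timestamp that matches the update source."""
        if source == "rss":
            self.last_rss_update = _latest(self.last_rss_update, when)
        elif source == "docket_report":
            self.last_docket_upload = _latest(self.last_docket_upload, when)
        else:
            raise ValueError(f"unknown update source: {source!r}")

    def for_client(self, legacy: bool) -> dict:
        """Legacy clients see only the docket upload date; newer ones see both."""
        if legacy:
            return {"last_update": self.last_docket_upload}
        return {
            "last_docket_upload": self.last_docket_upload,
            "last_rss_update": self.last_rss_update,
        }
```

With this shape, an RSS hit never moves the date that a legacy client would punch into the Filed Date field.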


Are there other UI changes to the FLP/CL/RECAP ecosystem that the RSS scraper suggests or implies?

johnhawkinson commented 6 years ago

One solution would be for CL to track the RSS scraper update date and the most-recent effective docket upload dates separately. Then legacy RECAP clients could only display the latter, and newer clients could display both.

The Written Opinion Report scraper also implicates this.

mlissner commented 6 years ago

That is, some kind of highlighting of discontinuities between number sequences.

This is actually a feature the old, original recap archive had. I think it did it as some kind of accordion thing between the docket entry rows, and in the middle of the accordion, it showed some text that said something like, "Missing xx Docket Entries."

I've wanted to recreate this for some time, but I think it makes life a lot harder. Instead of just passing a list of docket entries to the HTML template from the Python code, you have to pass...what? A list of missing stuff also? A modified list where missing entries have a special flag? I'm not sure what the data objects would look like for this, but I suspect the answer lies therein.
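
One possible shape, sketched under assumptions: keep the list of real entries as it is and interleave a single marker row per hole, so the template only has to branch on one flag. The function name and the keys `is_gap`, `first_missing`, `last_missing`, and `count` are hypothetical, and the sketch assumes the entries arrive sorted by number.

```python
def rows_with_gaps(entries):
    """Interleave known entries with one gap marker per run of missing numbers.

    `entries` is a list of dicts sorted by "docket_entry_number".
    """
    rows = []
    prev_number = None
    for entry in entries:
        number = entry["docket_entry_number"]
        # A jump of more than 1 since the previous entry means a hole.
        if prev_number is not None and number - prev_number > 1:
            rows.append({
                "is_gap": True,
                "first_missing": prev_number + 1,
                "last_missing": number - 1,
                "count": number - prev_number - 1,
            })
        rows.append({"is_gap": False, "entry": entry})
        prev_number = number
    return rows
```

The template would then loop over `rows` and print either an accordion (when `row.is_gap` is true) or an ordinary docket entry row.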


One solution would be for CL to track the RSS scraper update date and the most-recent effective docket upload dates separately.

I guess that's doable. Adds more complication, but it's not impossible. Seems like the obvious solution, though I don't love that you might see different last update values depending on where you're looking (the docket sheet vs PACER). Hm.

johnhawkinson commented 6 years ago

This is actually a feature the old, original recap archive had. I think it did it as some kind of accordion thing between the docket entry rows, and in the middle of the accordion, it showed some text that said something like, "Missing xx Docket Entries."

Weird. I used it a lot and have no memory of this!

Instead of just passing a list of docket entries to the HTML template from the Python code

Huh. The number of things around here that are constrained by stupid Django limitations seems unreasonably high. I'll take a look at some point…

I don't love that you might see different last update values depending on where you're looking (the docket sheet vs PACER). Hm.

I don't see why you would. Both can show both times if you want, or consistent times. The only people who lose are the ones with old extension versions, and basically those people are written off anyhow…

mlissner commented 6 years ago

The number of things around here that are constrained by stupid Django limitations seems unreasonably high.

I'm not sure it's a Django thing, necessarily. It's more like, "What does this object look like?"

Is it a list like:

[
  {"docket_entry_number": 1, "date": x, "description": y},
  {"docket_entry_number": 2, "date": x, "description": y},
  {"docket_entry_number": 3, "missing": True},
  {"docket_entry_number": 4, "missing": True},
  {"docket_entry_number": 5, "date": x, "description": y},
]

Or what? At some point, we'll iterate over that list and need to decide whether to print an accordion or a docket entry row. Figuring out how to do that from the above seems...annoying, but doable. Must be better ways to do it.
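
For what it's worth, iterating over a list like that may be less painful than it sounds if consecutive missing entries are collapsed first. A minimal sketch using `itertools.groupby`, assuming the per-entry `missing` flag from the list above; `render_rows` and its plain-text output are hypothetical stand-ins for whatever the template would actually emit.

```python
from itertools import groupby


def render_rows(flagged_entries):
    """Collapse each run of missing entries into one accordion-style line."""
    lines = []
    # groupby yields consecutive runs that share the same "missing" value.
    runs = groupby(flagged_entries, key=lambda e: e.get("missing", False))
    for is_missing, run in runs:
        run = list(run)
        if is_missing:
            lines.append(f"[Missing {len(run)} Docket Entries]")
        else:
            for entry in run:
                lines.append(
                    f'{entry["docket_entry_number"]}: {entry["description"]}'
                )
    return lines
```

The trade-off is that the list carries one placeholder per missing number, which could get long on dockets with large holes.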