freelawproject / foresight

Where we discuss and prioritize new features

Docket entry classifier #75

Open anseljh opened 4 days ago

Headline

FLP's RECAP Archive Categorizes Docket Entries to Help You Find the Needle in the Haystack

What is the Feature?

Build classifiers for docket entry text to label them as, e.g., Complaint, Answer, Pleading (a superset encompassing both Complaint and Answer), Motion, Memorandum, Order, Judgment, Motion for Summary Judgment (a subset of Motion), Claim Construction Order, etc.

These can then become:

What Problem Might it Solve?

A lot of dockets are hella long, with hundreds or thousands of entries across many pages. This stinks if you're only interested in a subset of the documents. Today, you have to either read a zillion docket entries or ctrl-F your way through each page, and that may not work because courts word the same kind of entry differently. If we classified docket entries, users could filter on our labels instead of relying on inaccurate text searches.

Describe a Scenario in Which the Feature Might be Used

As a lawyer, I'm working on a summary judgment motion in a patent case. I know my opposing party has been in some other cases, and I want to see what arguments they raised in cases that reached the summary judgment phase. I can find the cases, but those dockets are crazy long.

Enter docket entry classification! Now, I can filter by label and easily get to all the summary judgment motions and orders.
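
To make the filtering idea concrete, here is a minimal sketch of how label-based filtering could work if labels are nested (e.g., Motion for Summary Judgment under Motion, Complaint and Answer under Pleading). Everything in it is an assumption for illustration: the `PARENT` mapping, the `DocketEntry` class, `filter_entries`, and the sample entries are made up, and none of it is an existing RECAP or CourtListener API.

```python
from dataclasses import dataclass

# Hypothetical label hierarchy: narrower label -> broader label.
# Only the relationships mentioned in this issue are included.
PARENT = {
    "Complaint": "Pleading",
    "Answer": "Pleading",
    "Motion for Summary Judgment": "Motion",
}


@dataclass
class DocketEntry:
    number: int
    description: str
    label: str  # most specific label a classifier assigned


def ancestors(label):
    """Yield the label itself, then each broader label above it."""
    while label is not None:
        yield label
        label = PARENT.get(label)


def filter_entries(entries, wanted):
    """Return entries whose label is `wanted` or a narrower label under it."""
    return [e for e in entries if wanted in ancestors(e.label)]


# Made-up sample entries, not real docket data.
entries = [
    DocketEntry(1, "COMPLAINT for patent infringement", "Complaint"),
    DocketEntry(12, "ANSWER to Complaint", "Answer"),
    DocketEntry(87, "MOTION for Summary Judgment", "Motion for Summary Judgment"),
    DocketEntry(90, "MOTION to Seal", "Motion"),
    DocketEntry(103, "ORDER granting Motion for Summary Judgment", "Order"),
]

motions = filter_entries(entries, "Motion")
pleadings = filter_entries(entries, "Pleading")
summary_judgment = filter_entries(entries, "Motion for Summary Judgment")
```

Storing only the most specific label and walking up the hierarchy at filter time means a search for a broad label like Motion picks up the narrower ones too, without tagging each entry multiple times.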

Technical Requirements

Existing Systems or Alternatives?

Back in the day, I did this with a rules engine that evaluated zillions of regexes for proto-Lex Machina. It was hard because there was so much variation in wording across courts, which made it complex and brittle. However, at least while I was there, it outperformed what PhD students were able to do with ML. But that was a long time ago, and a lot has changed!
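
For anyone who hasn't seen the regex approach, here is a toy sketch of what a rules-based classifier can look like. It is not the engine described above, just an illustration under assumptions: the `RULES` list, the `PREFIX` pattern, and `classify` are hypothetical names, and a real rule set would be vastly larger. Rules are tried in order, most specific first, and each one only matches at the start of the entry text.

```python
import re

# Hypothetical modifiers that may precede the document type.
PREFIX = r"(?:(?:first|second|amended|joint|unopposed|renewed|sealed)\s+)*"

# Hypothetical rules as (label, pattern); order matters, most specific first.
RULES = [
    (label, re.compile(PREFIX + body, re.IGNORECASE))
    for label, body in [
        ("Motion for Summary Judgment",
         r"motion\s+for\s+(?:partial\s+)?summary\s+judgment\b"),
        ("Motion", r"motion\b"),
        ("Order", r"order\b"),
        ("Complaint", r"complaint\b"),
        ("Answer", r"answer\b"),
    ]
]


def classify(text):
    """Return the first matching label, or None if no rule matches."""
    text = text.strip()
    for label, pattern in RULES:
        # match() anchors at the start, so "ORDER granting ... Motion"
        # is labeled by its leading word rather than by a later mention.
        if pattern.match(text):
            return label
    return None


msj = classify("MOTION for Summary Judgment of Non-Infringement")
order = classify("ORDER granting 87 Motion for Summary Judgment")
amended = classify("First Amended COMPLAINT against Acme Corp.")
abbreviated = classify("Mot. for Summ. J.")  # no rule covers this wording
```

The last example falls through every rule and comes back as `None`, which is the brittleness problem in miniature: each new wording a court uses needs another rule.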

Any Additional Information?

This is also important because it's an enabler for other things: