freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

fix(uscfc): implement new site #1224

Closed. grossir closed this 2 weeks ago

grossir commented 1 month ago

Solves #1221

flooie commented 1 month ago
Adding new item:
    case_dates: "2023-12-06"
    case_names: "Harlow v. Secretary of Health and Human Services"
    download_urls: "https://ecf.cofc.uscourts.gov/cgi-bin/show_public_doc?2020vv0550-76-0"
    precedential_statuses: "Unpublished"
    blocked_statuses: "False"
    date_filed_is_approximate: "False"
    docket_numbers: "20-550V"
    judges: "C. Moran"
    summaries: "PUBLIC DECISION (Originally filed: 11/13/2023) regarding [75] DECISION of Special Master Signed by Special Master Christian J. Moran. (dksc) Service on parties made."
    case_name_shorts: "Harlow"

I think you need to parse out the vaccine stuff separately from the rest of the federal claims opinions.

This case, for example, is a published decision, but it is scraped above as "Unpublished".

grossir commented 1 month ago

@flooie I addressed your comments, please check again


To get the correct status for vaccine opinions I had to implement extract_from_text, since the site marks them all as "unreported". I also found some older ones marked as Unpublished that should be Published, so even when the site did mark the status it was sometimes wrong:

https://www.courtlistener.com/opinion/4896639/trigueros-v-secretary-of-health-and-human-services/?q=court_id%3Auscfc+%22published+decision%22&type=o&order_by=score+desc&stat_Unpublished=on

https://www.courtlistener.com/opinion/4753749/eamick-v-secretary-of-health-and-human-services/?q=court_id%3Auscfc+%22published+decision%22&type=o&order_by=score+desc&stat_Unpublished=on&page=6
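For readers following along, here is a minimal sketch of the idea behind the text-based status detection. It is not the code in this PR: the regex, the `head_chars` cutoff, and the function name `extract_status` are assumptions for illustration, and the returned dict shape (`{"OpinionCluster": {"precedential_status": ...}}`) is assumed to match what `extract_from_text` is expected to return.

```python
import re

# Hypothetical pattern: assumes the opinion text labels itself near the top,
# e.g. "UNPUBLISHED DECISION", "PUBLISHED DECISION" or "NOT TO BE PUBLISHED".
STATUS_RE = re.compile(r"\b(un|not\s+(?:to\s+be\s+)?)?published\b", re.IGNORECASE)


def extract_status(scraped_text: str, head_chars: int = 1500) -> dict:
    """Guess the precedential status from the first part of the opinion text.

    Returns an empty dict when no marker is found, so the status scraped
    from the site is left untouched.
    """
    match = STATUS_RE.search(scraped_text[:head_chars])
    if not match:
        return {}
    # Group 1 is set only when a negating prefix ("un", "not", "not to be") matched.
    status = "Unpublished" if match.group(1) else "Published"
    return {"OpinionCluster": {"precedential_status": status}}
```

Only the head of the document is searched, so a later citation to some other "unpublished decision" does not flip the result. The cutoff value is a guess and would need tuning against real opinions.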


About the summary: it carries a varying amount of information. Once the judges are removed, it reads better.

I also pulled out the "(Originally filed: ...)" part and put it into case["other_dates"], which improves the readability of the summary.
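A rough sketch of that split, using the summary from the Harlow example above. The helper name `split_originally_filed`, the regex, and the exact string stored in `other_dates` are assumptions for illustration, not necessarily what the scraper does.

```python
import re

# Hypothetical pattern for the "(Originally filed: MM/DD/YYYY)" note in the summary.
ORIGINALLY_FILED_RE = re.compile(r"\s*\(Originally filed:\s*([\d/]+)\)", re.IGNORECASE)


def split_originally_filed(summary: str) -> tuple[str, str]:
    """Return (summary without the note, other_dates string).

    other_dates is an empty string when the note is absent.
    """
    match = ORIGINALLY_FILED_RE.search(summary)
    if not match:
        return summary, ""
    other_dates = f"Originally filed: {match.group(1)}"
    # Cut the note out and keep the text on both sides of it.
    cleaned = summary[: match.start()] + summary[match.end():]
    return cleaned.strip(), other_dates


summary = (
    "PUBLIC DECISION (Originally filed: 11/13/2023) regarding [75] DECISION "
    "of Special Master Signed by Special Master Christian J. Moran. (dksc) "
    "Service on parties made."
)
cleaned_summary, other_dates = split_originally_filed(summary)
```

The date is kept as the raw string from the site rather than parsed, since `other_dates` is treated here as free text.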

About the summary holding info for uscfc_vaccine: it says in very few words what the document is about, so I think it is worth keeping. For example: