freelawproject/juriscraper
An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License
341 stars, 98 forks
[WIP] #197 SCOTUS scraper (#975, Open)
ralexx opened 3 months ago
ralexx commented 3 months ago
From #197.
Notable
Adds pymupdf as a dependency, as suggested.
Includes log-then-raise handling for two characteristic host responses ('Access Denied' page and server name resolution error) that appear to be anti-abuse measures.
Because of the above, only single-threaded downloading was implemented here.
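For reviewers, a minimal sketch of what log-then-raise handling for those two responses could look like. This is not the code in this PR: the exception classes, `fetch_page`, and the injected `get_text` callable are hypothetical names chosen for illustration, and matching on the literal string 'Access Denied' is an assumption about how the page would be detected.

```python
import logging
import socket
from typing import Callable

logger = logging.getLogger(__name__)


class AccessDeniedError(Exception):
    """Hypothetical: the host returned its 'Access Denied' page."""


class NameResolutionError(Exception):
    """Hypothetical: the host name could not be resolved."""


def fetch_page(url: str, get_text: Callable[[str], str]) -> str:
    """Fetch one page (single-threaded), logging and then raising on either response.

    `get_text` is an injected downloader (assumption) so the handling can be
    exercised without touching the network.
    """
    try:
        body = get_text(url)
    except socket.gaierror as exc:
        # Name resolution failure: log first, then raise a specific error.
        logger.error("Name resolution failed for %s: %s", url, exc)
        raise NameResolutionError(url) from exc
    if "Access Denied" in body:
        # The block page can look like a normal response, so check the body text.
        logger.error("Access Denied page returned for %s", url)
        raise AccessDeniedError(url)
    return body
```

Injecting the downloader keeps the two failure paths testable with plain functions, which may also help with the mocking concern in the TODO list.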
TODO
[ ] Docket parser mapping to database structure: maintainer input needed
[ ] Python 3.8 support if necessary (I used some 3.9+ syntax)
[ ] Tests for docket parser
[ ] Tests currently omitted because they would need additional mocking
[ ] Better handling of the two host responses mentioned above
[ ] Multithreaded downloading, if desired (implemented locally but not included in this draft)
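On the Python 3.8 item: the list above does not say which 3.9+ syntax is involved, so the following is only a sketch of 3.8-compatible equivalents for three common 3.9 additions (built-in generic annotations, the dict `|` operator, and `str.removeprefix`). The function names are hypothetical and not taken from this PR.

```python
from typing import Dict


def merge_meta(base: Dict[str, str], extra: Dict[str, str]) -> Dict[str, str]:
    """Merge two metadata dicts; keys in `extra` win.

    3.9+ form: `base | extra`, annotated with `dict[str, str]`.
    """
    return {**base, **extra}


def strip_prefix(text: str, prefix: str) -> str:
    """Drop `prefix` from `text` if present.

    3.9+ form: `text.removeprefix(prefix)`.
    """
    return text[len(prefix):] if text.startswith(prefix) else text
```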
CLAassistant commented 3 months ago
All committers have signed the CLA.