freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

fix(coloctapp): update cleanup_content #1216

Closed grossir closed 1 month ago

grossir commented 1 month ago

Solves: https://github.com/freelawproject/juriscraper/issues/1215

We are getting duplicate hashes again because some documents contain multiple hash-altering elements. This PR generalizes cleanup_content to handle documents with more than one such element.
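
As a rough illustration of the generalization described above, here is a minimal sketch, not the code from this PR. It assumes the hash-altering elements can be matched by a list of patterns, and that the hash is taken over the cleaned bytes. The pattern list, the element names (`csrf_token`, `timestamp`), the use of regexes instead of an HTML parser, and the `content_hash` helper are all hypothetical.

```python
import hashlib
import re

# Hypothetical patterns for elements whose content changes on every
# download. The real scraper's selectors are not shown in this PR text.
VOLATILE_PATTERNS = [
    re.compile(rb'<input[^>]*name="csrf_token"[^>]*>', re.IGNORECASE),
    re.compile(
        rb'<span[^>]*class="timestamp"[^>]*>.*?</span>',
        re.IGNORECASE | re.DOTALL,
    ),
]


def cleanup_content(content: bytes) -> bytes:
    """Remove every occurrence of every volatile element.

    Looping over all patterns and substituting all matches (rather than
    dropping a single element) is what makes this work when a document
    has more than one hash-altering element.
    """
    for pattern in VOLATILE_PATTERNS:
        content = pattern.sub(b"", content)
    return content


def content_hash(content: bytes) -> str:
    """Hypothetical helper: hash the cleaned content (SHA-1 assumed)."""
    return hashlib.sha1(cleanup_content(content)).hexdigest()
```

With this shape, two downloads of the same opinion that differ only in those volatile elements should hash identically, while a change to the opinion text still changes the hash.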