Open lmullen opened 3 years ago
A backburner wish, to be sure, but I'd like to see how many cites off-the-rack eyecite finds on the uncorrected MOML corpus. That would give us a precise baseline to compare against down the line, so we can say "running our common OCR corrections cleanup generated X thousand more citations" and "modifying eyecite to account for antique reporters generated X thousand more." So,
Tell me if there's a better way to gather my disparate requests into one place, but the next thing I need in order to track English citations is general regex output that I can scan for OCR errors: https://github.com/lmullen/legal-modernism/issues/42#issuecomment-967751321
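To make the request concrete, here is a rough sketch of what "general regex output" could look like. It is not the pattern the project uses: `CITE_RE`, `reporter_counts`, and the sample sentence are all made up for illustration, and the pattern only assumes a loose "volume, abbreviated reporter, page" shape. The idea is to tally every reporter string that turns up, so that rare spellings (likely OCR variants of a common reporter) stand out in the list.

```python
import re
from collections import Counter

# Hypothetical, deliberately loose pattern: a 1-3 digit volume, one to four
# capitalized tokens (letters, periods, ampersands, apostrophes), then a
# 1-4 digit page. This is an assumption for illustration, not eyecite's regex.
CITE_RE = re.compile(
    r"\b(\d{1,3})\s+"
    r"([A-Z][A-Za-z.&']*(?:\s+[A-Z][A-Za-z.&']*){0,3})"
    r"\s+(\d{1,4})\b"
)


def reporter_counts(text: str) -> Counter:
    """Count each reporter string that appears in a citation-shaped match."""
    return Counter(m.group(2) for m in CITE_RE.finditer(text))


if __name__ == "__main__":
    # Invented sample text containing one OCR-style misspelling ("Bnrr.").
    sample = "See 3 Burr. 1663 and 3 Bnrr. 1663; also 2 Wils. 275 and 1 T. R. 120."
    for reporter, n in reporter_counts(sample).most_common():
        print(f"{n}\t{reporter}")
```

Run over the whole corpus, the low-count end of that list is where I'd expect the OCR errors to collect, though a pattern this loose will also pick up things that are not citations.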
eyecite test.txt
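For the baseline count itself, something along these lines might do. It is only a sketch: `count_citations` is a hypothetical helper, the `moml/` directory of plain-text files is an assumed layout, and the only eyecite call it relies on is `get_citations`. The finder can be swapped out, so the same tally can be rerun later against corrected text or a modified eyecite and the numbers compared.

```python
from collections import Counter
from pathlib import Path
from typing import Callable, Iterable, Optional


def count_citations(
    texts: Iterable[str],
    find: Optional[Callable[[str], list]] = None,
) -> Counter:
    """Tally citations by class name across an iterable of document texts."""
    if find is None:
        # Imported lazily so the tallying logic can run without eyecite installed.
        from eyecite import get_citations
        find = get_citations
    counts: Counter = Counter()
    for text in texts:
        for cite in find(text):
            counts[type(cite).__name__] += 1
    return counts


if __name__ == "__main__":
    # Assumed layout: one plain-text file per document under ./moml/
    files = sorted(Path("moml").glob("*.txt"))
    totals = count_citations(p.read_text(errors="replace") for p in files)
    for kind, n in totals.most_common():
        print(f"{n}\t{kind}")
    print(f"{sum(totals.values())}\ttotal")
```

Breaking the count down by citation class is a choice on my part; a single total would serve equally well as the comparison number.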