Princeton-CDH / ppa-nlp

Discovering patterns in poetry’s data with machine learning; software for use with Princeton Prosody Archive (PPA) full-text corpus

As an NLP expert, I want to assess the OCR quality of the pages in the test set so that I can offer a data-based recommendation on whether to re-OCR certain volumes. #21

Open mnaydan opened 5 months ago

mnaydan commented 5 months ago

Expected outcome: a rough estimate of OCR quality, expressed as a percentage, for the Gale texts and for the HathiTrust texts.
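One possible way to get a rough per-source percentage is sketched below. This is only an illustration, not the approach chosen for this issue: it assumes a dictionary-lookup heuristic (the share of alphabetic tokens on a page that appear in a known word list), and the names `page_quality`, `quality_by_source`, and the `(source, text)` input shape are hypothetical. A real assessment would need a period-appropriate vocabulary, since historical spellings and proper nouns would otherwise count as OCR errors.

```python
import re
from collections import defaultdict

# Alphabetic runs only; digits and punctuation are ignored (an assumption).
TOKEN_RE = re.compile(r"[A-Za-z]+")


def page_quality(text, vocabulary):
    """Return the fraction of alphabetic tokens found in `vocabulary`.

    Returns None for pages with no alphabetic tokens (e.g. blank pages),
    so they can be skipped rather than counted as zero quality.
    """
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    if not tokens:
        return None
    known = sum(1 for t in tokens if t in vocabulary)
    return known / len(tokens)


def quality_by_source(pages, vocabulary):
    """Average page quality per source, as a percentage.

    `pages` is an iterable of (source, text) pairs, where source is a
    label such as "Gale" or "HathiTrust" (hypothetical input shape).
    """
    scores = defaultdict(list)
    for source, text in pages:
        score = page_quality(text, vocabulary)
        if score is not None:
            scores[source].append(score)
    return {src: 100 * sum(vals) / len(vals) for src, vals in scores.items()}
```

Averaging page-level scores weights every page equally regardless of length; weighting by token count instead would be a reasonable alternative if short pages turn out to skew the estimate.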