kitab-project-org / explore

KITAB app to explore the OpenITI corpus and text reuse data

Add filter for uncorrected OCR #9

Open pverkind opened 5 months ago

pverkind commented 5 months ago

Add a filter for uncorrected OCR. By default, uncorrected OCR should be included in the results ("include OCR").

Each OCR'ed text should get a warning sign in the table and perhaps a different background colour, as the secondary texts do.

Mouseover text: "This text was produced using OCR and was not postcorrected manually. The text will include transcription errors (mistranscribed characters, occasional missing lines, lingering footnote text). This may make the text unsuitable for close reading and some computational methods (like search and token-based computational methods)."

NB: Should we add the character error rate to the mouseover text if it is found in the YML file?
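A rough sketch of how the mouseover text could be assembled, with the character error rate appended only when the metadata provides one. The function name `ocr_tooltip`, the metadata key `"cer"`, and the assumption that the rate is stored as a fraction between 0 and 1 are all hypothetical; the actual field name and format in the YML files would need to be checked.

```python
from typing import Any, Mapping

# Warning text as proposed in this issue.
OCR_WARNING = (
    "This text was produced using OCR and was not postcorrected manually. "
    "The text will include transcription errors (mistranscribed characters, "
    "occasional missing lines, lingering footnote text). This may make the "
    "text unsuitable for close reading and some computational methods "
    "(like search and token-based computational methods)."
)


def ocr_tooltip(meta: Mapping[str, Any], cer_key: str = "cer") -> str:
    """Return the mouseover text, adding the CER if the metadata has one.

    `meta` stands in for the already-parsed YML metadata of one text;
    `cer_key` is a hypothetical key name, not a confirmed YML field.
    """
    cer = meta.get(cer_key)
    if cer is None:
        return OCR_WARNING
    try:
        rate = float(cer)  # assumed to be a fraction, e.g. 0.034
    except (TypeError, ValueError):
        # Unparseable value: fall back to the plain warning.
        return OCR_WARNING
    return f"{OCR_WARNING} Character error rate: {rate:.1%}."
```

If the YML files turn out to store the rate as a percentage rather than a fraction, the format string would need to change accordingly.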

To speed up filtering, we might have to create a Boolean field for uncorrected OCR in the database.
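As an illustration of the Boolean-field idea, here is a minimal sketch using an in-memory SQLite database. The table name `text_version`, the column `uncorrected_ocr`, and the helper names are made up for this example and do not reflect the app's actual schema or database backend. The filter defaults to "include OCR", as proposed above.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway database with a Boolean-like OCR flag (0/1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE text_version ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " uncorrected_ocr INTEGER NOT NULL DEFAULT 0)"
    )
    # An index on the flag lets the filter avoid scanning every row.
    conn.execute(
        "CREATE INDEX idx_uncorrected_ocr ON text_version (uncorrected_ocr)"
    )
    conn.executemany(
        "INSERT INTO text_version (title, uncorrected_ocr) VALUES (?, ?)",
        [("Text A", 0), ("Text B", 1), ("Text C", 0)],
    )
    return conn


def list_titles(conn: sqlite3.Connection, include_ocr: bool = True) -> list:
    """Return titles; uncorrected OCR texts are included by default."""
    sql = "SELECT title FROM text_version"
    if not include_ocr:
        sql += " WHERE uncorrected_ocr = 0"
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]
```

With the demo data, `list_titles(conn)` returns all three titles, while `list_titles(conn, include_ocr=False)` leaves out "Text B". The flag would be set once at import time, so the filter never has to inspect the metadata files at query time.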