qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0
🧹 Don't produce spurious TextEquiv elements. #68

Closed mikegerber closed 2 years ago

mikegerber commented 2 years ago

eynollah produces spurious (and empty) pcGts TextEquiv elements. This is (a) unnecessary, (b) wrong, and (c) the cause of many warning messages in subsequent OCR processing steps, because the OCR processor warns about already existing text.

Fix this by not generating any TextEquiv elements.
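
For PAGE-XML files that were already written with such elements, a cleanup pass could look roughly like the sketch below. This is not the change made in this PR (which stops generating the elements in the first place) but a separate, illustrative post-processing step. The function name `strip_empty_text_equivs`, the sample document, and the choice of the 2019-07-15 PAGE namespace are assumptions made for the example; it also only inspects the `Unicode` child and ignores `PlainText`.

```python
import xml.etree.ElementTree as ET

# Assumed PAGE namespace version; adjust to whatever the files actually declare.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"


def strip_empty_text_equivs(root):
    """Remove TextEquiv elements whose Unicode child is missing or blank.

    Hypothetical helper for illustration; returns the number of removed elements.
    """
    text_equiv_tag = f"{{{PAGE_NS}}}TextEquiv"
    unicode_tag = f"{{{PAGE_NS}}}Unicode"
    removed = 0
    # Snapshot the tree first so removal does not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != text_equiv_tag:
                continue
            uni = child.find(unicode_tag)
            if uni is None or not (uni.text or "").strip():
                parent.remove(child)
                removed += 1
    return removed


# Made-up sample: one empty TextEquiv (spurious) and one with real text.
SAMPLE = f"""<PcGts xmlns="{PAGE_NS}">
  <Page>
    <TextRegion id="r1">
      <TextEquiv><Unicode></Unicode></TextEquiv>
    </TextRegion>
    <TextRegion id="r2">
      <TextEquiv><Unicode>kept</Unicode></TextEquiv>
    </TextRegion>
  </Page>
</PcGts>"""

root = ET.fromstring(SAMPLE)
removed = strip_empty_text_equivs(root)
print(f"removed {removed} empty TextEquiv element(s)")
```

Only empty elements are dropped, so text written by an earlier OCR run would survive the pass.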

Fixes gh-37.