OCR-D / zenhub

Repo for developing zenhub integration
Apache License 2.0

Fix failing transformations in ocrd_fileformat #38

Open kba opened 2 years ago

kba commented 2 years ago

Current situation

ocrd_fileformat is a wrapper for ocr-fileformat which bundles schemas and transformation scripts for various OCR file formats.

Some transformations currently fail outright (ALTO->PAGE), produce duplicates (PAGE->hOCR), or produce output in the wrong order (PAGE->ALTO).

How it should be

We need to investigate these issues and fix them so that the transformations work as expected. Since ocr-fileformat is just a meta-project, the issues need to be either fixed at the source, in the projects that provide the transformations (ALTO->PAGE, PAGE->hOCR), or resolved by configuring the transformation properly (PAGE->ALTO).
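
As a starting point for investigating the PAGE->hOCR duplicates, a small check along these lines might help. It is only a sketch: it assumes the duplicates show up as repeated `id` attributes in the output, which has not been confirmed here, and `find_duplicate_ids` is a hypothetical helper, not part of ocrd_fileformat or ocr-fileformat.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_ids(xml_text: str) -> list[str]:
    """Return the sorted `id` values that occur more than once in an XML document.

    Assumption: duplicated elements keep their original `id` attribute,
    so a repeated id signals a duplicated element.
    """
    root = ET.fromstring(xml_text)
    # Count every id attribute found anywhere in the tree.
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(i for i, n in counts.items() if n > 1)


# Hand-written sample, not real transformation output.
sample = (
    "<html><body>"
    '<span class="ocr_line" id="line_1">a</span>'
    '<span class="ocr_line" id="line_1">a</span>'
    '<span class="ocr_line" id="line_2">b</span>'
    "</body></html>"
)
print(find_duplicate_ids(sample))
```

If the duplicates turn out to be repeated text content with fresh ids, the same counting approach could be applied to element text instead.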

Testing