Current situation
ocrd_fileformat is a wrapper for ocr-fileformat, which bundles schemas and transformation scripts for various OCR file formats.
Some transformations currently fail (ALTO->PAGE), produce duplicates (PAGE->hOCR) or produce incorrectly ordered output (PAGE->ALTO).
How it should be
We need to investigate these issues and fix them so that the transformations work as expected. Since ocr-fileformat is just a meta-project, the issues need to be either fixed at the source (ALTO->PAGE, PAGE->hOCR) or resolved by proper configuration (PAGE->ALTO).
Testing
ocr-transform alto page
ocr-transform page hocr
ocr-transform page alto
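
To make the three checks above repeatable, they could be wrapped in a small script along the following lines. This is only a sketch under assumptions: the sample file names are hypothetical placeholders, the helper names `check_transform` and `run_all` are made up for illustration, and it assumes that `ocr-transform` accepts an input and an output file after the two format names and signals failure through its exit status. It only detects outright failures (ALTO->PAGE); duplicates and ordering problems would still need to be checked in the output files.

```shell
#!/bin/sh
# Sketch: run each transformation from the issue and report whether it succeeded.
# Assumption: arguments are `ocr-transform <from> <to> <input> <output>`.
# Assumption: sample.alto.xml and sample.page.xml are hypothetical test inputs.

# Allow the command to be overridden, e.g. to point at a local checkout.
OCR_TRANSFORM="${OCR_TRANSFORM:-ocr-transform}"

# Run one transformation and print OK/FAIL based on the exit status.
check_transform() {
    from="$1"; to="$2"; input="$3"; output="$4"
    if "$OCR_TRANSFORM" "$from" "$to" "$input" "$output"; then
        echo "OK: $from -> $to"
    else
        echo "FAIL: $from -> $to"
        return 1
    fi
}

# Run all three transformations; return non-zero if any of them failed.
run_all() {
    status=0
    check_transform alto page sample.alto.xml out.page.xml || status=1
    check_transform page hocr sample.page.xml out.hocr     || status=1
    check_transform page alto sample.page.xml out.alto.xml || status=1
    return "$status"
}
```

Calling `run_all` after these definitions runs the three checks in sequence and continues past a failing transformation, so one broken direction does not hide the state of the others.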