UB-Mannheim / ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
https://digi.bib.uni-mannheim.de/ocr-fileformat/
MIT License

gcv__page: use -source-json instead of -source-xml #156

Closed (bertsky, closed 1 year ago)

stweil commented 1 year ago

I'd like to run a test before merging, and ideally there should also be a unit test. Can we get a (small) GCV file for such tests?

stweil commented 1 year ago

A test is still missing. I decided to pull the commit nevertheless because I expect that it improves the situation.