Open. splet opened this issue 7 years ago.
XML files should be accompanied by their corresponding page images, so that tests involving image coordinates and similar checks are possible.
I will check @StaatsbibliothekBerlin, @impactcentre and @europeananewspapers for reference examples that can be shared.
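As a rough illustration of the kind of coordinate test that needs the image alongside the XML, here is a minimal sketch that flags ALTO elements whose bounding box extends past the image. It assumes coordinates are in pixels (ALTO also allows other measurement units), and the image size is passed in by hand; in a real test it would be read from the image file. The function name `boxes_outside_image` and the sample document are hypothetical, not part of any existing tool.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix so the check works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def boxes_outside_image(alto_xml, image_width, image_height):
    """Return (element name, ID) for every box that leaves the image area.

    Assumption: HPOS/VPOS/WIDTH/HEIGHT are given in pixels.
    """
    root = ET.fromstring(alto_xml)
    problems = []
    for el in root.iter():
        attrs = el.attrib
        # Only elements carrying a full bounding box are checked.
        if not all(k in attrs for k in ("HPOS", "VPOS", "WIDTH", "HEIGHT")):
            continue
        x, y = float(attrs["HPOS"]), float(attrs["VPOS"])
        w, h = float(attrs["WIDTH"]), float(attrs["HEIGHT"])
        if x < 0 or y < 0 or x + w > image_width or y + h > image_height:
            problems.append((local_name(el.tag), attrs.get("ID")))
    return problems


# Hypothetical sample: the TextLine is deliberately too wide for the page.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="p1" WIDTH="1000" HEIGHT="1500">
      <PrintSpace HPOS="0" VPOS="0" WIDTH="1000" HEIGHT="1500">
        <TextBlock ID="b1" HPOS="100" VPOS="100" WIDTH="800" HEIGHT="200">
          <TextLine ID="l1" HPOS="100" VPOS="100" WIDTH="950" HEIGHT="50">
            <String ID="s1" CONTENT="Beispiel" HPOS="100" VPOS="100" WIDTH="200" HEIGHT="50"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

if __name__ == "__main__":
    # Image size would normally come from the accompanying image file.
    print(boxes_outside_image(SAMPLE, 1000, 1500))
```

Without the image (or at least its dimensions), a check like this can only compare boxes against the `Page` attributes, which may themselves be wrong.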
We have started collecting ALTO samples in different versions for testing ALTO-hOCR conversions at https://github.com/kba/ocr-fileformat-samples/tree/master/samples/alto
Organise the creation of reference examples to help system developers with testing. Include artificial examples (such as one that uses all possible elements) as well as larger volumes of real-world examples (e.g. for performance tests).
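One way to judge whether a set of reference examples gets close to "all possible elements" is to count which element names actually occur in it. The sketch below does that; the helper names `element_coverage` and `missing_elements` are hypothetical, and the list of expected elements is an illustrative subset supplied by the caller, not the full ALTO schema.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def element_coverage(documents):
    """Count how often each element name occurs across the given XML strings."""
    counts = Counter()
    for xml_text in documents:
        for el in ET.fromstring(xml_text).iter():
            # Drop the '{namespace}' prefix so different ALTO versions are pooled.
            counts[el.tag.rsplit("}", 1)[-1]] += 1
    return counts


def missing_elements(documents, expected):
    """Return the expected element names that never occur in the samples."""
    seen = element_coverage(documents)
    return sorted(set(expected) - set(seen))


if __name__ == "__main__":
    # Tiny hypothetical sample; real input would be the collected reference files.
    docs = ["<alto><Layout><Page><TextBlock/></Page></Layout></alto>"]
    expected = ["Page", "TextBlock", "TextLine", "String"]  # illustrative subset
    print(missing_elements(docs, expected))
```

Run over a whole directory of samples, the same counts also show which elements are only covered by the artificial examples and never by real-world data.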