kba closed this 4 years ago
Can we use https://github.com/PRImA-Research-Lab/PAGE-XML/blob/master/documentation/example/SimplePage.xml as the example we mention on the website? Does that one work fine, or would it be better to take another one?
Travis is still broken, but the setup is right. Can somebody debug this?
Processing triggers for libc-bin (2.19-0ubuntu6.13) ...
tesseract -l Fraktur wetzel_reisebegleiter_1901_0021_800px.jpg stdout hocr | xmllint --format - > wetzel_reisebegleiter_1901_0021.hocr
../bin/ocr-transform.sh hocr alto2.0 wetzel_reisebegleiter_1901_0021.hocr | xmllint --format - > wetzel_reisebegleiter_1901_0021.alto
Stylesheet file /home/travis/build/UB-Mannheim/ocr-fileformat/xslt/hocr__alto2.0.xsl does not exist
-:1: parser error : Document is empty
make: *** [wetzel_reisebegleiter_1901_0021.alto] Error 1
The command "cd example && make deps roundtrip diff" exited with 2.
https://travis-ci.org/UB-Mannheim/ocr-fileformat/builds/626339369
The error is completely unrelated to your changes here. But there were some changes in hOCR-to-ALTO today which renamed some of the scripts: https://github.com/filak/hOCR-to-ALTO/commit/5122b72ed1c6c9a6a5582a0554e45ddc658b68df. We use the most current version (master branch) of that repo, so the stylesheet path we expect no longer exists and Travis is complaining.
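The failure in the log surfaces only indirectly, as an empty document passed to xmllint. A minimal sketch of an earlier check is below, assuming stylesheets follow the `<from>__<to>.xsl` naming seen in the log path; the function name `resolve_stylesheet` and the `XSLT_DIR` variable are hypothetical and not taken from the repository.

```shell
# Sketch only: resolve the stylesheet for a from/to pair and fail early
# with a clear message if it is missing.
# The "<from>__<to>.xsl" pattern is inferred from the Travis log above;
# resolve_stylesheet and XSLT_DIR are hypothetical names.
resolve_stylesheet() {
    from=$1
    to=$2
    dir=${XSLT_DIR:-xslt}
    sheet="$dir/${from}__${to}.xsl"
    if [ ! -e "$sheet" ]; then
        # Report the missing file on stderr and signal failure to the caller.
        echo "No stylesheet for $from -> $to (expected $sheet)" >&2
        return 1
    fi
    # Print the resolved path so the caller can capture it.
    printf '%s\n' "$sheet"
}
```

A caller could capture the path with `sheet=$(resolve_stylesheet hocr alto2.0)` and stop before invoking the XSLT processor when the call fails, which would make an upstream rename visible at the first step of the pipeline.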
I am not sure whether this is the most elegant fix, but Travis is now happy again.
Looks good, thanks for fixing @zuphilip
We could add another symlink page__page2019 to upgrade PAGE files, otherwise I think this is ready to merge.
LGTM. @kba Let me know when this is ready from your side.
> We could add another symlink page__page2019
Done. I think this can be merged. :shipit:
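For readers unfamiliar with the symlink approach, here is a rough sketch of how an existing transformation could be exposed under a second `<from>__<to>` name. Only the alias name `page__page2019` comes from this thread; the helper `add_alias`, the directory, and the target name are hypothetical and may not match what was actually committed.

```shell
# Sketch only: expose an existing transformation under a second name
# by creating a relative symlink next to it.
# add_alias and all arguments other than page__page2019 are hypothetical.
add_alias() {
    dir=$1
    target=$2
    alias_name=$3
    if [ ! -e "$dir/$target" ]; then
        echo "Cannot alias: $dir/$target does not exist" >&2
        return 1
    fi
    # The link target is relative, so the link stays valid if the
    # whole directory is moved.
    ln -s "$target" "$dir/$alias_name"
}
```

Something like `add_alias some/dir existing_transform page__page2019` would then make the new name resolve to the existing file; `ln -s` fails if the alias already exists, which avoids silently overwriting another transformation.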
> LGTM. @kba Let me know when this is ready from your side.
@kba, is it ready now?
Thank you, @kba.
Thank you very much @kba !
Integrates https://github.com/PRImA-Research-Lab/prima-page-converter. Currently only ALTO -> PAGE conversion is supported, but this could be extended: the converter also accepts Google Cloud Vision, hOCR, older PAGE versions and FRXML as input.
@wrznr @maxnth @chreul