UB-Mannheim / ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
https://digi.bib.uni-mannheim.de/ocr-fileformat/
MIT License
176 stars, 23 forks

Transformation for ImageWare MyBib #139

Closed: karkraeg closed this issue 2 years ago

karkraeg commented 2 years ago

I wrote a conversion script to transform ImageWare MyBib eL OCR to ALTO: https://github.com/karkraeg/im2alto

Example for ImageWare OCR: https://library.fes.de/ddb/vs07100/VS07100_1.xml

Please feel free to use the XSL in your tool!

stweil commented 2 years ago

Thank you! I'm afraid we are currently busy with a lot of other things, so if someone sends a pull request, that would help us.

karkraeg commented 2 years ago

I'd be happy to, but XSL files in the XSLT subdirectory are set to be ignored:

[Screenshot]

Also, we would need a corresponding script in https://github.com/UB-Mannheim/ocr-fileformat/tree/master/script/transform, correct?

I'll look into it!