UB-Mannheim / ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
https://digi.bib.uni-mannheim.de/ocr-fileformat/
MIT License

Pretty print option for CLI #118

Open zuphilip opened 4 years ago

zuphilip commented 4 years ago

Currently, the web GUI automatically pretty-prints its output, while in the CLI the user has little clue how to do the same. The help says that any Saxon parameter can be passed, but how to pretty-print with that is still not clear. To me, the !indent=yes used in the call from the web GUI is hard to discover. Thus, I suggest introducing a new parameter --pretty in ocr-transform which does that and then also shows up in the help of that CLI command.

kba commented 4 years ago

You can append Saxon CLI args after -- in the call to ocr-transform, e.g.

ocr-transform alto hocr in.alto out.hocr -- '!indent=yes'

Might be easier to just document that.
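
Until that is documented, a small shell wrapper could stand in for the proposed --pretty option. This is only a sketch: the function name ocr_transform_pretty and the OCR_TRANSFORM override variable are made up for illustration and are not part of ocr-fileformat. It just forwards its arguments and appends the -- '!indent=yes' from the call above.

```shell
# Hypothetical helper, not shipped with ocr-fileformat.
# OCR_TRANSFORM is an assumed override for the command name (handy for dry runs);
# it defaults to the real ocr-transform CLI.
ocr_transform_pretty() {
    # Forward all arguments unchanged, then pass the Saxon
    # serialization parameter after the `--` separator.
    "${OCR_TRANSFORM:-ocr-transform}" "$@" -- '!indent=yes'
}
```

Called as `ocr_transform_pretty alto hocr in.alto out.hocr`, this should run the same command as the example above. Setting `OCR_TRANSFORM=echo` prints the arguments instead of running the transformation.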