ocropus-archive / DUP-ocropy

Python-based tools for document analysis and OCR
Apache License 2.0

Have pageseg output decimal rather than hex numbering #264

Closed: nickjwhite closed this 6 years ago

nickjwhite commented 6 years ago

This is useful as it means that normal alphabetical ordering of files works as expected. This allows one to do things like cat *txt >wholepage.txt to get a complete transcription of a page.
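As a rough illustration of the concatenation described above, here is a minimal Python sketch. It assumes one text file per line in a single page directory, with zero-padded decimal names such as 010001.txt; those names, the helper name concat_lines, and the output name are hypothetical and not taken from ocropy itself.

```python
from pathlib import Path


def concat_lines(page_dir, out_name="wholepage.txt"):
    """Concatenate per-line transcriptions in filename order.

    Assumption: file names are zero-padded decimal numbers of equal
    width, so plain string ordering matches line order.
    """
    page_dir = Path(page_dir)
    # Skip the output file so a second run does not include it.
    parts = sorted(
        (p for p in page_dir.glob("*.txt") if p.name != out_name),
        key=lambda p: p.name,
    )
    text = "".join(p.read_text(encoding="utf-8") for p in parts)
    (page_dir / out_name).write_text(text, encoding="utf-8")
    # Return the names in the order they were joined.
    return [p.name for p in parts]
```

Sorting by name explicitly keeps the result independent of the order in which the filesystem lists the files, which is the same property the shell glob relies on.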

zuphilip commented 6 years ago

See #184.

nickjwhite commented 6 years ago

> See #184.

My apologies, I hadn't seen that pull request. I'll cancel this one; yours does a better job (not least because it handles pseg / ocropus-hocr usage properly too).