ocropus / hocr-tools

Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.

ADD script to create a simplified version of hocr-files #152

Open JKamlah opened 4 years ago

JKamlah commented 4 years ago

A script to create a simplified version of hocr-files. It contains two main functions:

zuphilip commented 4 years ago

One use case is to make the hOCR output of tesseract and ocropy look more alike. In a complex workflow that previously used ocropy, you can then use tesseract + hocr-simplify instead.
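
To illustrate the kind of normalization this use case implies, here is a minimal sketch. It is not the script proposed in this issue. It assumes that one relevant difference is that tesseract nests `ocrx_word` spans inside each `ocr_line`, while a line-oriented consumer expects plain text directly inside the line element. The function name `flatten_lines` and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical hOCR fragment in the nested word-level style.
SAMPLE = """<div class="ocr_page">
<span class="ocr_line" title="bbox 10 10 200 40">
<span class="ocrx_word" title="bbox 10 10 90 40">Hello</span>
<span class="ocrx_word" title="bbox 100 10 200 40">world</span>
</span>
</div>"""


def flatten_lines(hocr_fragment):
    """Replace the word spans in every ocr_line with the joined line text.

    Illustrative only: the line's own attributes (such as its bbox) are kept,
    word-level bounding boxes are discarded.
    """
    root = ET.fromstring(hocr_fragment)
    # Collect the lines first so the tree is not modified while iterating.
    lines = [el for el in root.iter() if el.get("class") == "ocr_line"]
    for line in lines:
        words = []
        for el in line.iter():
            if el.get("class") == "ocrx_word":
                # Normalize whitespace inside each word element.
                text = " ".join("".join(el.itertext()).split())
                if text:
                    words.append(text)
        # Drop all child elements and keep only the plain line text.
        for child in list(line):
            line.remove(child)
        line.text = " ".join(words)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(flatten_lines(SAMPLE))
```

The sketch parses a namespace-free fragment. Full hOCR files are XHTML documents, so a real tool would also have to deal with the DOCTYPE, the XHTML namespace, and other element classes such as `ocr_par` or `ocr_carea`.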