ocropus / hocr-tools

Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.

ADD script to create a simplified version of hocr-files #152

Open JKamlah opened 5 years ago

JKamlah commented 5 years ago

A script to create a simplified version of hOCR files. It contains two main functions:

zuphilip commented 5 years ago

One use case is to make the hOCR output of tesseract and ocropy look more alike. Then, in a complex workflow that previously relied on ocropy, you can use tesseract + hocr-simplify instead.
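
To make the use case concrete, here is a minimal sketch of one kind of simplification. It is not the script from this pull request. The function name `simplify_lines` is hypothetical, and it assumes that the relevant difference is word-level `ocrx_word` spans nested inside `ocr_line` elements, which a line-oriented consumer may not expect. It also assumes the input is well-formed XHTML.

```python
import xml.etree.ElementTree as ET

# Keep XHTML as the default namespace so serialised output has no "ns0:" prefixes.
ET.register_namespace("", "http://www.w3.org/1999/xhtml")


def has_class(elem, name):
    """Return True if the element's class attribute contains `name`."""
    return name in elem.get("class", "").split()


def simplify_lines(hocr_text):
    """Hypothetical helper: collapse word spans in each ocr_line into plain text."""
    root = ET.fromstring(hocr_text)
    # Collect the lines first so the tree is not modified while iterating over it.
    lines = [e for e in root.iter() if has_class(e, "ocr_line")]
    for line in lines:
        words = [
            "".join(w.itertext()).strip()
            for w in line.iter()
            if has_class(w, "ocrx_word")
        ]
        if not words:
            continue  # already line-level, nothing to collapse
        # Drop the child elements and keep only the joined text.
        # The line's own attributes (such as its bbox in `title`) are untouched.
        for child in list(line):
            line.remove(child)
        line.text = " ".join(words)
    return ET.tostring(root, encoding="unicode")
```

Word-level bounding boxes and confidences are lost in this sketch, so it only suits workflows that consume line-level text and coordinates.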