openpaperwork / pyocr

A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab
https://gitlab.gnome.org/World/OpenPaperwork/pyocr

hOCR: too much data is stripped #12

Open jflesch opened 10 years ago

jflesch commented 10 years ago

Even when using the LineBoxBuilder, it seems too much data is stripped from the hOCR files.
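
For anyone trying to see what is at stake here, below is a minimal sketch of how one might list the properties an hOCR file carries on each element, so they can be compared with what the builder output keeps. It does not use pyocr at all: it only relies on Python's standard `html.parser`. The sample fragment, `parse_title` and `HocrInspector` are hypothetical names made up for illustration, and the issue does not say which properties are the ones being lost.

```python
from html.parser import HTMLParser

# Hypothetical hOCR fragment, hand-written for illustration (not from the issue).
SAMPLE_HOCR = """
<span class='ocr_line' id='line_1' title='bbox 10 20 200 40; baseline 0 -3'>
  <span class='ocrx_word' id='word_1' title='bbox 10 20 90 40; x_wconf 93'>Hello</span>
  <span class='ocrx_word' id='word_2' title='bbox 100 20 200 40; x_wconf 87'>world</span>
</span>
"""


def parse_title(title):
    """Split an hOCR title attribute into {property_name: [values]}."""
    props = {}
    for part in title.split(";"):
        tokens = part.split()
        if tokens:
            props[tokens[0]] = tokens[1:]
    return props


class HocrInspector(HTMLParser):
    """Collect (class, id, properties) for every element whose class starts with 'ocr'."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class") or ""
        if css_class.startswith("ocr"):
            props = parse_title(attrs.get("title") or "")
            self.elements.append((css_class, attrs.get("id"), props))


inspector = HocrInspector()
inspector.feed(SAMPLE_HOCR)
for css_class, element_id, props in inspector.elements:
    print(css_class, element_id, sorted(props))
```

Anything listed here beyond the text and `bbox` (for example `baseline` or `x_wconf` in this made-up sample) is a candidate for data that a builder may or may not preserve.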