kba / hocr-spec

The hOCR Embedded OCR Workflow and Output Format
http://kba.github.io/hocr-spec/1.2/

Detail about bbox nesting #110

Open CorentinBrule opened 4 years ago

CorentinBrule commented 4 years ago

I'm developing a tool to view and edit hOCR files, and I'm wondering whether the spec includes rules about bbox nesting. Does the parent's rectangle have to be exactly the same size as the bounding box of its group of children? (For example: is the bbox of a line exactly the bbox of its words?)
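To make the two possible readings concrete, here is a minimal Python sketch. It assumes only the usual hOCR `title="bbox x0 y0 x1 y1; ..."` property syntax. The helper names (`parse_bbox`, `union`, `contains`) and the sample coordinates are made up for illustration, and the sketch does not claim that the spec requires either rule.

```python
import re

# Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from a title attribute string (hypothetical helper)."""
    m = BBOX_RE.search(title)
    if m is None:
        raise ValueError(f"no bbox in title: {title!r}")
    return tuple(int(v) for v in m.groups())


def union(boxes):
    """Smallest rectangle that encloses all the given boxes."""
    boxes = list(boxes)
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def contains(parent, child):
    """True if child lies entirely inside parent (edges may touch)."""
    return (
        parent[0] <= child[0]
        and parent[1] <= child[1]
        and parent[2] >= child[2]
        and parent[3] >= child[3]
    )


# Invented sample values: one line and its two words.
line = parse_bbox("bbox 10 20 210 60; baseline 0 -5")
words = [parse_bbox("bbox 12 22 90 58"), parse_bbox("bbox 100 24 208 60")]

hull = union(words)
is_tight = line == hull          # strict reading: parent equals the children's hull
is_loose = contains(line, hull)  # lenient reading: parent merely encloses the children
```

With these sample values the line encloses its words but is not identical to their union, so an editor has to decide which of the two checks to enforce.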

Another question (perhaps for a separate issue): is it compatible with the specification to wrap all space characters in an element or something similar, to highlight them and to guarantee that they survive editing of the hOCR file (auto-beautify, for example)?