While experimenting with OCR-D workflows for tables, I noticed very high error rates reported by dinglehopper when using the `find_table=true` option of ocrd-tesserocr-recognize.
The reason was that dinglehopper did not consider `OrderedGroupIndexed` elements inside the `OrderedGroup` element when extracting text regions. As a consequence, the text regions belonging to table regions were not considered for text extraction.
This pull request fixes this by recursively adding the text regions referenced by nested `OrderedGroupIndexed` elements.
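
To illustrate the idea, here is a minimal, standalone sketch of such a recursive reading-order traversal. It is not the code from this pull request: the function names `region_ids_in_reading_order` and `extract_texts` are hypothetical, the sample document is heavily simplified (no `Coords`, so it is not schema-valid PAGE XML), and the namespace version is an assumption. It only uses the Python standard library.

```python
import xml.etree.ElementTree as ET

# Assumed PAGE namespace version; real files may use a different year.
NS_URI = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"
NS = {"page": NS_URI}

# Simplified sample: one plain text region plus a table whose cells are
# listed in a nested OrderedGroupIndexed element.
SAMPLE = f"""
<PcGts xmlns="{NS_URI}">
  <Page>
    <ReadingOrder>
      <OrderedGroup id="ro">
        <RegionRefIndexed index="0" regionRef="r1"/>
        <OrderedGroupIndexed index="1" id="ro_t1" regionRef="t1">
          <RegionRefIndexed index="0" regionRef="c1"/>
          <RegionRefIndexed index="1" regionRef="c2"/>
        </OrderedGroupIndexed>
      </OrderedGroup>
    </ReadingOrder>
    <TextRegion id="r1"><TextEquiv><Unicode>Heading</Unicode></TextEquiv></TextRegion>
    <TableRegion id="t1">
      <TextRegion id="c1"><TextEquiv><Unicode>cell A</Unicode></TextEquiv></TextRegion>
      <TextRegion id="c2"><TextEquiv><Unicode>cell B</Unicode></TextEquiv></TextRegion>
    </TableRegion>
  </Page>
</PcGts>
"""


def local_name(element):
    """Return the tag name without its '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def region_ids_in_reading_order(group):
    """Yield region ids in index order, descending into nested ordered groups."""
    children = [
        child for child in group
        if local_name(child) in ("RegionRefIndexed", "OrderedGroupIndexed")
    ]
    children.sort(key=lambda child: int(child.get("index")))
    for child in children:
        if local_name(child) == "RegionRefIndexed":
            yield child.get("regionRef")
        else:
            # Nested group (e.g. a table): recurse instead of skipping it.
            yield from region_ids_in_reading_order(child)


def extract_texts(xml_text):
    """Return the region-level texts of all TextRegions in reading order."""
    root = ET.fromstring(xml_text)
    # Index every TextRegion by id, including those nested in a TableRegion.
    regions = {r.get("id"): r for r in root.iter(f"{{{NS_URI}}}TextRegion")}
    top_group = root.find(".//page:ReadingOrder/page:OrderedGroup", NS)
    texts = []
    for region_id in region_ids_in_reading_order(top_group):
        region = regions.get(region_id)
        if region is None:
            continue  # the reference does not point to a TextRegion
        unicode_el = region.find("./page:TextEquiv/page:Unicode", NS)
        texts.append(unicode_el.text or "" if unicode_el is not None else "")
    return texts


print(extract_texts(SAMPLE))
```

A traversal that only looks at the direct `RegionRefIndexed` children of the top-level `OrderedGroup` would return just the heading here, so the table cells would be missing from the extracted text and counted as errors.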