Due to its line segmentation, ocropus sometimes inserts an `ocr_line` at the wrong position in the flow of elements, i.e. in the middle of another paragraph. The bounding box makes it clear that the line does not belong at this position.
Can we find some rules relating bounding boxes to reading order that catch such obvious(?) mistakes while still allowing complex layouts?
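
As a starting point for discussion, here is a minimal sketch of one possible rule, not a worked-out solution. It assumes the line boxes of one paragraph are already available in document order as `(x0, y0, x1, y1)` tuples, the same coordinate order as the hOCR `bbox` property. The names `Box`, `follows`, and `misplaced_lines` are hypothetical. The rule: within the same column (boxes overlap horizontally) a following line must start lower on the page; a jump to a box with no horizontal overlap is always allowed, so multi-column layouts pass. A line is reported only when its two neighbours fit together without it.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Bounding box in hOCR order: left, top, right, bottom."""
    x0: int
    y0: int
    x1: int
    y1: int


def h_overlap(a: Box, b: Box) -> bool:
    """True if the two boxes share some horizontal extent (same column)."""
    return min(a.x1, b.x1) > max(a.x0, b.x0)


def follows(a: Box, b: Box) -> bool:
    """True if b can plausibly come right after a in reading order."""
    if h_overlap(a, b):
        # Same column: b has to start below the top of a.
        return b.y0 > a.y0
    # Different column: allowed, to keep complex layouts possible.
    return True


def misplaced_lines(lines: List[Box]) -> List[int]:
    """Indices of lines that break the order although their neighbours agree."""
    suspects = []
    for i in range(1, len(lines) - 1):
        prev, cur, nxt = lines[i - 1], lines[i], lines[i + 1]
        neighbours_ok = follows(prev, nxt)
        cur_ok = follows(prev, cur) and follows(cur, nxt)
        if neighbours_ok and not cur_ok:
            suspects.append(i)
    return suspects


# A line from further down the page inserted into the middle of a paragraph.
paragraph = [
    Box(100, 100, 900, 130),
    Box(100, 140, 900, 170),
    Box(100, 600, 900, 630),  # out of place
    Box(100, 180, 900, 210),
    Box(100, 220, 900, 250),
]
print(misplaced_lines(paragraph))
```

This sketch only catches stray lines within the same column and never flags the first or last line of a paragraph; a stray line from a different column would need an additional rule.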
Related to #23