gsireesh / ht-max

Code for the HT-MAX project
Apache License 2.0

Grobid Parser Boxes Overlap #26

Open gsireesh opened 7 months ago

gsireesh commented 7 months ago

Currently, the reading order parser merges overlapping boxes, but it implicitly assumes that boxes overlap only in pairs, not in clusters of three or more. We need to handle that case so that a whole cluster is merged when necessary, and we should also investigate why we are ending up with so many clusters of overlapping boxes in the first place.
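
One possible approach to cluster merging, as a minimal sketch rather than the parser's actual code: it assumes boxes are axis-aligned `(x0, y0, x1, y1)` tuples, which may not match the representation used in this repo, and the names `overlaps`, `union`, and `merge_overlapping_boxes` are hypothetical. Each incoming box absorbs every already-merged box it overlaps, and the check repeats because the grown box can reach boxes it did not touch before.

```python
from typing import List, Tuple

# Assumed representation: (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
Box = Tuple[float, float, float, float]


def overlaps(a: Box, b: Box) -> bool:
    """True if the interiors of a and b intersect (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_overlapping_boxes(boxes: List[Box]) -> List[Box]:
    """Merge boxes until no two boxes in the result overlap."""
    merged: List[Box] = []  # invariant: pairwise non-overlapping
    for box in boxes:
        current = box
        changed = True
        while changed:
            changed = False
            remaining = []
            for other in merged:
                if overlaps(current, other):
                    # Absorb the overlapping box; current may grow.
                    current = union(current, other)
                    changed = True
                else:
                    remaining.append(other)
            merged = remaining
        merged.append(current)
    return merged


# A chain of three boxes: the first and third do not overlap each other
# directly, but both overlap the second, so all three form one cluster.
example = [(0, 0, 2, 2), (1, 1, 3, 3), (2.5, 2.5, 4, 4), (10, 10, 11, 11)]
result = merge_overlapping_boxes(example)
```

Logging the size of each cluster during the merge might also help with the second question, since it would show which pages or regions produce the large clusters.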