qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0

PAGE-XML coordinates can have self-intersections #20

Closed by bertsky 1 year ago

bertsky commented 3 years ago

On this image, eynollah produces polygons that are invalid:

ERROR processor.ExtractPages - Page "PHYS_0002" ImageRegion "r91" Self-intersection[2151 3197]
ERROR processor.ExtractPages - Page "PHYS_0002" ImageRegion "r92" Self-intersection[1605 99]
The incriminated data is here

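For readers who want to check a region outline by hand, here is a minimal sketch of a self-intersection test in plain Python. It assumes the outline is given in the usual PAGE-XML `points` format (`x1,y1 x2,y2 ...`). The function names are hypothetical, and this is not the check used by the processor that logged the errors above (validators typically delegate this to a geometry library). It only tests non-adjacent edges, so a spike where two neighbouring edges fold back onto each other is not detected.

```python
from itertools import combinations


def parse_points(points):
    """Parse a PAGE-XML style 'x1,y1 x2,y2 ...' string into (x, y) tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, -1, or 0 if collinear."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def _on_segment(a, b, p):
    """True if p (already known to be collinear with a-b) lies between a and b."""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, q1, q2):
    """True if the closed segments p1-p2 and q1-q2 share at least one point."""
    o1, o2 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    o3, o4 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other segment.
    return ((o1 == 0 and _on_segment(p1, p2, q1))
            or (o2 == 0 and _on_segment(p1, p2, q2))
            or (o3 == 0 and _on_segment(q1, q2, p1))
            or (o4 == 0 and _on_segment(q1, q2, p2)))


def has_self_intersection(polygon):
    """Check every pair of non-adjacent edges of the ring for an intersection."""
    if len(polygon) > 1 and polygon[0] == polygon[-1]:
        polygon = polygon[:-1]  # drop an explicitly repeated closing point
    n = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    for i, j in combinations(range(n), 2):
        # Neighbouring edges always share a vertex, so skip them.
        if j == i + 1 or (i == 0 and j == n - 1):
            continue
        if segments_intersect(*edges[i], *edges[j]):
            return True
    return False


bowtie = parse_points("0,0 10,10 10,0 0,10")   # two edges cross in the middle
square = parse_points("0,0 10,0 10,10 0,10")   # simple, valid outline
print(has_self_intersection(bowtie), has_self_intersection(square))
```

The pairwise comparison is quadratic in the number of vertices, which is usually fine for a single region outline but slow for very long contours.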

cneud commented 2 years ago

Could you please check whether this is still the case with the example image and the current version, @vahidrezanezhad?

vahidrezanezhad commented 1 year ago

> On this image, eynollah produces polygons that are invalid:
>
> ERROR processor.ExtractPages - Page "PHYS_0002" ImageRegion "r91" Self-intersection[2151 3197]
> ERROR processor.ExtractPages - Page "PHYS_0002" ImageRegion "r92" Self-intersection[1605 99]
>
> The incriminated data is here

I couldn't reproduce this error anymore. I tried both the main regions and the -fl option, and in either case the highest region number was r90 and no error occurred. I would appreciate it if you could check it again, dear @bertsky.
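As a rough illustration of how one might confirm the highest region number in an output file, the sketch below collects the `id` attributes of all `*Region` elements in a PAGE-XML document. It assumes the ids follow the `rN` pattern seen in the log lines above. The sample document and the function names are made up for the example and are not eynollah output.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature PAGE-XML document, for illustration only.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="example.png" imageWidth="100" imageHeight="100">
    <ImageRegion id="r1"><Coords points="0,0 10,0 10,10 0,10"/></ImageRegion>
    <TextRegion id="r90"><Coords points="20,20 30,20 30,30 20,30"/></TextRegion>
  </Page>
</PcGts>
"""


def region_ids(page_xml):
    """Return the ids of all elements whose tag name ends in 'Region'."""
    root = ET.fromstring(page_xml.strip())
    ids = []
    for elem in root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # strip the '{namespace}' prefix
        if tag.endswith("Region") and "id" in elem.attrib:
            ids.append(elem.attrib["id"])
    return ids


def highest_region_number(ids):
    """Return the largest N among ids of the form 'rN', or None if there is none."""
    matches = (re.fullmatch(r"r(\d+)", i) for i in ids)
    numbers = [int(m.group(1)) for m in matches if m]
    return max(numbers, default=None)


print(region_ids(SAMPLE), highest_region_number(region_ids(SAMPLE)))
```

Matching on the local tag name keeps the sketch independent of the exact PAGE schema version declared in the file.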

bertsky commented 1 year ago

Indeed, I cannot reproduce it myself anymore. I believe the last version was 0.0.11, so I assume this has been fixed at some point.