mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

Weird polygon shape with shapely>=2.0 #430

Closed (colibrisson closed this issue 1 year ago)

colibrisson commented 1 year ago

When using lib/segmentation/calculate_polygonal_environment with shapely>=2.0, some of the polygons have a weird shape:

[Image: Screenshot from 2023-02-07 11-23-26]

As previously reported in https://github.com/mittagessen/kraken/issues/319, there are also a lot of side location conflict exceptions.

Downgrading shapely to 1.7.x solved the issue. Shouldn't shapely be pinned to 1.7?

mittagessen commented 1 year ago

Which kraken version are you using? master shouldn't even work with pre-2.0 versions anymore because of all the API changes.

The issues in #319 are all resolved in master (or should be at least), I just forgot to close the issue.

colibrisson commented 1 year ago

I'm using master.

mittagessen commented 1 year ago

Urrrgh. Can you upload the source image and segmentation somewhere so I can look into it? It's probably something in GEOS, like a change in the coordinate order returned by some function or something along those lines.
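
One quick way to test the coordinate-order hypothesis is to compare ring orientation before and after normalizing it. The following is a minimal sketch that uses only shapely's own `orient` helper and the `is_ccw` ring property; the sample rectangle is made up for illustration and is not taken from the dataset in this issue.

```python
from shapely.geometry import Polygon
from shapely.geometry.polygon import orient

# Hypothetical sample: a rectangle whose exterior ring is listed clockwise.
clockwise = Polygon([(0, 0), (0, 10), (10, 10), (10, 0)])

# orient() returns a copy whose exterior ring runs counter-clockwise (sign=1.0).
normalized = orient(clockwise, sign=1.0)

# Orientation differs, but both describe the same area.
print(clockwise.exterior.is_ccw)
print(normalized.exterior.is_ccw)
print(clockwise.equals(normalized))
```

Running the same check on polygons produced under shapely 1.8 and 2.x would show whether the ring order changed between versions, assuming that is where the difference lies.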

colibrisson commented 1 year ago

@mittagessen I just sent you the dataset.

mittagessen commented 1 year ago

Can you tell me what exactly you're doing? I don't get any shapely warnings nor do the polygons look weird on current master with shapely 2.0.1 (when repolygonizing with segmentation_overlay.py).

mittagessen commented 1 year ago

I had a brain fart and the polygons are indeed broken. But I still don't see any of the warnings. From the looks of it, I assume the region of interest (ROI) construction is broken somehow. Doing it symbolically with a library like GEOS that changes behavior every second release really screws things up...

mittagessen commented 1 year ago

I just pinned shapely to 1.8.4 because I wasn't able to figure out why exactly the ROI is wrong with 2.x. I still don't see any of the warnings and it was the last bit needed for a new release so it will have to do for now.

bertsky commented 7 months ago

@mittagessen any news on this in 5.0? I can still see Shapely pinned to pre-2...

mittagessen commented 7 months ago

On 24/04/08 08:56AM, Robert Sachunsky wrote:

> @mittagessen any news on this in 5.0? I can still see Shapely pinned to pre-2...

It's been low priority for now (and tracking it down is involved) and I wanted to release the changes that accumulated for the last year first.

bertsky commented 7 months ago

Understood.

Note that Shapely 2.0 also makes life somewhat easier with these issues. I started using set_precision to tackle invalid paths after rounding. See here for a complete example. I have also spent a lot of time trying to make polygon handling more robust – let me know if you need help.
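
For readers unfamiliar with `set_precision`, here is a minimal sketch of the idea, not the linked example: it assumes shapely >= 2.0, and the outline coordinates are invented so that naive rounding produces a self-touching ring.

```python
import shapely
from shapely.geometry import Polygon

# Hypothetical outline: the notch at (5, 0.4) sits just above the bottom edge.
coords = [(0, 0), (10, 0), (10, 10), (5, 0.4), (0, 10)]
original = Polygon(coords)

# Naive rounding moves the notch onto the bottom edge, so the ring touches itself.
rounded = Polygon([(round(x), round(y)) for x, y in coords])

# set_precision snaps coordinates to a grid of size 1.0; in "valid_output"
# mode it also makes sure the snapped geometry is valid.
snapped = shapely.set_precision(original, grid_size=1.0, mode="valid_output")

print(original.is_valid, rounded.is_valid, snapped.is_valid)
print(snapped.geom_type)
```

Note that the snapped result may come back as a different geometry type (for example a MultiPolygon when a shape is pinched in two), so code that expects a single polygon has to decide how to handle that case.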

mittagessen commented 7 months ago

On 24/04/09 12:33AM, Robert Sachunsky wrote:

> Note that Shapely 2.0 also makes life somewhat easier with these issues. I started using set_precision to tackle invalid paths after rounding. See here for a complete example. I have also spent a lot of time trying to make polygon handling more robust – let me know if you need help.

Thanks. There's already quite a bit of code to work around corner cases, but the polygonizer runs into quite a few of them, so changes in GEOS library behavior can trigger crashes and vastly different output, as this bug report shows.

I'll have a look at your code and probably write a couple of tests for the polygonizer so we can start porting everything onto shapely 2.x.
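
As a rough idea of what such tests could assert, here is a sketch of a sanity check on a single line polygon. It does not call kraken at all: `polygon_problems`, its `tolerance` parameter, and the sample coordinates are hypothetical, and a real test would feed it the polygonizer's actual output.

```python
from shapely.geometry import LineString, Polygon


def polygon_problems(boundary, baseline, tolerance=1.0):
    """Return a list of problems found in one line polygon (empty if none)."""
    poly = Polygon(boundary)
    if not poly.is_valid:
        # Self-intersections and similar defects make later checks unreliable.
        return ["invalid polygon"]
    problems = []
    if poly.is_empty:
        problems.append("empty polygon")
    # The baseline should lie inside the polygon, give or take `tolerance`.
    if not poly.buffer(tolerance).contains(LineString(baseline)):
        problems.append("baseline outside polygon")
    return problems


# Hypothetical fixtures: a clean rectangle and a self-intersecting bowtie.
good = [(0, 0), (100, 0), (100, 20), (0, 20)]
bowtie = [(0, 0), (100, 20), (100, 0), (0, 20)]
baseline = [(5, 15), (95, 15)]

print(polygon_problems(good, baseline))
print(polygon_problems(bowtie, baseline))
```

Running checks like these against the same input under shapely 1.8 and 2.x would flag regressions such as the broken polygons reported here without depending on exact coordinates.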