Closed by johnlockejrr 4 months ago
Looks like shapely.
The polygon is too big, and the recognizer wasn't trained on lines where the letters are only a quarter of the line height.
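One rough way to put a number on "too big" is to compare the vertical extent of the boundary polygon for the same line as produced by the two versions. This is only a sketch: the function names and the sample coordinates below are made up for illustration, and the vertical extent is only a meaningful measure for roughly horizontal lines.

```python
def vertical_extent(polygon):
    """Height in pixels of a polygon given as [(x, y), ...]."""
    ys = [y for _, y in polygon]
    return max(ys) - min(ys)


def height_ratio(new_polygon, old_polygon):
    """How many times taller the new polygon is than the old one."""
    return vertical_extent(new_polygon) / vertical_extent(old_polygon)


# Hypothetical boundary polygons for the same text line.
old = [(0, 100), (500, 100), (500, 140), (0, 140)]  # 40 px tall
new = [(0, 40), (500, 40), (500, 200), (0, 200)]    # 160 px tall

print(height_ratio(new, old))  # 4.0
```

A ratio around 4 would match the "letters are only a quarter of the line height" situation described above.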
Here it is on kraken 4.x, same model.
The models I used in this test: mcdonald.zip
It isn't a model issue; the polygonization is wrong. I'll have a look. The rotation code changed between 4.x and 5.x, so it's either that or other shapely shenanigans.
Could you also send me the image file and any ALTO/PageXML you've got? It's difficult to debug without being able to run a test case.
Sure, here is the image with ALTO (from 4.x): export_doc23_memar_marqah_mcdonald_alto_202405131147.zip
Thanks. It's mostly so I can make sure the baselines are identical.
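For anyone wanting to run the same check locally, here is a minimal sketch that compares the baselines in two ALTO exports using only the standard library. It assumes the baselines are stored in the `BASELINE` attribute of `TextLine` elements and that line IDs match between the two files; neither is guaranteed for every export, and the function names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET


def read_baselines(alto_xml):
    """Map each TextLine ID to its list of baseline coordinates."""
    root = ET.fromstring(alto_xml)
    baselines = {}
    for elem in root.iter():
        # Compare the local tag name so any ALTO namespace version matches.
        if elem.tag.rsplit('}', 1)[-1] != 'TextLine':
            continue
        if 'BASELINE' not in elem.attrib:
            continue
        # Accept both "x y x y" and "x,y x,y" coordinate layouts.
        coords = [float(v) for v in
                  re.findall(r'-?\d+(?:\.\d+)?', elem.attrib['BASELINE'])]
        line_id = elem.get('ID') or f'line_{len(baselines)}'
        baselines[line_id] = coords
    return baselines


def differing_baselines(alto_a, alto_b, tol=0.0):
    """Return the IDs of lines whose baselines differ by more than tol."""
    a, b = read_baselines(alto_a), read_baselines(alto_b)
    diffs = []
    for line_id in sorted(set(a) | set(b)):
        ca, cb = a.get(line_id), b.get(line_id)
        if (ca is None or cb is None or len(ca) != len(cb)
                or any(abs(x - y) > tol for x, y in zip(ca, cb))):
            diffs.append(line_id)
    return diffs
```

Call `differing_baselines(open(path_a).read(), open(path_b).read())` on the two exports; an empty list means the baselines agree, so any remaining difference would have to come from the polygonization or recognition step.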
Any update on this matter?
Apparently, the error persists on some other image data.
Nope, not true after all. Just crappy output of the polygonizer.
I'm not sure if it is eScriptorium or kraken related. I just want to point out: same model, same image, on different installs:
Both segmentation and recognition models were trained on kraken 5.2.4