OCR-D / ocrd_anybaseocr

DFKI Layout Detection for OCR-D
Apache License 2.0

deskew: respect PAGE coordinate consistency principle #47

Open bertsky opened 4 years ago

bertsky commented 4 years ago

In https://github.com/kba/ocrd_anybaseocr/blob/c65f67e3afc740d70acca18dc3d2c2b778d54d18/ocrd_anybaseocr/cli/ocrd_anybaseocr_deskew.py#L159, the rotation is applied without enlarging the image accordingly. This not only loses information (in the corners), but also violates our coordinate consistency principle. Subsequent processors will inevitably end up with coordinates that are off by some offset.
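
To illustrate the geometry, here is a minimal sketch in plain Python (standard library only). It is not the processor's code: the function names `rotated_canvas_size`, `to_rotated` and `to_original` are hypothetical, and the sign convention (positive angle = counter-clockwise on screen, rotation about the image centre) is an assumption. It shows how large the canvas must become so that no corner is cut off, and how a point can be mapped into the enlarged image and back without picking up an offset.

```python
import math


def rotated_canvas_size(width, height, angle_deg):
    """Smallest axis-aligned canvas that holds the whole rotated image."""
    a = math.radians(angle_deg)
    c, s = abs(math.cos(a)), abs(math.sin(a))
    # The small epsilon keeps float noise (e.g. cos(90 deg) ~ 6e-17)
    # from pushing an exact integer size up by one pixel.
    new_w = math.ceil(width * c + height * s - 1e-9)
    new_h = math.ceil(width * s + height * c - 1e-9)
    return new_w, new_h


def to_rotated(x, y, width, height, angle_deg):
    """Map a point of the original image into the enlarged rotated canvas.

    Assumed convention: rotation about the image centre, counter-clockwise
    on screen, with the y axis pointing down.
    """
    a = math.radians(angle_deg)
    new_w, new_h = rotated_canvas_size(width, height, angle_deg)
    dx, dy = x - width / 2, y - height / 2
    rx = dx * math.cos(a) + dy * math.sin(a)
    ry = -dx * math.sin(a) + dy * math.cos(a)
    # Re-centre on the enlarged canvas, not on the original one.
    return rx + new_w / 2, ry + new_h / 2


def to_original(x, y, width, height, angle_deg):
    """Inverse of to_rotated: map a point of the rotated canvas back."""
    a = math.radians(angle_deg)
    new_w, new_h = rotated_canvas_size(width, height, angle_deg)
    rx, ry = x - new_w / 2, y - new_h / 2
    dx = rx * math.cos(a) - ry * math.sin(a)
    dy = rx * math.sin(a) + ry * math.cos(a)
    return dx + width / 2, dy + height / 2


if __name__ == "__main__":
    w, h, angle = 200, 100, 10
    print("enlarged canvas:", rotated_canvas_size(w, h, angle))
    for corner in [(0, 0), (w, 0), (0, h), (w, h)]:
        print(corner, "->", to_rotated(*corner, w, h, angle))
```

If the canvas is kept at the original size instead, the corners fall outside it, and any coordinates detected on the rotated image can no longer be converted back consistently.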

kba commented 4 years ago

@EEngl52 in #51:

when deskewing with ocrd-anybaseocr-deskew the coordinates from the Alternative Image produced in this process seem to be used for all further steps. In the end, the coordinates of all regions, lines, words etc. are wrong, even though the text is fully detected. kit3.zip

bertsky commented 2 years ago

I suggest we simply drop the ocrd-anybaseocr-deskew processor. It is incorrect and offers no advantage over the earlier and better ocrd-cis-ocropy-deskew.