Open wrznr opened 5 years ago
Please note that, according to the PAGE XML specs, skew is to be stored in the orientation attribute of the corresponding region.
@wrznr Commit 7bf45dc8db9405786e64b23873eaf3e554c41ef8 adds the deskewing orientation angle. Since deskewing is done at page level, the orientation is stored on a single TextRegion. Please let us know if any changes are required.
Many thanks for your efforts!
For the required changes, cf. https://github.com/bertsky/cis-ocrd-py/pull/1 and https://github.com/OCR-D/ocrd_tesserocr/pull/62 for examples of how to store deskewing results according to the OCR-D specs.
We ran a first brief test of the deskewing procedure. There seems to be a misunderstanding regarding the result of the processing step: deskewing is not supposed to return binarized, deskewed images. Rather, it should return the detected rotation angle (which should be fairly straightforward, since the image is rotated by this angle anyway). In particular, the implied binarization is a severe problem for the subsequent processing steps. You may also want to check the OCR-D workflow scheme delivered with the original module project call for a broader picture: http://www.dfg.de/download/pdf/foerderung/programme/lis/170306_ausschreibung_verfahren_volldigitalisierung.pdf (last page)
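To illustrate what "return the angle instead of an image" could look like, here is a minimal sketch that writes a detected skew angle into the orientation attribute of a TextRegion using only the Python standard library. It is not the code from the linked pull requests: the helper name `annotate_skew`, the sample document, and the namespace version are assumptions for illustration, and the sign convention of the angle should be checked against the PAGE XML and OCR-D specs.

```python
import xml.etree.ElementTree as ET

# Assumption: the 2019-07-15 PAGE namespace; use whichever version your files declare.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"


def annotate_skew(page_xml: str, region_id: str, angle: float) -> str:
    """Return page_xml with orientation=<angle> set on the TextRegion with the given id.

    Hypothetical helper: only the annotation is changed, no image is touched.
    """
    # Keep the PAGE namespace as the default namespace when serializing.
    ET.register_namespace("", PAGE_NS)
    root = ET.fromstring(page_xml)
    for region in root.iter(f"{{{PAGE_NS}}}TextRegion"):
        if region.get("id") == region_id:
            # Store the angle in degrees as a plain decimal string.
            region.set("orientation", f"{angle:.3f}")
            break
    else:
        raise ValueError(f"no TextRegion with id {region_id!r}")
    return ET.tostring(root, encoding="unicode")


# Made-up minimal document with one page-sized region, as described in the thread.
SAMPLE = (
    f'<PcGts xmlns="{PAGE_NS}">'
    '<Page imageFilename="page.tif" imageWidth="100" imageHeight="100">'
    '<TextRegion id="r1"><Coords points="0,0 100,0 100,100 0,100"/></TextRegion>'
    "</Page></PcGts>"
)

if __name__ == "__main__":
    print(annotate_skew(SAMPLE, "r1", -1.25))
```

The point of the sketch is that the original image reference stays untouched and only the annotation changes, so later steps can decide for themselves whether and how to rotate or binarize.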