OCR-D / ocrd_tesserocr

Run tesseract with the tesserocr bindings with @OCR-D's interfaces
MIT License

[WIP] Add deskewing per tesseract #34

Closed wrznr closed 5 years ago

wrznr commented 5 years ago

With the help of OSD, tesseract determines the skew angle for images. The wrapper applies this to pages and regions. It is not clear yet how to save the estimated skew angle in PAGE XML. Cf. https://github.com/PRImA-Research-Lab/PAGE-XML/issues/9
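For readers unfamiliar with the tesserocr side of this, here is a minimal sketch of how a page-level skew estimate might be read. It is not the code from this PR. It assumes that `AnalyseLayout()` returns a page iterator (or `None` when no layout is found) and that `Orientation()` yields a 4-tuple whose last element is the deskew angle in radians. The helper names `page_skew_degrees` and `estimate_page_skew` are hypothetical.

```python
import math


def page_skew_degrees(api):
    """Return the estimated skew in degrees, or None if no layout was found.

    `api` is expected to behave like tesserocr.PyTessBaseAPI with an image
    already set (assumption: AnalyseLayout() may return None).
    """
    layout = api.AnalyseLayout()
    if layout is None:
        return None
    # Orientation() -> (orientation, writing_direction, textline_order, deskew_angle)
    _orientation, _direction, _order, deskew_angle = layout.Orientation()
    # Tesseract reports the deskew angle in radians.
    return math.degrees(deskew_angle)


def estimate_page_skew(image_path):
    """Hypothetical convenience wrapper; needs tesserocr and OSD traineddata."""
    from tesserocr import PyTessBaseAPI, PSM  # imported lazily on purpose

    with PyTessBaseAPI(psm=PSM.AUTO_OSD) as api:
        api.SetImageFile(image_path)
        return page_skew_degrees(api)
```

How the resulting angle should be written to PAGE XML (attribute, sign convention) is the open question referenced above, so the sketch deliberately stops at returning the number.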

wrznr commented 5 years ago

@bertsky This means: if I iterate over the blocks and determine their skew, the first block will yield an identical value? I will check this. If it turns out to be the case, I will restrict the operation to blocks, okay?

bertsky commented 5 years ago

@wrznr Yes, exactly, that's what I would expect. And yes, RIL.BLOCK should be good (via a simple for loop on iterate_level, or using Next and IsAtFinalElement, as in recognize.py).
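A rough sketch of the block-wise iteration suggested here, not the code in recognize.py. It walks the page iterator with `Next()` only (the `IsAtFinalElement` variant is not shown) and collects one deskew angle per block. The helper names `iter_elements`, `block_skews_degrees` and `block_skews_for_image` are hypothetical, and the sketch assumes that `Orientation()` reports values for the block the iterator currently points at.

```python
import math


def iter_elements(page_it, level):
    """Yield `page_it` once per element at `level`, advancing with Next().

    Written out by hand so the control flow is visible; in practice
    tesserocr's iterate_level helper serves the same purpose.
    """
    if page_it is None:  # assumption: no layout found
        return
    yield page_it
    while page_it.Next(level):
        yield page_it


def block_skews_degrees(page_it, level):
    """Collect the deskew angle (in degrees) of every element at `level`."""
    skews = []
    for it in iter_elements(page_it, level):
        # Orientation() -> (orientation, writing_direction, textline_order, deskew_angle)
        _orientation, _direction, _order, deskew_angle = it.Orientation()
        skews.append(math.degrees(deskew_angle))
    return skews


def block_skews_for_image(image_path):
    """Hypothetical wrapper; needs tesserocr and OSD traineddata installed."""
    from tesserocr import PyTessBaseAPI, PSM, RIL  # imported lazily on purpose

    with PyTessBaseAPI(psm=PSM.AUTO_OSD) as api:
        api.SetImageFile(image_path)
        return block_skews_degrees(api.AnalyseLayout(), RIL.BLOCK)
```

Comparing the first entry of the returned list with the page-level estimate would be one way to check the question raised above.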

wrznr commented 5 years ago

_process_regions is already there!

bertsky commented 5 years ago

_process_regions is already there!

Right, this may work, too. (I still had on my mind what the API suggests: using the blocks from its PageIterator, not from an external layout analysis. We had a similar situation with text recognition BTW: results may be worse if the engine does not get to see the kinds of layout it was trained on. Maybe one should try to compare both modes empirically, just to be sure how robust this is?)

wrznr commented 5 years ago

@bertsky Is it necessary to update this PR, or do your forthcoming additions to ocrd_tesserocr already contain the proposed functionality?

bertsky commented 5 years ago

@wrznr I started out independently, but I could of course rebase on this PR and either add or squash and force-push. It keeps making more sense by the minute :-)

bertsky commented 5 years ago

Sorry, I rebased this on the current master, but was then not allowed to force-push here. So I started a new PR instead (#48), which supersedes this one.