wrznr closed 5 years ago
Please note that in the meantime, ocrd_tesserocr implements a very naïve solution for cropping, which may nevertheless serve as an orientation for docanalysis. The coordinates are stored in the element Border rather than in PrintSpace, since Border more appropriately reflects the semantics of the cropping operation.
Thank you very much. We were looking for some reference for the full PAGE XML.
Just run the cropping procedure and you should be ready to go.
In addition, you do not have to write the XML directly. There is a full-fledged API for PAGE XML integrated in the OCR-D core module.
Cropping results are now saved in PAGE XML: 5095877c6566a4ffdbce6ee499405dfb8144c44c
Given the commit you linked above, I strongly doubt that the cropping results (or the binarization results) are actually saved in PAGE XML. You have to create a PAGE instance and modify the Border element. Cf. https://github.com/OCR-D/ocrd_tesserocr/blob/4a69ba1f899659d00dc8bf21756ff72bf63bad60/ocrd_tesserocr/crop.py#L159
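For orientation, here is a minimal sketch of what "modify the Border element" amounts to at the XML level. It uses only the Python standard library rather than the OCR-D core API, and it is not the code from the linked crop.py. The namespace version, the helper names box_to_points and set_border, and the sample file name and pixel values are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the 2019-07-15 PAGE namespace; adjust to the schema version in use.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"
ET.register_namespace("pc", PAGE_NS)


def box_to_points(x0, y0, x1, y1):
    """Format a crop box as a PAGE points string (clockwise from top-left)."""
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return " ".join(f"{x},{y}" for x, y in corners)


def set_border(page, x0, y0, x1, y1):
    """Replace the Border child of a Page element with the given crop box."""
    for old in page.findall(f"{{{PAGE_NS}}}Border"):
        page.remove(old)
    # Keep Border after any leading AlternativeImage children.
    index = len(page.findall(f"{{{PAGE_NS}}}AlternativeImage"))
    border = ET.Element(f"{{{PAGE_NS}}}Border")
    ET.SubElement(border, f"{{{PAGE_NS}}}Coords",
                  points=box_to_points(x0, y0, x1, y1))
    page.insert(index, border)
    return border


# Hypothetical page: file name and sizes are made up for illustration.
page = ET.Element(f"{{{PAGE_NS}}}Page", imageFilename="page_0001.tif",
                  imageWidth="2000", imageHeight="3000")
set_border(page, 120, 95, 1880, 2900)
print(ET.tostring(page, encoding="unicode"))
```

In a real processor the Page element would come from the existing PAGE file of the workspace, and the OCR-D core API would normally take care of parsing and serialization.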
Sorry, that was the wrong commit URL. Please check this commit: b6c983662dba7ba959b0ab4ed27b8047d70649a3. Cropping results are saved in PAGE XML, and we are using the Border element to save the crop coordinates.
We performed a first brief test of the cropping procedure, which worked pretty well. However, the output so far is only the cropped image. For integration with the subsequent processing steps, it is necessary to deliver the cropping results as coordinates in PAGE XML.
Cf. https://ocr-d.github.io/gt//pagexml_documentation/pagecontent_xsd_Complex_Type_pc_PrintSpaceType.html#PrintSpaceType_Coords
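To illustrate why coordinates matter for the subsequent steps, the following sketch shows how a downstream step might read the crop box back from PAGE XML. It uses only the Python standard library (3.8 or later, for the namespace wildcard), not the OCR-D core API. The helper names parse_points and crop_box and the sample document, including its namespace version and pixel values, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical PAGE fragment; namespace version and values are illustrative.
SAMPLE = """\
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page_0001.tif" imageWidth="2000" imageHeight="3000">
    <Border>
      <Coords points="120,95 1880,95 1880,2900 120,2900"/>
    </Border>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Turn a PAGE points string into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def crop_box(page_xml, element="Border"):
    """Return (x_min, y_min, x_max, y_max) of the element's Coords, or None."""
    root = ET.fromstring(page_xml)
    # "{*}" matches any namespace, so the PAGE schema version does not matter.
    coords = root.find(f".//{{*}}{element}/{{*}}Coords")
    if coords is None:
        return None
    xs, ys = zip(*parse_points(coords.get("points")))
    return min(xs), min(ys), max(xs), max(ys)


print(crop_box(SAMPLE))                # bounding box of Border
print(crop_box(SAMPLE, "PrintSpace"))  # None: the sample has no PrintSpace
```

The same function works for PrintSpace by passing element="PrintSpace", since both elements carry their outline in a Coords child.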