Deliver cropping results as coordinates in PAGE XML

syedsaqibbukhari / docanalysis

Apache License 2.0


Deliver cropping results as coordinates in PAGE XML #14

Closed wrznr closed 5 years ago

wrznr commented 5 years ago

We performed a first brief test of the cropping procedure, which worked pretty well. However, the output is currently only the cropped image. For integration with the subsequent processing steps, the cropping results need to be delivered as coordinates in PAGE XML, for example:

<PrintSpace>
    <Coords points="101,232 932,232 932,1794 101,1794"/>
</PrintSpace>

Cf. https://ocr-d.github.io/gt//pagexml_documentation/pagecontent_xsd_Complex_Type_pc_PrintSpaceType.html#PrintSpaceType_Coords
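For illustration, the `points` attribute in the snippet above can be derived from an axis-aligned crop box. This is only a minimal sketch: the helper name `bbox_to_points` and the `(x0, y0, x1, y1)` argument order are assumptions made for this example and are not taken from docanalysis.

```python
def bbox_to_points(x0: int, y0: int, x1: int, y1: int) -> str:
    """Return a PAGE-style points string for an axis-aligned box.

    Corners are listed clockwise starting at the top-left,
    matching the order used in the PrintSpace example.
    """
    if x1 <= x0 or y1 <= y0:
        raise ValueError("expected x0 < x1 and y0 < y1")
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return " ".join(f"{x},{y}" for x, y in corners)


print(bbox_to_points(101, 232, 932, 1794))
# prints: 101,232 932,232 932,1794 101,1794
```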

wrznr commented 5 years ago

Please note that in the meantime, ocrd_tesserocr implements a very naïve solution for cropping, which may nevertheless serve as a reference for docanalysis. The coordinates are stored in the Border element rather than in PrintSpace, since Border more appropriately reflects the semantics of the cropping operation.

mjenckel commented 5 years ago

Thank you very much. We were looking for some reference for the full PAGE XML.

wrznr commented 5 years ago

Just run the cropping procedure and you should be ready to go.

wrznr commented 5 years ago

In addition, you do not have to write the XML directly. There is a full-fledged API for PAGE XML integrated into the OCR-D core module.

n00blet commented 5 years ago

Cropping results are now saved in PAGE XML; see commit 5095877c6566a4ffdbce6ee499405dfb8144c44c.

wrznr commented 5 years ago

Given the commit you linked above, I strongly doubt that the cropping results (or the binarization results) are actually saved in PAGE XML. You have to create a PAGE instance and modify the border element. Cf. https://github.com/OCR-D/ocrd_tesserocr/blob/4a69ba1f899659d00dc8bf21756ff72bf63bad60/ocrd_tesserocr/crop.py#L159
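One quick way to check whether an output file really carries the cropping result is to look for a Border element with Coords under Page. The following is a small sketch using the standard library only (the `{*}` namespace wildcard needs Python 3.8 or later). The function name `border_points` and both sample documents are made up for this example.

```python
import xml.etree.ElementTree as ET


def border_points(xml_text: str):
    """Return the points of Page/Border/Coords, or None if there is no Border."""
    root = ET.fromstring(xml_text)
    # "{*}" matches any namespace, so the PAGE schema version does not matter.
    coords = root.find(".//{*}Page/{*}Border/{*}Coords")
    return None if coords is None else coords.get("points")


NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"

# Hypothetical output that contains the crop as a Border element.
with_border = (
    f'<PcGts xmlns="{NS}"><Page imageFilename="page.tif">'
    '<Border><Coords points="101,232 932,232 932,1794 101,1794"/></Border>'
    "</Page></PcGts>"
)
# Hypothetical output where only the image was cropped.
without_border = f'<PcGts xmlns="{NS}"><Page imageFilename="page.tif"/></PcGts>'

print(border_points(with_border))
print(border_points(without_border))
```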

n00blet commented 5 years ago

Sorry, that was the wrong commit URL. Please check commit b6c983662dba7ba959b0ab4ed27b8047d70649a3 instead. Cropping results are saved in PAGE XML, and we are using the Border element to save the crop coordinates.