Use of image from --allow-enhancement option in OCR-D workflow

qurator-spk / eynollah

Document Layout Analysis

Apache License 2.0


Use of image from --allow-enhancement option in OCR-D workflow #81

Closed sjscotti closed 1 year ago

sjscotti commented 2 years ago

Hi, a question: is the image produced by the --allow-enhancement option intended to be used in an OCR-D workflow? In some cases, I believe the resulting image has dimensions that differ from those of the input image. If it is intended to be used in OCR-D, what is the suggested workflow (commands and parameters) that keeps the OCR results correctly overlaid on the image? Thanks!

vahidrezanezhad commented 2 years ago

@sjscotti As far as I know, no. The enhanced image produced with --allow-enhancement cannot be used by the OCR-D workflow (@kba).

kba commented 2 years ago

A question. Is the --allow-enhancement option resultant image intended to be used in an OCR-D workflow?

No, we currently do not use the intermediate results of eynollah, only the resulting PAGE-XML, on the assumption that the original image (and its dimensions) remains unchanged.
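
To illustrate why changed dimensions matter for the overlay, here is a minimal sketch, not part of eynollah or OCR-D. It assumes that region coordinates are available as a PAGE-XML style points string ("x1,y1 x2,y2 ...") measured on an image of one size, and that they should be drawn on an image of a different size. The helper names `dimensions_match` and `scale_points` are hypothetical, and whether such rescaling is appropriate for eynollah's enhanced image has not been confirmed in this thread.

```python
def dimensions_match(page_size, image_size):
    """Return True if the (width, height) recorded for the page equals the image size."""
    return tuple(page_size) == tuple(image_size)


def scale_points(points, src_size, dst_size):
    """Rescale a points string from src_size to dst_size.

    points   -- "x1,y1 x2,y2 ..." with integer pixel coordinates
    src_size -- (width, height) of the image the coordinates refer to
    dst_size -- (width, height) of the image to overlay them on
    Both sizes are assumptions supplied by the caller (hypothetical helper).
    """
    sx = dst_size[0] / src_size[0]  # horizontal scale factor
    sy = dst_size[1] / src_size[1]  # vertical scale factor
    scaled = []
    for pair in points.split():
        x, y = pair.split(",")
        # Round to whole pixels, since PAGE-XML points are integers.
        scaled.append(f"{round(int(x) * sx)},{round(int(y) * sy)}")
    return " ".join(scaled)


if __name__ == "__main__":
    original = (1000, 2000)   # assumed size of the input image
    enhanced = (2000, 4000)   # assumed size of a differently sized image
    if not dimensions_match(original, enhanced):
        print(scale_points("100,200 300,400", original, enhanced))
```

If the sizes already match, no rescaling is needed and the coordinates can be overlaid directly.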

sjscotti commented 2 years ago

Do you think that the enhancement approaches used in eynollah should be replicated outside of it to be used in an OCR workflow, or do you think the OCR-D workflow enhancement steps produce a better quality starting image to do OCR on?

vahidrezanezhad commented 2 years ago

Do you think that the enhancement approaches used in eynollah should be replicated outside of it to be used in an OCR workflow, or do you think the OCR-D workflow enhancement steps produce a better quality starting image to do OCR on?

The impact of the enhanced image on OCR has not been assessed yet. However, I think that for low-quality documents the enhanced images may improve OCR performance.