bertsky opened this issue 1 year ago

The only documentation for this kwarg is in the standalone CLI:

I find that second sentence very confusing (esp. around `otherwise`). So this means that binarization is attempted internally (when activated)? What steps of the pipeline are affected?

(Also, implementation-wise, it looks like binarization is repeated multiple times, without re-using the previous result...)

Can anything be said about how pretrained models would fare when passed (externally) binarized images?
As far as I understand (and please @vahidrezanezhad correct me), Eynollah will almost always produce a better result from a grayscale or color image than from a binarized image.
However, if the input image is "strongly dark or bright" (and this needs a bit more explanation), the user may try to get a better result by setting `input_binary` to true. In this case, Eynollah binarizes the image itself, so the user does not have to binarize it with another tool first. (Note: I would like to fully integrate sbb_binarization for this.)
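For illustration, a minimal sketch of enabling this through the Python API; the import path and keyword names here follow the CLI wrapper and may differ between versions:

```python
# A minimal sketch, assuming the Eynollah class and its input_binary
# keyword as exposed through the CLI wrapper; the import path and
# parameter names may differ between versions.
from qurator.eynollah.eynollah import Eynollah

eynollah = Eynollah(
    image_filename="page.tif",    # e.g. a strongly dark or bright scan
    dir_out="output",
    dir_models="models_eynollah",
    input_binary=True,            # let Eynollah binarize internally first
)
eynollah.run()
```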
> I find that second sentence very confusing (esp. around `otherwise`).
Agreed, we will try and reformulate this for better clarity.
> What steps of the pipeline are affected?
@vahidrezanezhad should be able to answer this.
> it looks like binarization is repeated multiple times, without re-using the previous result
We will also check this with regard to performance.
> Can anything be said about how pretrained models would fare when passed (externally) binarized images?
The only thing I can say is that it would be an interesting experiment to evaluate this :) But I am afraid doing this properly (per step, with different binarization methods/models and good metrics for OCR and layout) would require a lot of effort, and it would only be relevant for the few images of bad quality.
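If someone does want to try, a rough sketch of preparing differently binarized variants of the same page; the file names and the choice of thresholding methods are purely illustrative, and each variant would then still have to be run through the pipeline and scored:

```python
# Illustrative only: produce differently binarized variants of one page.
# File names and thresholding choices are assumptions for the experiment
# idea, not an existing evaluation harness.
import numpy as np
from PIL import Image
from skimage.filters import threshold_otsu, threshold_sauvola

gray = np.array(Image.open("page.tif").convert("L"))

variants = {
    "gray": gray,  # baseline: no binarization at all
    "otsu": ((gray > threshold_otsu(gray)) * 255).astype(np.uint8),
    "sauvola": ((gray > threshold_sauvola(gray, window_size=25)) * 255).astype(np.uint8),
}

for name, img in variants.items():
    Image.fromarray(img).save(f"page.{name}.png")
```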
Ok, then (besides reformulating the description) I highly recommend renaming that option, e.g. `apply_binarization`: after all, it is not the input that must/can be binary, but an internal step that is performed.
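A hypothetical sketch of what such a rename could look like in click (which the standalone CLI uses), keeping the old flag as a deprecated alias; the option names are suggestions, not the actual CLI:

```python
# Hypothetical sketch of the proposed rename; names are suggestions,
# not the actual Eynollah CLI.
import click

@click.command()
@click.option(
    "--apply_binarization", "--input_binary", "-ib",
    "apply_binarization",
    is_flag=True,
    help="binarize internally before analysis (formerly --input_binary)",
)
def main(apply_binarization):
    click.echo(f"internal binarization: {apply_binarization}")

if __name__ == "__main__":
    main()
```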
Integrating sbb_binarization / experimenting with external tools: the OCR-D way would be to just use whatever derived images with `binarized` in `@comments` can be found, i.e. whatever binarization has already been applied in the workflow. So whether it is sbb_binarization or any other tool – it would be up to the user to decide and experiment. (But if the internal binarizer here is different from sbb_binarize and perhaps better, then it gets more complicated...)
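For illustration, a sketch of that pattern inside an OCR-D processor's process() method; the processor shell and names are hypothetical, but image_from_page with feature_selector is the actual OCR-D workspace API:

```python
# Sketch of the OCR-D way: pick up whatever derived image already carries
# "binarized" in @comments, regardless of which tool produced it earlier
# in the workflow. The processor shell here is hypothetical.
from ocrd import Processor
from ocrd_modelfactory import page_from_file

class EynollahProcessor(Processor):
    def process(self):
        for input_file in self.input_files:
            pcgts = page_from_file(self.workspace.download_file(input_file))
            page = pcgts.get_Page()
            page_image, page_coords, _ = self.workspace.image_from_page(
                page, input_file.pageId,
                feature_selector="binarized",  # require a binarized derivative
            )
            # ... run layout analysis on page_image
```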
Let me first confirm the above, and then we can rename the option, ideally also consistently with scaling, enhancing and resizing.
> As far as I understand (and please @vahidrezanezhad correct me), Eynollah will almost always produce a better result from a grayscale or color image than from a binarized image.
This is exactly the case. Our best performance is achieved with a grayscale or color image.
> (Also, implementation-wise, it looks like binarization is repeated multiple times, without re-using the previous result...)
I will check it. By the way, it should not be executed multiple times.
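For reference, a minimal sketch of the "binarize once, reuse everywhere" idea; the class and method names are illustrative, not Eynollah's actual internals:

```python
# Minimal sketch of caching the binarization result; class, attribute and
# method names are illustrative, not Eynollah's actual internals.
class LayoutAnalyzer:
    def __init__(self, image):
        self.image = image
        self._binarized = None  # cache for the binarization result

    def _run_binarizer(self, image):
        ...  # placeholder for the model-based binarization

    def binarized(self):
        if self._binarized is None:        # compute only on first access
            self._binarized = self._run_binarizer(self.image)
        return self._binarized             # later steps reuse the cache
```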
> Integrating sbb_binarization / experimenting with external tools: the OCR-D way would be to just use whatever derived images with `binarized` in `@comments` can be found, i.e. whatever binarization has already been applied in the workflow. So whether it is sbb_binarization or any other tool – it would be up to the user to decide and experiment. (But if the internal binarizer here is different from sbb_binarize and perhaps better, then it gets more complicated...)
The internal binarizer uses the same models as sbb_binarization.
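An externally produced result should then be equivalent in principle; a sketch of invoking it from Python, where the import path and SbbBinarizer signatures are assumptions that may differ between releases of sbb_binarization:

```python
# Sketch of producing the equivalent result externally; the import path
# and SbbBinarizer signatures are assumptions, not the confirmed API.
import numpy as np
from PIL import Image
from sbb_binarize.sbb_binarize import SbbBinarizer

binarizer = SbbBinarizer("models_binarization")  # the shared model weights
img = np.array(Image.open("page.tif"))
bin_img = binarizer.run(image=img, use_patches=True)
Image.fromarray(bin_img.astype(np.uint8)).save("page.bin.png")
```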