qurator-spk / sbb_binarization

Document Image Binarization
Apache License 2.0

quality in very low contrast regime #38

Open bertsky opened 2 years ago

bertsky commented 2 years ago

I have material with typewritten forms that is very challenging (for any binarization method), because the typewriter ink sometimes fades out, while the printing ink right next to it comes out in deep black. The scan/photograph also seems to produce a non-normalized histogram:

So it seems that the autoencoder gets confused by the normalized image, but benefits from making the image even darker. Could that be a general tendency (as in: if you lose foreground, make the image darker, and conversely, if you pick up background, make it brighter)? Can we derive any metrics from the intermediate activation between encoder and decoder that might hint at quality problems? Any recommendations/considerations?
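For anyone who wants to reproduce the two preprocessing variants compared above, here is a minimal sketch using only NumPy. The function names `stretch_contrast` and `adjust_gamma`, the percentile bounds, and the gamma value are my own assumptions for illustration; none of this is part of sbb_binarization, and the exact normalization I applied may differ.

```python
import numpy as np


def stretch_contrast(gray, low_pct=1.0, high_pct=99.0):
    """Linearly stretch a grayscale uint8 image so that the given
    percentiles map to 0 and 255 (hypothetical helper)."""
    values = gray.astype(np.float32)
    lo, hi = np.percentile(values, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch, return it unchanged.
        return gray.astype(np.uint8)
    scaled = (values - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255.0).round().astype(np.uint8)


def adjust_gamma(gray, gamma):
    """Apply a power-law curve to a grayscale uint8 image.
    gamma > 1 darkens, gamma < 1 brightens (hypothetical helper)."""
    norm = gray.astype(np.float32) / 255.0
    return (np.power(norm, gamma) * 255.0).round().astype(np.uint8)


# Synthetic low-contrast "page": gray values squeezed into 90..180.
rng = np.random.default_rng(0)
page = rng.integers(90, 181, size=(64, 64), dtype=np.uint8)

normalized = stretch_contrast(page)        # histogram-normalized variant
darker = adjust_gamma(normalized, 1.5)     # try when foreground gets lost
brighter = adjust_gamma(normalized, 0.7)   # try when background leaks in
```

Each variant would then be passed to the binarizer in place of the original image, so the outputs can be compared side by side.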