Open beckstefan opened 4 years ago
AFAICT this processor tries to avoid textual noise via separator line detection. There are a couple of (crappy and badly documented) parameters for this (rular...), but IMHO your best shot here would be trying to increase the contrast so the binarized image shows a distinct, contiguous vertical line where the gutter/spine is.
Besides binarization settings, there is a second workflow detail that might help: If you deskew before cropping, these lines should be easier to detect.
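To check whether a given binarization actually yields such a contiguous vertical line, a column projection is a quick test. The following is only a minimal sketch in plain Python, not what the cropper does internally: the function names, the 0/1 list-of-rows image representation, and the 0.9 threshold are all assumptions made for illustration.

```python
from typing import List, Optional

# Assumption: a binarized image as rows of 0 (background) / 1 (ink).
Image = List[List[int]]


def column_fill(binary: Image) -> List[float]:
    """Return the fraction of ink pixels in each column."""
    height = len(binary)
    width = len(binary[0])
    return [sum(row[x] for row in binary) / height for x in range(width)]


def find_gutter_column(binary: Image, min_fill: float = 0.9) -> Optional[int]:
    """Return the column with the highest ink fraction, or None if even
    that column is below min_fill (hypothetical threshold)."""
    fill = column_fill(binary)
    best = max(range(len(fill)), key=fill.__getitem__)
    return best if fill[best] >= min_fill else None


# Good contrast: column 2 is ink in every row, so it is reported.
page = [
    [0, 0, 1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
]

# Poor contrast: the same line is broken in rows 1 and 3, so no column qualifies.
faint = [row[:] for row in page]
faint[1][2] = 0
faint[3][2] = 0

print(find_gutter_column(page))
print(find_gutter_column(faint))
```

On a skewed page the gutter's ink is smeared over several neighbouring columns, so no single column reaches the threshold; that is why deskewing first should make the line easier to find.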
@beckstefan is this gone with the reimplementation of the cropper?
(If you could post or link to the originals, I could run it...)
A DFG requirement when scanning is to show a part of the opposite page. On some pages this tends to be a problem, since anybaseocr-crop does not crop that text away, and later tools detect text/characters where they shouldn't. Here are two examples.
What would be a strategy to tackle this?