OCR-D / ocrd_anybaseocr

DFKI Layout Detection for OCR-D
Apache License 2.0

Block segmentation produces almost always empty pages #38

Open wrznr opened 4 years ago

wrznr commented 4 years ago

I am running the following workflow on https://digital.slub-dresden.de/werkansicht/dlf/87237/1/ (with https://digital.slub-dresden.de/data/kitodo/adrefudio_20253082Z_1907/adrefudio_20253082Z_1907_mets.xml):

  1. Cropping (ocrd-anybaseocr-crop)
  2. Binarization (ocrd-anybaseocr-binarize)
  3. Segmentation (ocrd-anybaseocr-block-segmentation)
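For readers who want to reproduce the chain, here is a minimal sketch of how the three steps could be turned into command lines. The processor names come from the list above; the file group names (`OCR-D-IMG`, `OCR-D-CROP`, `OCR-D-BIN`, `OCR-D-BLOCK`), the `mets.xml` path and the helper `build_commands` are hypothetical placeholders, and the `-m`/`-I`/`-O` options are assumed to follow the usual OCR-D command-line convention. The sketch only builds and prints the commands, it does not run them.

```python
import shlex

# Processor names are taken from the workflow above.
# The file group names are hypothetical placeholders, not the ones
# used in the original report.
STEPS = [
    ("ocrd-anybaseocr-crop", "OCR-D-IMG", "OCR-D-CROP"),
    ("ocrd-anybaseocr-binarize", "OCR-D-CROP", "OCR-D-BIN"),
    ("ocrd-anybaseocr-block-segmentation", "OCR-D-BIN", "OCR-D-BLOCK"),
]


def build_commands(steps, mets="mets.xml"):
    """Return one argument list per step.

    Assumes the common OCR-D options: -m (METS file),
    -I (input file group) and -O (output file group).
    """
    return [
        [processor, "-m", mets, "-I", in_grp, "-O", out_grp]
        for processor, in_grp, out_grp in steps
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before running them.
    for cmd in build_commands(STEPS):
        print(shlex.join(cmd))
```

Each step reads the file group written by the previous one, so changing the order of the chain only means editing `STEPS`.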

For most pages, the block segmentation finds only a few blocks, and very often none at all. The blocks that are found do not correspond to a comprehensible segmentation: often it is only the page number or some non-block element. Consider, for example, FILE_0039_BIN-IMG.

The only “block” which is found by the block segmentation is: FILE_0039_DFKIBS-IMG,DFKIBS-IMG-IMG_0

I would be very grateful if you could give me some hints on how to improve this result. Maybe you could even try to process this book in your own environment to make sure that nothing is amiss with my setup.
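To put a number on "few or no blocks", one could count the regions per page in the segmentation output. The following is only a rough sketch under the assumption that the output is available as PAGE-XML; the helper `count_regions` is hypothetical. It counts every element whose local name ends in `Region` and ignores the namespace, so it should not depend on a specific PAGE schema version.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_regions(page_xml_text):
    """Count elements whose local name ends in 'Region' (e.g. TextRegion).

    Assumption: the input is the text of a PAGE-XML file. The namespace
    is stripped from each tag so the schema version does not matter.
    """
    root = ET.fromstring(page_xml_text)
    counts = Counter()
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name.endswith("Region"):
            counts[local_name] += 1
    return counts


if __name__ == "__main__":
    # Tiny made-up example document, not taken from the book in question.
    sample = (
        '<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">'
        '<Page imageFilename="example.png">'
        '<TextRegion id="r1"/><TextRegion id="r2"/><ImageRegion id="r3"/>'
        "</Page></PcGts>"
    )
    print(dict(count_regions(sample)))
```

A page for which the total is zero would count as "empty" in the sense described above.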

mjenckel commented 4 years ago

From your three-step processing, I assume you ran the block segmentation on the binarized images? We have noticed ourselves that performance becomes considerably worse on binarized images. We will take a look at your particular example. However, since I don't remember seeing very similar samples in the training data, it is also possible that the model cannot generalize to this type of layout/data, and some additional fine-tuning might be necessary.