Calamari-OCR / calamari

Line based ATR Engine based on OCRopy
Apache License 2.0

Prediction with "fraktur_19th_century/4.ckpt" results in empty text #106

Closed by wrznr 5 years ago

wrznr commented 5 years ago

The following invocation of calamari does not produce any text for the image region0001_line0002:

calamari-predict --checkpoint calamari_models/fraktur_19th_century/4.ckpt --files region0001_line0002.png

Most likely, I am doing something wrong. It would be great if you could help me.

ChWick commented 5 years ago

calamari_models/fraktur_19th_century was trained on binary data (not on gray/colour). Please use binarised lines (the docs for the pretrained model are missing...) and report whether the model works.
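For readers unsure what "binarised" means here, the following is a minimal sketch of global thresholding with Otsu's method on a grayscale line image. It is only an illustration: it is not part of Calamari and does not reproduce any particular binarisation tool. The function names `otsu_threshold` and `binarise` and the toy pixel values are assumptions made up for this example, and loading an actual PNG would need an image library that is not shown.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their gray values
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:    # strict comparison keeps the lowest best level
            best_var, best_t = var, t
    return best_t


def binarise(rows):
    """Map a grayscale image (list of rows of 0-255 ints) to pure 0/255."""
    flat = [p for row in rows for p in row]
    t = otsu_threshold(flat)
    return [[0 if p <= t else 255 for p in row] for row in rows]


# Hypothetical 2x3 "image": dark ink (30, 40) on a light background
toy_image = [[200, 210, 30],
             [190, 40, 220]]
toy_binary = binarise(toy_image)
```

A single global threshold is the simplest possible choice; scans with uneven lighting or background noise usually need a dedicated binarisation tool.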

wrznr commented 5 years ago

Many thanks for your fast replies! Using the nlbin-generated region0001_line0002 bin image results in:

dom kriegeriſchen Berufe zugewendet; 1843 war er Ofſizier geworden, aber auch hiſloriſche und

Any further hints on increasing the text quality?

ChWick commented 5 years ago

I suggest/try the following

wrznr commented 5 years ago

Voting does the trick (should have known from reading your papers):

dem kriegeriſchen Berufe zugewendet; 1843 war er Offizier geworden; aber auch hiſtoriſche und
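The thread does not show the exact command used for voting. As a rough illustration of why combining several models can fix isolated character errors, here is a simplified position-wise majority vote. It is a stand-in, not Calamari's voter: it ignores character confidences and only handles predictions of equal length, whereas a real voter must first align the sequences. The function name `majority_vote` and the toy strings are invented for this sketch and are not output from any model.

```python
from collections import Counter


def majority_vote(predictions):
    """Position-wise majority vote over equally long strings."""
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("this sketch only handles equally long predictions")

    voted = []
    for chars in zip(*predictions):
        # most_common(1) yields the most frequent character at this position;
        # on a tie, the character seen first (earliest model) wins.
        voted.append(Counter(chars).most_common(1)[0][0])
    return "".join(voted)


# Made-up outputs of three hypothetical models for the same line
toy_predictions = ["dom Berufe", "dem Berufe", "dem Berufc"]
voted_line = majority_vote(toy_predictions)
```

Each toy model makes at most one mistake, and at a different position than the others, so the vote recovers the correct character everywhere. This is the intuition behind voting: errors that the models do not share get outvoted.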