Closed rohanchn closed 3 years ago
kraken, version 3.0.0.0b25
You're most likely using an old recognition model that was trained on bounding box data. Such models produce considerably worse results on grayscale, baseline data. Unfortunately, you'll have to train a new model.
Right, the recognition model was prepared on bounding box data, and the PAGE XML files I am using for ketos segtrain do not have Unicode text. I was hoping to segment with a custom model and recognise text using my existing recognition model in the same pipeline. I trained my existing recognition model on 4.4k lines from varied sources (historical print). Is there a way to reuse the ground truth (GT) for this model? The recognition is fairly satisfactory, and I also hope to refine it further. It performs well on single-column documents with no illustrations etc.
The PAGE XML files that I fed to ketos segtrain do not have Unicode text.
What would be the best way to train a new recognition model for this?
Should I prepare PAGE XML files with Unicode text in eScriptorium and use those files to train for both line segmentation (ketos segtrain) and text recognition (ketos train) separately in kraken?
I really appreciate your help.
It doesn't have to be the same data, but a recognition model trained on baseline data from eScriptorium would most likely be the easiest way. The masking of input data and the preprocessing work a bit differently between the two formats, so the network learns differently (and incompatibly).
Someone was working on a converter for bounding box to baseline format but if I remember correctly nothing has come of it yet.
I can use my recognition model to OCR the pages I have already annotated in eScriptorium, fill in the transcriptions there, and then train in kraken.
Thank you for clarifying.
So, I had a few PAGE XML files (18) from eScriptorium with transcriptions aligned to baselines, and I tried to train a recognition model on them. That is not a lot of lines, but I wanted to test.
When I used this new baseline-based recognition model alongside the segmentation model, I still got the aforementioned warning.
My command is this:
for i in *.png; do kraken -i "$i" "${i%.png}.txt" segment -bl -i wr48_33.mlmodel ocr -m 1new_best.mlmodel; done
I used this to train the text recognizer in kraken:
ketos train -f page -d cuda:0 -o 1new *.xml
I think I am still missing something?
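Before training, it may be worth confirming that the PAGE XML files passed to ketos train actually contain transcriptions. The sketch below is one way to check and is not part of kraken. It assumes unprefixed PAGE element names (a plain Unicode element inside TextEquiv), and the function name count_transcriptions is made up for illustration. It counts every Unicode element that has at least one character after its opening tag on the same line, so region-level or whitespace-only text would be counted as well.

```shell
# Hypothetical helper (not a kraken command): count the <Unicode>
# elements in one PAGE XML file that are followed by at least one
# character of content on the same line.
# Assumption: element names carry no namespace prefix.
count_transcriptions() {
    # grep -o prints one line per match; "|| true" keeps a file with
    # no matches from being treated as an error.
    { grep -o '<Unicode>[^<]' "$1" || true; } | wc -l | tr -d ' '
}

# Report the count for every PAGE XML file in the current directory.
for f in *.xml; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    printf '%s\t%s\n' "$f" "$(count_transcriptions "$f")"
done
```

A file that reports 0 has no transcribed text for the recognizer to learn from, which would be the case for files annotated for segmentation only.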
Hi,
I trained a segmentation model using kraken's ketos segtrain command with a bunch of PAGE XML files as input that I annotated in eScriptorium. The segmentation model performs well, as I can see by using it in eScriptorium to segment scans it hasn't seen before. However, when I try to apply it to OCR scans in kraken using the following command
for i in *.png; do kraken -i "$i" "${i%.png}.txt" segment -i <seg_model> -bl ocr -m <ocr_model>; done
I get the following warning:
Loading ANN withregion27_47.mlmodel ✓
Loading ANN default ✓
Segmenting ✓
[13.8289] Recognizers with segmentation types set() will be applied to segmentation of type baselines. This will likely result in severely degraded performace
WARNING:kraken.rpred:Recognizers with segmentation types set() will be applied to segmentation of type baselines. This will likely result in severely degraded performace
Processing [####################################] 100%
Writing recognition results for 112.png ✓
The text recognition is also considerably worse than with the default segmentation. I can't rely on the default segmentation either, since it does not give me satisfactory line segmentation. How can I improve this? Could you please help?