Closed josef821 closed 3 years ago
If you know that the work is single-column, you can set maxcolseps to 0, that speeds up segmentation in my experience.
Why don't you want to train the new segmenter?
Error: no such option: maxcolseps
I want to OCR very simple images with two or more lines, with no columns and no pictures. In the 2.x versions segmentation was easy.
Update: I switched to kraken-3.0b12 and it works like the old version now.
Error: no such option: maxcolseps
It's an option of the segment CLI, i.e.
kraken -i image.png image.txt binarize segment --maxcolseps 0 ocr -m en_best.mlmodel
I switched to kraken-3.0b12 and it works like the old version now.
Most recent version is 3.0b18 btw.
It still waits 30 to 40 seconds on segmenting. It shows this warning after segmentation:
WARNING:kraken.rpred:Recognizers with segmentation types {'bbox'} will be applied to segmentation of type baselines. This will likely result in severely degraded performace
It still waits 30 to 40 seconds on segmenting.
I did a stupid in November, that's why. kraken is defaulting to the new trainable segmenter and the legacy one couldn't be selected anymore. There's a hotfix in 3.0b19. That said, I'm not sure why the new segmenter is slower for you than the old one; generally it is a bit faster if there's a 'normal' number of lines on the page and you've got sufficient free memory (~4 GB).
Some images have 1 line and some have 10 very simple lines. My PC specs: CPU: i7-6700, RAM: 32 GB, graphics: 1070 (8 GB, up to 16 GB), disk: 500 GB SSD.
I also encountered long segmenting times, and that's why I do the segmentation myself, in OpenCV.
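For reference, here is a minimal sketch of how a hand-rolled segmentation of this kind could work on simple pages with a few clean, horizontal lines. It is not the code used above: it uses a horizontal projection profile in pure Python rather than OpenCV, and it assumes the page is already binarized and given as rows of 0/1 pixels (1 = ink). The function name `segment_lines` and the parameter `min_height` are hypothetical.

```python
from typing import List, Tuple


def segment_lines(image: List[List[int]], min_height: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges of text lines; bottom is exclusive.

    `image` is a binarized page as rows of pixels, 1 = ink, 0 = background.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in image]

    lines = []
    start = None  # first row of the run of ink rows currently being tracked
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y  # a run of ink rows begins
        elif ink == 0 and start is not None:
            # A blank row ends the run; keep it if it is tall enough.
            if y - start >= min_height:
                lines.append((start, y))
            start = None
    # Close a run that reaches the bottom edge of the image.
    if start is not None and len(image) - start >= min_height:
        lines.append((start, len(image)))
    return lines


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(segment_lines(page))  # [(1, 3), (4, 5)]
```

Each returned row range could then be cropped and handed to the recognizer one line at a time. This only separates lines that have at least one fully blank pixel row between them, so it is unlikely to work on skewed or touching lines.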
I compiled kraken with Anaconda and ran this command:
kraken -i image.png image.txt binarize segment ocr -m enbest.mlmodel
Output:
[0.0025] Baseline model (/home/mypc/anaconda3/envs/kraken/lib/python3.8/site-packages/kraken/blla.mlmodel) given but legacy segmenter selected. Forcing to -bl.
Loading ANN /home/mypc/anaconda3/envs/kraken/lib/python3.8/site-packages/kraken/blla.mlmodel ✓
Loading ANN default ✓
Binarizing ✓
Segmenting _
It hangs for almost 30 to 40 seconds on segmenting, and OCR is very slow. What should I do?