The real question is does the new version really need clstm for script detection? Can't it be done with pytorch?
OK, some clarification on the new backend and how models work now. Technically, the new backend can run both clstm and ocropy (in pronn format) models, and this works transparently. There is an issue caused by slight numerical differences, which in my tests resulted in a ~5% drop in accuracy that I haven't been able to pin down (I haven't been looking closely, to be honest), but apart from that it works.
Training a new script recognition model requires a bit of validation, as the new default architecture has, in spite of higher label accuracy, a lower resolution in the time axis. It's on my near-term (<2 weeks) ToDo list. I don't consider using the old model with the new backend appropriate.
Update:
Using the latest master release built from environment.yml, I am getting the same error.
So I renamed the script.clstm file to script.mlmodel.
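A minimal sketch of that rename, done as a copy so the original file stays in place. The function name alias_script_model is made up for illustration, and the directory to pass is whichever site-packages/kraken directory the verbose log reports on your system. Keep in mind the maintainer's remark above that the old model is not considered appropriate for the new backend, so this is a stopgap at best.

```python
import shutil
from pathlib import Path


def alias_script_model(model_dir):
    """Copy script.clstm to script.mlmodel inside model_dir if the latter is missing.

    Hypothetical helper, not part of kraken.
    """
    model_dir = Path(model_dir)
    old = model_dir / "script.clstm"
    new = model_dir / "script.mlmodel"
    if new.exists():
        # Nothing to do: a model with the expected name is already there.
        return new
    if not old.exists():
        raise FileNotFoundError(old)
    shutil.copyfile(old, new)
    return new
```

Calling alias_script_model on the kraken package directory would leave both file names in place, so the change is easy to undo by deleting the copy.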
Then I trained my own model, but when I use it I get no recognition, apparently because of segmentation and script detection. Notice the "0 lines" in the output:
(kraken) home@home-lnx:~/Desktop/kraken-master$ kraken -i 000049.png out.txt -v binarize segment ocr -m model_5.mlmodel
[0.3702] Loading model from /home/home/Desktop/kraken-master/model_5.mlmodel
[0.6518] Binarizing 000049.png
[0.7994] Segmenting /tmp/tmp0d2ixmfw
[0.8125] Compute reading order on 0 lines in lr direction
[0.8126] Perform topological sort on partially ordered lines
[0.8128] Detecting scripts with /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel in 0 lines on /tmp/tmp0d2ixmfw
[0.8129] Loading model from /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel
[0.8313] Running recognizer on /tmp/tmp0d2ixmfw with 0 lines
[0.8345] Running 1 multi-script recognizers on /tmp/tmp0d2ixmfw with 0 lines
[0.8351] Serializing as text into out.txt
When using the model on another image, I get the error below:
(kraken) home@home-lnx:~/Desktop/kraken-master$ kraken -i 000002.png out.txt -v binarize segment ocr -m model_4.mlmodel
[0.3709] Loading model from /home/home/Desktop/kraken-master/model_4.mlmodel
[0.6514] Binarizing 000002.png
[0.7867] Segmenting /tmp/tmpuql8u9_r
[0.8029] Compute reading order on 1 lines in lr direction
[0.8030] Perform topological sort on partially ordered lines
[0.8033] Detecting scripts with /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel in 1 lines on /tmp/tmpuql8u9_r
[0.8034] Loading model from /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel
[0.8216] Running recognizer on /tmp/tmpuql8u9_r with 1 lines
[0.8490] Running 1 multi-script recognizers on /tmp/tmpuql8u9_r with 1 lines
Traceback (most recent call last):
  File "/home/home/anaconda3/envs/kraken/bin/kraken", line 11, in <module>
    sys.exit(cli())
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 1164, in invoke
    return _process_result(rv)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 1102, in _process_result
    **ctx.params)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/kraken.py", line 220, in process_pipeline
    task(base_image=base_image, input=input, output=output)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/kraken.py", line 157, in recognizer
    for pred in bar:
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/_termui_impl.py", line 282, in generator
    for rv in self.iter:
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/rpred.py", line 182, in mm_rpred
    miss = [x[0] for x in bounds['boxes'] if not nets.get(x[0])]
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/rpred.py", line 182, in <listcomp>
    miss = [x[0] for x in bounds['boxes'] if not nets.get(x[0])]
TypeError: unhashable type: 'list'
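The TypeError can be reproduced outside kraken with a toy example. The data shape below is an assumption chosen because it triggers the same message; it is not taken from kraken's source. The names missing_scripts_buggy and missing_scripts are hypothetical. If each line in bounds['boxes'] is a list of [script, box] runs, then x[0] is itself a list, and a list cannot be used as a dictionary key.

```python
# Hypothetical stand-ins; not kraken's actual data structures.
nets = {"Latin": "latin_model"}

# Assumed shape: one entry per line, each a list of [script, box] runs.
bounds = {"boxes": [[["Latin", [0, 0, 100, 20]], ["Arabic", [100, 0, 180, 20]]]]}


def missing_scripts_buggy(bounds, nets):
    # Mirrors the failing comprehension: x is a whole line, so x[0] is a
    # [script, box] list, and lists cannot be used as dict keys.
    return [x[0] for x in bounds["boxes"] if not nets.get(x[0])]


def missing_scripts(bounds, nets):
    # Walk the runs inside each line and look up the script tag itself.
    return sorted({script
                   for line in bounds["boxes"]
                   for script, _ in line
                   if script not in nets})


error_message = None
try:
    missing_scripts_buggy(bounds, nets)
except TypeError as exc:
    error_message = str(exc)

missing = missing_scripts(bounds, nets)
```

Under this assumed shape, the second function reports the script tags that have no model, which appears to be what the original comprehension was meant to compute.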
Test files are attached: testfiles.zip
Referencing issue #95
Waiting for this issue to be fixed.
I started the training for a script detection model with the new architecture. It will take a few hours but it should solve this issue.
Has the issue been solved?
@Usamamalik007 can't you read? He just told you that he is working on it, here you go and look again, you just might miss it.
OK, the issue requires some further development. I've disabled script detection by default for now, as the new CTC loss produces more highly peaked activations in the softmax layer that are located within the pertinent graphemes but aren't wide enough to cover them completely. If you don't want to upgrade, just run segment -n for now.
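To illustrate why peaked activations are a problem for locating scripts, here is a toy sketch. The label sequences are invented, "-" stands for the CTC blank, and the helper names spans and coverage are hypothetical. It compares how much of a line is covered by non-blank labels when the output is broad versus peaked.

```python
# Toy per-column labels for a line 12 columns wide; "-" is the CTC blank.
# Assumed ground truth: script "A" spans columns 0-5, script "B" columns 6-11.
wide_output = "AAAAAABBBBBB"    # broad activations covering each region
peaked_output = "--A-----B---"  # narrow peaks inside each region


def spans(labels, blank="-"):
    """Return (label, start, end) runs of consecutive identical non-blank labels."""
    runs = []
    start, current = None, blank
    # The trailing None acts as a sentinel that closes the final run.
    for i, lab in enumerate(list(labels) + [None]):
        if lab != current:
            if current != blank and current is not None:
                runs.append((current, start, i))
            start, current = i, lab
    return runs


def coverage(labels, blank="-"):
    """Fraction of columns that fall inside some non-blank run."""
    return sum(end - start for _, start, end in spans(labels, blank)) / len(labels)


wide_runs = spans(wide_output)
peaked_runs = spans(peaked_output)
```

With the peaked sequence, each script is detected but its run is only one column wide, so the boxes derived from the runs would cut off most of the text they are supposed to enclose.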
We've got some work on multilingual dictionaries in the near future that will require it to work again....
Kraken fails when running any command that requires segmenting. The reason is that the script.mlmodel file, which detects scripts in a segmented page, is not found in the kraken repo.
The solution is to rename script.clstm to script.mlmodel, and the problem is solved. Alternatively, you can simply disable script detection using -n or --no-script-detect.
@mittagessen Please rename script.clstm to script.mlmodel in the repo.
Below is how to reproduce the same problem I was facing.
or
The error shown in terminal: