mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

Failing at the Segmenting step [Rename script.clstm to script.mlmodel] #84

Closed: ghost closed this issue 5 years ago

ghost commented 6 years ago

Kraken fails when running any command that requires segmentation. The reason is that the script.mlmodel file, which is used to detect scripts in a segmented page, is not found in the kraken repo.

Copying script.clstm to script.mlmodel solves the problem:

cd /usr/local/lib/python3.6/dist-packages/kraken/
sudo cp script.clstm script.mlmodel

Alternatively, you can simply disable script detection with -n or --no-script-detect.
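The copy workaround can also be written as a small Python helper, which is handy when the install location differs between machines. This is only a sketch: `ensure_script_model` is a hypothetical name, not part of kraken, and the directory to pass is whatever holds script.clstm on your system (in this report, /usr/local/lib/python3.6/dist-packages/kraken/).

```python
import shutil
from pathlib import Path


def ensure_script_model(pkg_dir):
    """Copy script.clstm to script.mlmodel in pkg_dir if the latter is missing.

    Hypothetical helper, not part of kraken. Returns True if a copy was
    made and False if script.mlmodel was already present.
    """
    pkg_dir = Path(pkg_dir)
    src = pkg_dir / "script.clstm"
    dst = pkg_dir / "script.mlmodel"
    if dst.exists():
        return False
    if not src.exists():
        raise FileNotFoundError(
            f"neither {dst.name} nor {src.name} found in {pkg_dir}"
        )
    # Copy instead of renaming so the original file stays in place.
    shutil.copyfile(src, dst)
    return True
```

Like the `sudo cp` above, this needs write permission on the package directory.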

@mittagessen Please rename script.clstm to script.mlmodel in the repo.

To reproduce the problem I was facing, run:

kraken -i 000004.png out.txt binarize segment ocr -m model_13.mlmodel

or

kraken -i 1.png lines.json segment

The error shown in terminal:

home@home-lnx:~/Desktop/test/training_data$ kraken -i 000004.png out.txt binarize segment ocr -m model_13.mlmodel 
Loading RNN default ✓
Binarizing  ✓
Segmenting  ✗
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/kraken/lib/models.py", line 137, in load_any
    nn = TorchVGSLModel.load_model(str(fname))
  File "/usr/local/lib/python3.6/dist-packages/kraken/lib/vgsl.py", line 418, in load_model
    mlmodel = MLModel(path)
  File "/usr/local/lib/python3.6/dist-packages/coremltools/models/model.py", line 147, in __init__
    self._spec = _load_spec(model)
  File "/usr/local/lib/python3.6/dist-packages/coremltools/models/utils.py", line 93, in load_spec
    with open(filename, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.6/dist-packages/kraken/script.mlmodel'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/kraken/lib/models.py", line 141, in load_any
    nn = TorchVGSLModel.load_clstm_model(fname)
  File "/usr/local/lib/python3.6/dist-packages/kraken/lib/vgsl.py", line 341, in load_clstm_model
    with open(path, 'rb') as fp:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.6/dist-packages/kraken/script.mlmodel'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/kraken", line 10, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1093, in invoke
    return _process_result(rv)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1031, in _process_result
    **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/kraken/kraken.py", line 185, in process_pipeline
    task(base_image=base_image, input=input, output=output)
  File "/usr/local/lib/python3.6/dist-packages/kraken/kraken.py", line 82, in segmenter
    res = pageseg.detect_scripts(im, res, valid_scripts=allowed_scripts)
  File "/usr/local/lib/python3.6/dist-packages/kraken/pageseg.py", line 500, in detect_scripts
    rnn = models.load_any(model)
  File "/usr/local/lib/python3.6/dist-packages/kraken/lib/models.py", line 144, in load_any
    nn = TorchVGSLModel.load_pronn_model(fname)
  File "/usr/local/lib/python3.6/dist-packages/kraken/lib/vgsl.py", line 279, in load_pronn_model
    with open(path, 'rb') as fp:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.6/dist-packages/kraken/script.mlmodel'
home@home-lnx:~/Desktop/test/training_data$

amitdo commented 6 years ago

The real question is: does the new version really need clstm for script detection? Can't it be done with PyTorch?

mittagessen commented 6 years ago

OK, some clarification on the new backend and how models work now. Technically, the new backend can run both clstm and ocropy (in pronn format) models, and this works transparently. There is an issue caused by slight numerical differences, which in my tests resulted in a ~5% drop in accuracy that I haven't been able to pin down (I haven't been looking closely, to be honest), but apart from that it works.

Training a new script recognition model requires a bit of validation, as the new default architecture has, in spite of higher label accuracy, a lower resolution along the time axis. It's on my near-term (<2 week) ToDo list. I don't consider using the old model with the new backend appropriate.
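The transparent loading described above matches what the first traceback shows: load_any tries load_model, then load_clstm_model, then load_pronn_model, each failing on the same missing file. A minimal sketch of that try-in-order pattern follows. It is not kraken's actual code. `load_with_fallback` and the loader callables are hypothetical, and the up-front existence check is an addition so that a missing file produces one clear error instead of three chained ones.

```python
from pathlib import Path


def load_with_fallback(path, loaders):
    """Try each loader in order and return the first successful result.

    Hypothetical helper for illustration; `loaders` is a list of
    callables that take a path and either return a model or raise.
    """
    path = Path(path)
    # Fail early: no loader can succeed if the file is not there.
    if not path.is_file():
        raise FileNotFoundError(f"model file not found: {path}")
    errors = []
    for loader in loaders:
        try:
            return loader(path)
        except Exception as exc:
            # Remember why this format was rejected and try the next one.
            errors.append(f"{loader.__name__}: {exc}")
    raise ValueError(f"no loader accepted {path}; tried " + "; ".join(errors))
```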

ghost commented 5 years ago

Update: using the latest master, built with environment.yml, I get the same error, so I renamed the script.clstm file to script.mlmodel.

I then trained my own model, but using it produces no recognition output, apparently because of segmentation and script detection. Note that 0 lines are detected:

(kraken) home@home-lnx:~/Desktop/kraken-master$ kraken -i 000049.png out.txt -v binarize segment ocr -m model_5.mlmodel 
[0.3702] Loading model from /home/home/Desktop/kraken-master/model_5.mlmodel 
[0.6518] Binarizing 000049.png 
[0.7994] Segmenting /tmp/tmp0d2ixmfw 
[0.8125] Compute reading order on 0 lines in lr direction 
[0.8126] Perform topological sort on partially ordered lines 
[0.8128] Detecting scripts with /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel in 0 lines on /tmp/tmp0d2ixmfw 
[0.8129] Loading model from /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel 
[0.8313] Running recognizer on /tmp/tmp0d2ixmfw with 0 lines 
[0.8345] Running 1 multi-script recognizers on /tmp/tmp0d2ixmfw with 0 lines 
[0.8351] Serializing as text into out.txt 

When I run a model on another image, I get the error below:

(kraken) home@home-lnx:~/Desktop/kraken-master$ kraken -i 000002.png out.txt -v binarize segment ocr -m model_4.mlmodel 
[0.3709] Loading model from /home/home/Desktop/kraken-master/model_4.mlmodel 
[0.6514] Binarizing 000002.png 
[0.7867] Segmenting /tmp/tmpuql8u9_r 
[0.8029] Compute reading order on 1 lines in lr direction 
[0.8030] Perform topological sort on partially ordered lines 
[0.8033] Detecting scripts with /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel in 1 lines on /tmp/tmpuql8u9_r 
[0.8034] Loading model from /home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/script.mlmodel 
[0.8216] Running recognizer on /tmp/tmpuql8u9_r with 1 lines 
[0.8490] Running 1 multi-script recognizers on /tmp/tmpuql8u9_r with 1 lines 
Traceback (most recent call last):
  File "/home/home/anaconda3/envs/kraken/bin/kraken", line 11, in <module>
    sys.exit(cli())
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 1164, in invoke
    return _process_result(rv)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 1102, in _process_result
    **ctx.params)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/kraken.py", line 220, in process_pipeline
    task(base_image=base_image, input=input, output=output)
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/kraken.py", line 157, in recognizer
    for pred in bar:
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/click/_termui_impl.py", line 282, in generator
    for rv in self.iter:
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/rpred.py", line 182, in mm_rpred
    miss = [x[0] for x in bounds['boxes'] if not nets.get(x[0])]
  File "/home/home/anaconda3/envs/kraken/lib/python3.7/site-packages/kraken/rpred.py", line 182, in <listcomp>
    miss = [x[0] for x in bounds['boxes'] if not nets.get(x[0])]
TypeError: unhashable type: 'list'
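The TypeError itself is ordinary Python behavior: dict.get has to hash its key, and a list cannot be hashed. The sketch below reproduces it with the same expression as in rpred.py line 182. The data shapes are assumptions for illustration only (each line as a list of [script, box] pairs, so that x[0] is itself a list), and `nets`, `nested_boxes`, `flat_boxes`, and `missing_scripts` are hypothetical stand-ins, not kraken objects.

```python
# Hypothetical stand-in for the script -> recognizer mapping.
nets = {"default": object()}

# Assumed shape: one line holding a list of [script, box] pairs,
# which makes x[0] a list rather than a string.
nested_boxes = [[["Latin", [0, 0, 100, 20]]]]

# Assumed shape where x[0] is a plain string, for comparison.
flat_boxes = [["Latin", [0, 0, 100, 20]]]


def missing_scripts(boxes, nets):
    """Same expression as in the traceback: x[0] is used as a dict key."""
    return [x[0] for x in boxes if not nets.get(x[0])]


# A string key is hashable, so the lookup works and reports the missing script.
flat_result = missing_scripts(flat_boxes, nets)

# A list key is not hashable, so dict.get raises TypeError.
try:
    missing_scripts(nested_boxes, nets)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
```

If that assumption about the shape holds, the recognizer is receiving lines in a different nesting than the expression expects. This is a guess from the traceback, not a confirmed diagnosis.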

Test files are attached: testfiles.zip

ghost commented 5 years ago

Referencing issue #95

Usamamalik007 commented 5 years ago

Waiting for this issue to be fixed.

mittagessen commented 5 years ago

I started the training for a script detection model with the new architecture. It will take a few hours but it should solve this issue.

Usamamalik007 commented 5 years ago

Has the issue been solved?

ghost commented 5 years ago

@Usamamalik007 can't you read? He just told you that he is working on it, here you go and look again, you just might miss it.

mittagessen commented 5 years ago

OK, the issue requires some further development. I've disabled script detection by default for now, as the new CTC loss produces more highly peaked activations in the softmax layer that are located within the pertinent graphemes but aren't wide enough to cover them completely. If you don't want to upgrade, just run segment -n for now.
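To illustrate the effect of peaked activations, here is a toy sketch with made-up per-frame labels (0 stands for the CTC blank). It is not kraken code. `label_runs` and `coverage` are hypothetical helpers that measure how much of the time axis the non-blank labels span, which is what a script detector relying on those spans would get to work with.

```python
from itertools import groupby


def label_runs(frame_labels, blank=0):
    """Return (label, start, end) for each run of identical non-blank labels."""
    runs = []
    pos = 0
    for label, group in groupby(frame_labels):
        length = len(list(group))
        if label != blank:
            runs.append((label, pos, pos + length))
        pos += length
    return runs


def coverage(frame_labels, blank=0):
    """Fraction of frames that carry a non-blank label."""
    covered = sum(end - start for _, start, end in label_runs(frame_labels, blank))
    return covered / len(frame_labels)


# Made-up outputs for two graphemes spread over eight frames.
wide = [1, 1, 1, 1, 2, 2, 2, 2]    # activations span each grapheme
peaked = [0, 1, 0, 0, 0, 2, 0, 0]  # one-frame spikes inside each grapheme

wide_cov = coverage(wide)      # 1.0
peaked_cov = coverage(peaked)  # 0.25
```

In the peaked case each label still falls inside its grapheme, but the spans derived from it cover only a quarter of the frames, so boundaries between scripts cannot be read off directly.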

We've got some work on multilingual dictionaries in the near future that will require it to work again....