kba / kraken-docker

Docker container for the kraken OCR engine
https://hub.docker.com/r/kbai/kraken/
MIT License

hocr output is not working #4

Closed (zuphilip closed 7 years ago)

zuphilip commented 7 years ago

I tried to output hOCR and it did not work (ALTO did not work either), but txt output seems to work fine. I already updated the submodule and rebuilt the container, but that didn't change anything. Maybe this should be reported upstream, but I'm not sure.

$ docker run --rm -it -v ${PWD}:/work kraken-docker -i /work/bw.png /work/out.txt segment ocr
Loading RNN
Segmenting
Processing
Writing recognition results for /work/bw.png

$ docker run --rm -it -v ${PWD}:/work kraken-docker -i /work/bw.png /work/out.hocr segment ocr -h
Loading RNN
Segmenting
Processing
Writing recognition results for /work/bw.png
Traceback (most recent call last):
  File "/usr/bin/kraken", line 10, in <module>
    sys.exit(cli())
  File "/usr/lib/python2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 1087, in invoke
    return _process_result(rv)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 1025, in _process_result
    **ctx.params)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/kraken/kraken.py", line 144, in process_pipeline
    task(base_image=base_image, input=input, output=output)
  File "/usr/lib/python2.7/site-packages/kraken/kraken.py", line 120, in recognizer
    fp.write(serialization.serialize(preds, base_image, Image.open(base_image).size, ctx.meta['mode']))
  File "/usr/lib/python2.7/site-packages/kraken/serialization.py", line 123, in serialize
    tmpl = env.get_template(template)
  File "/usr/lib/python2.7/site-packages/jinja2/environment.py", line 812, in get_template
    return self._load_template(name, self.make_globals(globals))
  File "/usr/lib/python2.7/site-packages/jinja2/environment.py", line 774, in _load_template
    cache_key = self.loader.get_source(self, name)[1]
  File "/usr/lib/python2.7/site-packages/jinja2/loaders.py", line 235, in get_source
    raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: hocr

(The example is from the kraken/tests/resources directory.)

kba commented 7 years ago

This is a problem with PBR: the templates folder isn't installed. When I tried just now, I couldn't install kraken locally at first because of a "git not installed" error, which I worked around by exporting PBR_VERSION; after that the installation succeeded, but without the templates. So this is either an issue with upstream kraken or a misunderstanding of the install process on my part. The fix is to install the templates manually.
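
A minimal sketch of what "install the templates manually" could look like. It is not the actual change made to the image. It assumes the templates sit under kraken/templates in a kraken source checkout and that the installed package directory is the site-packages path shown in the traceback above; install_templates is a hypothetical helper name, not part of kraken or PBR.

```shell
#!/bin/sh
# Hypothetical helper: copy the template directory from a kraken source
# checkout into an already installed kraken package directory.
# Both directory layouts are assumptions, not confirmed by the thread.
install_templates() {
    src="$1"    # kraken source checkout (assumed to contain kraken/templates)
    dest="$2"   # installed package directory, e.g. .../site-packages/kraken

    if [ ! -d "$src/kraken/templates" ]; then
        echo "no templates directory under $src/kraken" >&2
        return 1
    fi

    # Create the target directory and copy its contents, including
    # the extension-less template files such as "hocr".
    mkdir -p "$dest/templates"
    cp -R "$src/kraken/templates/." "$dest/templates/"
}
```

With the paths from the traceback, a call might look like `install_templates ./kraken /usr/lib/python2.7/site-packages/kraken`, run inside the container build after kraken itself has been installed.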

kba commented 7 years ago

BTW: If you use /data as the volume target, you won't need absolute paths, since the image sets WORKDIR /data.

docker run --rm -it -v ${PWD}:/data kraken-docker -i bw.png out.hocr segment ocr -h

zuphilip commented 7 years ago

Yeah, this works now. Thank you very much!

(I am now using kbai/kraken instead of kbai/kraken-docker.)

kba commented 7 years ago

> (I am now using kbai/kraken instead of kbai/kraken-docker.)

They are the same; both update automatically. Someone seems to be using the latter, so I'll keep it around.