OCR-D / ocrd_tesserocr

Run tesseract with the tesserocr bindings with @OCR-D's interfaces
MIT License

document model installation #173

Closed (bertsky closed 3 years ago)

bertsky commented 3 years ago

After #166, the documentation on model requirements is no longer correct (if it ever was). Because the standard path has changed, the installation steps have changed as well. We should also point out more clearly that there will be no segmentation without eng.traineddata and no deskewing without osd.traineddata. Unfortunately, the corresponding error message is not helpful at all:

  File "/build/ocrd_tesserocr/ocrd_tesserocr/deskew.py", line 68, in process
    psm=PSM.AUTO_OSD
  File "tesserocr.pyx", line 1210, in tesserocr.PyTessBaseAPI.__cinit__
    self._init_api(cpath, clang, oem, NULL, 0, NULL, NULL, False, psm)
  File "tesserocr.pyx", line 1223, in tesserocr.PyTessBaseAPI._init_api
    raise RuntimeError('Failed to init API, possibly an invalid tessdata path: {}'.format(path))

But that probably needs to be fixed somewhere between Tesseract and tesserocr.
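
Until that is fixed upstream, a pre-flight check on the caller's side could at least name the missing model. The following is only a minimal sketch, not what ocrd_tesserocr actually does: `missing_models` and `check_models` are hypothetical helpers, and the tessdata directory is passed in explicitly because the new standard path is not spelled out here.

```python
from pathlib import Path

# Models named in this issue as required: eng for segmentation, osd for deskewing.
REQUIRED_MODELS = {"eng": "segmentation", "osd": "deskewing"}


def missing_models(tessdata_dir, required=REQUIRED_MODELS):
    """Return {model: purpose} for each required model absent from tessdata_dir.

    Hypothetical helper: it only looks for <model>.traineddata files on disk.
    """
    tessdata = Path(tessdata_dir)
    return {
        name: purpose
        for name, purpose in required.items()
        if not (tessdata / f"{name}.traineddata").is_file()
    }


def check_models(tessdata_dir):
    """Raise a descriptive error before the generic 'Failed to init API' can occur."""
    missing = missing_models(tessdata_dir)
    if missing:
        details = ", ".join(
            f"{name}.traineddata (needed for {purpose})"
            for name, purpose in missing.items()
        )
        raise FileNotFoundError(
            f"Missing Tesseract models in {tessdata_dir}: {details}"
        )
```

Calling `check_models(path)` before constructing the tesserocr API object would turn the opaque `RuntimeError` above into a message that says which file is missing and which step depends on it.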

bertsky commented 3 years ago

Fixed by #175 (more or less).