bertsky closed 6 years ago
So the tests work because they use the `ocrd_tesserocr` API directly, right? I ran into that encoding issue on different CI platforms. Yes, we need to override `LC_*` after loading click. We should add tests that use the CLI as well. I'll look into it.
Oh you fixed it already. Self-fixing issues are the best issues :+1:
No, the tests (presumably) still work, because they use an older Tesseract version without the assertion in place. True, we should add a CLI test (as in core/test/test_cli.py)...
> No, the tests (presumably) still work, because they use an older Tesseract version without the assertion in place.

They did fail before, but we set the locale to C before running the tests:
https://github.com/OCR-D/ocrd_tesserocr/blob/master/.travis.yml#L32
Oh, I see. I was distracted by my (non-container) test results.
Ever since Tesseract 4 had to introduce an assertion that localization be plain POSIX (`C`) to ensure certain legacy assumptions in its code are always met, we have to override the current locale before initializing the `tesserocr` API, too. This cannot be done by the user before calling any `ocrd_tesserocr` CLI, because we depend on the Click library, which itself is incompatible (in Python 3) with that locale (it requires at least `C.UTF-8`). So we have a deadlock.

We could perhaps reset the locale after `click` and before `tesserocr` though.
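
A minimal sketch of that idea, using only the standard `locale` module. The helper name `reset_locale_for_tesseract` is hypothetical and not part of `ocrd_tesserocr`; the sketch assumes it is called inside the command body, i.e. after Click has finished its own start-up checks and before any `tesserocr` API object is created.

```python
import locale


def reset_locale_for_tesseract():
    """Switch the process locale to plain POSIX ("C").

    Hypothetical helper: returns the previous LC_ALL setting so a caller
    could restore it later if needed.
    """
    # Calling setlocale() without a second argument only queries the setting.
    previous = locale.setlocale(locale.LC_ALL)
    # Tesseract 4 asserts a plain "C" locale, so set it for all categories.
    locale.setlocale(locale.LC_ALL, "C")
    return previous


def command_body():
    """Stand-in for the body of a Click command (assumption, not the real CLI)."""
    previous = reset_locale_for_tesseract()
    # The tesserocr API would be imported and initialized at this point,
    # now that the locale is "C".
    return previous
```

Since the locale is process-wide state, this only helps if nothing switches it back between the reset and the `tesserocr` initialization.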