Closed mkczyk closed 4 years ago
I'm trying to get rotation information (the orientation angle) from Tess4j:
    ITesseract instance = new Tesseract();
    instance.setDatapath("/path/to/tessdata");
    instance.setLanguage("osd");
    instance.setPageSegMode(ITessAPI.TessPageSegMode.PSM_OSD_ONLY);
    instance.setOcrEngineMode(ITessAPI.TessOcrEngineMode.OEM_LSTM_ONLY);
    String result = instance.doOCR(new File("/path/to/image.png"));
    System.out.println(result);
But it causes the following error and returns an empty string as the response:
    Error: LSTM requested, but not present!! Loading tesseract.
When I change TessOcrEngineMode from OEM_LSTM_ONLY to OEM_DEFAULT, the error disappears, but it still returns an empty string as the response.
Maybe the doOCR method isn't meant for getting OSD information?
My environment and Tesseract are configured properly, because I can do OCR from Tess4j (it returns recognized text):
    ITesseract instance = new Tesseract();
    instance.setDatapath("/path/to/tessdata");
    instance.setLanguage("pol");
    instance.setPageSegMode(ITessAPI.TessPageSegMode.PSM_AUTO_ONLY);
    instance.setOcrEngineMode(ITessAPI.TessOcrEngineMode.OEM_LSTM_ONLY);
    String result = instance.doOCR(new File("/path/to/image.png"));
    System.out.println(result);
(I have downloaded two traineddata files: pol.traineddata, and osd.traineddata for the OSD attempts.)
I can also get OSD information without Tess4j, by calling Tesseract directly from the console:
    $ tesseract --psm 0 -l osd image.png stdout
    Page number: 0
    Orientation in degrees: 270
    Rotate: 90
    Orientation confidence: 0.35
    Script: Latin
    Script confidence: 7.78
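Since the console call already produces the values, one possible stopgap is to read them out of that text. The following is only a minimal sketch, assuming the report keeps the "key: value" layout shown above. The class name OsdOutputParser and its parse method are hypothetical helpers, not part of Tess4j or Tesseract, and capturing the console output (for example with ProcessBuilder) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Tess4j or Tesseract.
public class OsdOutputParser {

    /** Splits each "key: value" line of an OSD report into a map entry. */
    public static Map<String, String> parse(String osdOutput) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : osdOutput.split("\\R")) {
            int sep = line.indexOf(':');
            if (sep < 0) {
                continue; // skip lines that are not "key: value"
            }
            String key = line.substring(0, sep).trim();
            String value = line.substring(sep + 1).trim();
            fields.put(key, value);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample text copied from the console output above.
        String sample = String.join("\n",
                "Page number: 0",
                "Orientation in degrees: 270",
                "Rotate: 90",
                "Orientation confidence: 0.35",
                "Script: Latin",
                "Script confidence: 7.78");

        Map<String, String> osd = parse(sample);
        int rotate = Integer.parseInt(osd.get("Rotate"));
        System.out.println("Rotate by " + rotate + " degrees");
    }
}
```

With the sample above this prints "Rotate by 90 degrees". It only parses text, so it does not replace a proper OSD call through the Tess4j API.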
That's not the correct way to call it. Look at the testTessBaseAPIDetectOrientationScript test case in the unit tests.
Can the ticket be closed?