rmtheis / tess-two

Fork of Tesseract Tools for Android
Apache License 2.0

How to 'extract'? #191

johnrosenbaud closed this issue 7 years ago

johnrosenbaud commented 7 years ago

Summary:

It's mentioned that "Data files must be extracted to the Android device in a subdirectory named tessdata."

What does that imply? Does it refer to the eng.traineddata file only, or also to:

    eng.cube.bigrams    
    eng.cube.fold   
    eng.cube.lm     
    eng.cube.nn     
    eng.cube.params     
    eng.cube.size   
    eng.cube.word-freq  
    eng.tesseract_cube.nn

How does one 'extract'?

I am getting very low-quality text output: only something like 10% of the text in a high-quality screenshot is recognized.

Maybe I am not 'extracting' the files properly, or placing them in the right folder. Any clarification would be awesome.

Thanks!

rmtheis commented 7 years ago

Great question. If you're using OEM_TESSERACT_ONLY then you only need eng.traineddata. The files in your list with "cube" appearing in the name are needed if you're using Cube mode or combined mode.
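As a rough illustration of the rule above, the following sketch lists the files one would expect to find in tessdata for a given language. The class and method names (`TessDataFiles`, `requiredFiles`) are hypothetical and not part of tess-two; the cube suffixes are taken from the file list in the question, and the sketch assumes that the .traineddata file is needed in every mode.

```java
import java.util.ArrayList;
import java.util.List;

public class TessDataFiles {
    // Cube-related suffixes, copied from the file list in the question.
    static final String[] CUBE_SUFFIXES = {
        ".cube.bigrams", ".cube.fold", ".cube.lm", ".cube.nn",
        ".cube.params", ".cube.size", ".cube.word-freq", ".tesseract_cube.nn"
    };

    // Hypothetical helper: names of the data files expected for a language.
    // usesCube should be true for Cube mode or combined mode.
    static List<String> requiredFiles(String lang, boolean usesCube) {
        List<String> files = new ArrayList<>();
        // Assumption: the .traineddata file is required in every mode.
        files.add(lang + ".traineddata");
        if (usesCube) {
            for (String suffix : CUBE_SUFFIXES) {
                files.add(lang + suffix);
            }
        }
        return files;
    }
}
```

With `usesCube` set to false (the OEM_TESSERACT_ONLY case), the list contains only eng.traineddata.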

The files used to be distributed as zip files, and the term "extract" is a holdover from that. The files just need to be copied to the device as-is, into a subdirectory named tessdata. If your init method succeeds, then Tesseract was able to open the data files.
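A minimal sketch of that copy step, using only java.io so it can be tried outside Android as well. `TessDataInstaller` and `install` are hypothetical names, not part of tess-two; the only thing taken from the thread is that the file is copied unchanged into a subdirectory named tessdata.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class TessDataInstaller {
    // Copies one data file, byte for byte, to <dataDir>/tessdata/<fileName>.
    static File install(InputStream source, File dataDir, String fileName) throws IOException {
        File tessdata = new File(dataDir, "tessdata");
        // Create the tessdata subdirectory if it does not exist yet.
        if (!tessdata.isDirectory() && !tessdata.mkdirs()) {
            throw new IOException("Could not create " + tessdata);
        }
        File target = new File(tessdata, fileName);
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return target;
    }
}
```

On Android, the stream would typically come from something like `context.getAssets().open("tessdata/eng.traineddata")` (assuming the file was bundled under that asset path), and the parent directory, not the tessdata directory itself, would then be passed as the data path, for example `TessBaseAPI.init(dataDir.getAbsolutePath(), "eng")`. If that call returns true, the data files were found and opened.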

johnrosenbaud commented 7 years ago

@rmtheis, thanks!