Shreeshrii / tess5train-fonts

Files and Scripts to run Tesseract 5 LSTM Training using fonts
Apache License 2.0

How to build training tools for tesseract 4.0 on windows #12

Closed: Harathi123 closed this issue 4 years ago

Harathi123 commented 4 years ago

Hi @Shreeshrii,

I am trying to train Tesseract 4.0 on Windows. But when I installed Tesseract 4.0, I didn't find the lstmbox file or many of the other training tools. After some research, I found that they need to be built. I couldn't find any documentation on how to build the training tools for Tesseract 4.0 on Windows. Can you please guide me?

Thanks in advance! Harathi

Shreeshrii commented 4 years ago

https://github.com/UB-Mannheim/tesseract/wiki

Shreeshrii commented 4 years ago

The above has all training tools.
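
One way to confirm that an installation actually put the training tools on your PATH is a quick lookup like the sketch below. The tool names listed (`lstmtraining`, `combine_tessdata`, `text2image`) are an assumed sample of the training executables, not a complete list, and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Assumption: a sample of Tesseract training executables; extend as needed.
TRAINING_TOOLS = ("lstmtraining", "combine_tessdata", "text2image")


def missing_tools(names=TRAINING_TOOLS):
    """Return the tool names that cannot be found on the current PATH."""
    # shutil.which also honours PATHEXT on Windows, so "lstmtraining"
    # matches "lstmtraining.exe" there.
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed training tools were found on PATH.")
```

If a tool is reported missing, the install directory may simply not be on PATH; that is worth checking before reinstalling.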

lstmbox is a config file. You need to make sure that you don't have an old version of tessdata (the directory referred to by TESSDATA_PREFIX).
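
To see which tessdata your setup is pointing at and whether it contains the lstmbox config, something like the following sketch may help. It assumes the config lives in a `configs` subfolder, and it tries both layouts (TESSDATA_PREFIX being the tessdata folder itself, or its parent), since that can differ between versions. `find_tess_config` is a hypothetical helper name.

```python
import os
from pathlib import Path
from typing import Optional


def find_tess_config(prefix: str, name: str = "lstmbox") -> Optional[Path]:
    """Return the path of a config file under prefix, or None if absent."""
    root = Path(prefix)
    # Assumption: TESSDATA_PREFIX may be the tessdata directory itself or
    # its parent, depending on the version, so both layouts are checked.
    candidates = (
        root / "configs" / name,
        root / "tessdata" / "configs" / name,
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix is None:
        print("TESSDATA_PREFIX is not set")
    else:
        found = find_tess_config(prefix)
        print(found if found else f"lstmbox not found under {prefix}")
```

If the file is not found, TESSDATA_PREFIX is probably still pointing at an older tessdata folder.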

Harathi123 commented 4 years ago

Thanks for the reply! I am planning to use Tesseract 4.0 beta. I am confused by the many links on the page you shared. Can you please point me to the right one for the training tools? Also, regarding tessdata, I installed this: https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-4.0.0dev-20170510.exe With that, I didn't get lstmbox in tessdata/configs. Do I need to download the tessdata folder separately?

Thanks! Harathi

Shreeshrii commented 4 years ago

Why do you want to use tesseract 4.0 beta? That is pretty old. You can't use the latest features with code from 3 years ago.

https://github.com/UB-Mannheim/tesseract/wiki says

The latest installers can be downloaded here:

http://bhajans.ramparivar.com

Harathi123 commented 4 years ago

OK, thanks. I will try with the latest version. Do I need to install any training tools separately to train with it, or will the Tesseract installation include all the required files? Thanks! Harathi