tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0

How does one turn (.tif, .gt.txt, .box) into (.lstmf) #339

Closed Turbine1991 closed 1 year ago

Turbine1991 commented 1 year ago

I've really been banging my head against the wall on this one.

After converting .ttf fonts into .box and .tif files, using a .gt.txt file for the generated text, I've been unable to find an up-to-date method for taking the next step.

I've followed everything I could from the README.md, to no avail.

When I run what I was told was the next step, I get this error:

TESDATA_PREFIX=/home/nom/src/tesstrain/tessdata/ make training MODEL_NAME=eng START_MODEL=eng TESSDATA=/home/nom/src/tesstrain/tessdata/ MAX_ITERATIONS=2000

make: *** No rule to make target 'data/eng-ground-truth/eng.training_text.lstmf', needed by 'data/eng/all-lstmf'.  Stop.

I've finished my application; however, training it for the correct fonts is torture.

Turbine1991 commented 1 year ago

I misunderstood the instructions: each individual .tif file needs its own .gt.txt file.
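For anyone hitting the same error, here is a rough sketch of a pairing check that can be run on the ground-truth folder before calling make. It assumes the layout described in the tesstrain README, where every line image has a transcription with the same name and the image extension replaced by .gt.txt. The function name find_unpaired is made up for this example and is not part of tesstrain.

```python
from pathlib import Path

IMAGE_SUFFIX = ".tif"    # line images (assumed to be TIFF, as in this issue)
TEXT_SUFFIX = ".gt.txt"  # one single-line transcription per image


def find_unpaired(ground_truth_dir):
    """Return (images without a .gt.txt, .gt.txt files without an image).

    Hypothetical helper for illustration only; not part of tesstrain.
    """
    root = Path(ground_truth_dir)
    names = [p.name for p in root.iterdir() if p.is_file()]

    # Strip the known suffix to get the shared base name of each pair.
    image_stems = {n[:-len(IMAGE_SUFFIX)] for n in names if n.endswith(IMAGE_SUFFIX)}
    text_stems = {n[:-len(TEXT_SUFFIX)] for n in names if n.endswith(TEXT_SUFFIX)}

    missing_text = sorted(s + IMAGE_SUFFIX for s in image_stems - text_stems)
    missing_image = sorted(s + TEXT_SUFFIX for s in text_stems - image_stems)
    return missing_text, missing_image


# Example call (the path is the one from the error message above):
# missing_text, missing_image = find_unpaired("data/eng-ground-truth")
```

A .gt.txt file that shows up in the second list has no image next to it. The error above would be consistent with a single eng.training_text.gt.txt sitting in data/eng-ground-truth without a matching eng.training_text.tif, though that is a guess from the file name in the message.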