tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0

'make training' fails on default data #70

Closed ojmakhura closed 5 years ago

ojmakhura commented 5 years ago

I'm trying to learn how to train a Tesseract model. However, when I run make training, I get the following error, and I can't figure out why.

    mkdir -p data/checkpoints
    lstmtraining \
      --traineddata data/foo/foo.traineddata \
      --net_spec "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c`head -n1 data/unicharset`]" \
      --model_output data/checkpoints/foo \
      --learning_rate 20e-4 \
      --train_listfile data/list.train \
      --eval_listfile data/list.eval \
      --max_iterations 10000
    Failed to load list of training filenames from data/list.train
    make: *** [Makefile:146: data/checkpoints/foo_checkpoint] Error 1

The thing is, the file exists, it's just empty.

wrznr commented 5 years ago

We need more information to reproduce your problem.

wrznr commented 5 years ago

@ojmakhura Please reopen if the problem persists and you can provide more detailed information.

stweil commented 5 years ago

That file is typically empty if there is not enough ground truth available or found. Check data/all-lstmf to see whether that file is empty, too.
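The check suggested above can be sketched as a small shell helper. This is only an illustration, not part of tesstrain: the function name check_lists is made up here, and the three paths are the ones mentioned in this thread, so adjust them if your Makefile writes its lists elsewhere.

```shell
#!/bin/sh
# check_lists is a hypothetical helper (not part of tesstrain).
# It reports whether each given file exists and is non-empty,
# and returns non-zero if any file is missing or empty.
check_lists() {
    rc=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            rc=1
        elif [ ! -s "$f" ]; then
            echo "empty: $f"
            rc=1
        else
            # Count lines; tr strips the padding some wc versions add.
            echo "ok: $f ($(wc -l < "$f" | tr -d ' ') lines)"
        fi
    done
    return $rc
}

# Paths taken from this thread; they are assumptions about your setup.
check_lists data/all-lstmf data/list.train data/list.eval || true
```

If data/all-lstmf is reported as empty too, the empty data/list.train is most likely a consequence of no ground truth having been found rather than a problem with the list split itself.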

SourEyeglasses commented 5 years ago

I am experiencing the same problem. I am attempting OCR on English handwriting and am fine-tuning on top of the eng traineddata. I am using Tesseract 4.0.0-beta.1.

The error has only happened after training was interrupted: either I canceled it by accident, or some unknown connectivity problem stopped the training short of the max iterations.

I run this command to continue training:

    make training MODEL_NAME=eng STEPS=6000 STARTMODEL=eng

When I attempt to continue training I get this error:

    lstmtraining \
      --traineddata data/eng/eng.traineddate \
      --old_traineddata /home/ubuntu/ocrd-train/usr/share/tessdata/eng.traineddata \
      --continue_from data/eng/eng.lstm \
      --net_spec "[1,36,0,1 Ct3,c,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 01c'head -n1 data/unicharset']" \
      --model_output data/checkpoints/eng \
      --learning_rate 20e-4 \
      --train_listfile data/list.train \
      --eval_listfile data/list.train \
      --max_iterations 250000
    Failed to continue from: data/eng/eng.lstm
    Makefile:131 recipe for target 'data/checkpoints/eng_checkpoint' failed
    make: *** [data/checkpoint/engcheckpoints] Error

The only solution I have found is running make clean and restarting.