tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)
https://tesseract-ocr.github.io/
Apache License 2.0
62.14k stars, 9.5k forks

LSTM training with Tesseract stops iterating without any error message #3014

Open xiongwenliu opened 4 years ago

xiongwenliu commented 4 years ago

Operating system: Windows 10 / x64
Tesseract version: v5.0.0.20190623
Training font: Trained models with support for legacy and LSTM OCR engine

My steps:
1. Use jTessBoxEditor to generate the TIF.
2. tesseract a2.test.exp0.tif a2.test.exp0 -l chi_sim --psm 6 lstmbox
3. Use jTessBoxEditor to correct the characters.
4. tesseract a2.test.exp0.tif a2.test.exp0 -l chi_sim --psm 6 lstm.train
5. Generate a2.training_files.txt.
6. combine_tessdata -e chi_sim.traineddata chi_sim.lstm
7. lstmtraining --model_output="D:\Users\Welcome\Desktop\ocr\xl" --continue_from="D:\Users\Welcome\Desktop\ocr\xl\chi_sim.lstm" --train_listfile="D:\Users\Welcome\Desktop\ocr\xl\a2.training_files.txt" --traineddata="D:\Users\Welcome\Desktop\ocr\xl\chi_sim.traineddata" --debug_interval -1 --target_error_rate 5
8. lstmtraining --stop_training --continue_from="D:\Users\Welcome\Desktop\ocr\xl\output_checkpoint" --traineddata="D:\Users\Welcome\Desktop\ocr\xl\chi_sim.traineddata" --model_output="D:\Users\Welcome\Desktop\ocr\xl\my_chi_sim.traineddata"
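Step 5 does not say how a2.training_files.txt is produced. As a minimal sketch, assuming the list file is simply one path per line pointing at the .lstmf files written by step 4, it could be generated as follows. The function name write_training_list and the choice of absolute paths are illustrative assumptions, not part of the original report.

```python
from pathlib import Path


def write_training_list(data_dir, list_name="a2.training_files.txt"):
    """Write one .lstmf path per line into list_name and return those paths.

    Assumption: every *.lstmf file in data_dir belongs to the training set.
    """
    data_dir = Path(data_dir)
    # Sort so the list is reproducible between runs.
    lstmf_files = sorted(p.resolve() for p in data_dir.glob("*.lstmf"))
    list_path = data_dir / list_name
    with open(list_path, "w", encoding="utf-8") as handle:
        for path in lstmf_files:
            handle.write(f"{path}\n")
    return lstmf_files
```

Calling write_training_list(r"D:\Users\Welcome\Desktop\ocr\xl") would write the list next to the training files. If it returns an empty list, step 4 produced no .lstmf output and training has nothing to read.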

Results: [image]

When I use the best traineddata instead, there is no problem, but it is not supported by tess4j-3.4.8 in Java. [image]

xiongwenliu commented 4 years ago

@Shreeshrii Hello, is there a good solution?

BlingHe commented 4 years ago

I encountered the same problem. Did you solve it?

stweil commented 4 years ago

我遇到了同样的问题 请问你解决了吗

Translation: "I encountered the same problem. Did you solve it?"

Shreeshrii commented 4 years ago

@xiongwenliu Sorry, without seeing the files you are using, it is difficult to guess what the problem is.

How many images are you training with? How many lines of data?
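One way to answer these two questions is to inspect the list file and the box files directly. The sketch below is hedged: summarize_training_list and count_box_text_lines are hypothetical helper names, and the line count assumes the box files are in the lstmbox format, where each text line ends with a record that begins with a tab character.

```python
from pathlib import Path


def summarize_training_list(list_file):
    """Report how many files the training list names and which look unusable.

    Relative entries are resolved against the current working directory.
    """
    text = Path(list_file).read_text(encoding="utf-8")
    entries = [line.strip() for line in text.splitlines() if line.strip()]
    missing = [e for e in entries if not Path(e).is_file()]
    empty = [e for e in entries
             if Path(e).is_file() and Path(e).stat().st_size == 0]
    return {"listed": len(entries), "missing": missing, "empty": empty}


def count_box_text_lines(box_file):
    """Count text lines in a box file.

    Assumption: lstmbox format, where an end-of-line record starts with a tab.
    """
    with open(box_file, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith("\t"))
```

A list that names a single file, or a box file with only a handful of text lines, would suggest that the training set is very small.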

zoui520 commented 3 years ago

I encountered the same problem. Did you solve it?