I have been trying to train Tesseract to read the font on an LED screen, which has slightly differently shaped characters.
My process so far: 1) Installed Tesseract and made sure it was running. 2) Cloned tesstrain and added eng.traineddata from the tessdata_best repo to the data folder. 3) Used the code from this video (https://www.youtube.com/watch?v=KE4xEzFGSU8) to generate the ground-truth folder for all 195k lines in eng.training_text. 4) Copied the ground-truth folder into tesstrain/data, having run make tesseract-langdata beforehand so that the langdata folder was already in place.
After all this I ran the following command:
make training MODEL_NAME=abs START_MODEL=eng TESSDATA='tessdata path here' MAX_ITERATIONS=20000
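For anyone reproducing this, here is a minimal sketch of how the ground-truth folder could be sanity-checked before training. It is not part of tesstrain; the function name check_ground_truth is made up for illustration. It assumes the usual tesstrain layout, where each line image (plain .tif or .png) sits next to a transcription with the same base name and a .gt.txt extension, and it only reports mismatches without changing anything.

```python
from pathlib import Path

# Assumption: line images use a plain .tif or .png extension.
IMAGE_SUFFIXES = (".tif", ".png")
TEXT_SUFFIX = ".gt.txt"


def check_ground_truth(ground_truth_dir):
    """Report unpaired or empty files in a ground-truth folder.

    Returns three sorted lists of base names:
    images without a transcription, transcriptions without an image,
    and transcriptions that contain only whitespace.
    """
    root = Path(ground_truth_dir)

    # Base names of all line images, e.g. "line_1" for "line_1.tif".
    image_stems = {
        p.stem for p in root.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    }

    # Map base name -> transcription path, e.g. "line_1" -> line_1.gt.txt.
    text_files = {
        p.name[: -len(TEXT_SUFFIX)]: p for p in root.glob("*" + TEXT_SUFFIX)
    }
    text_stems = set(text_files)

    missing_text = sorted(image_stems - text_stems)
    missing_image = sorted(text_stems - image_stems)
    empty_text = sorted(
        stem
        for stem, path in text_files.items()
        if not path.read_text(encoding="utf-8").strip()
    )
    return missing_text, missing_image, empty_text
```

With the model name from the command above, the folder would presumably be data/abs-ground-truth (an assumption based on tesstrain's MODEL_NAME-ground-truth naming), so the call would look like check_ground_truth("data/abs-ground-truth"). Three empty lists mean every image has a non-empty transcription and vice versa.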
I have now done this more than once and have never achieved an error rate under 50%. I am not sure what I am doing wrong, or whether this error rate is normal. If anyone has suggestions, or if the post is missing something, please let me know.