Closed (josef821 closed 2 years ago)
The tesstrain repo is useful when you have scanned line images and their ground-truth transcriptions.
Alternatively, use text2image and the lstm.train config to create .lstmf files (the tesstrain.sh bash script does this). You will need to run lstmtraining after that.
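The sequence described above can be sketched as a list of commands. This is a minimal illustration, not the tesstrain.sh implementation: the helper name `synthetic_pipeline`, the file layout, the default fonts directory, and the iteration count are all assumptions, and the function only builds the command lines without running them.

```python
from pathlib import Path


def synthetic_pipeline(lang, font, text_file, work_dir, best_traineddata,
                       fonts_dir="/usr/share/fonts", max_iterations=400):
    """Build (but do not run) the commands for the text2image route.

    All paths and defaults are illustrative assumptions.
    """
    work = Path(work_dir)
    base = work / f"{lang}.{font.replace(' ', '_')}.exp0"
    lstm = work / f"{lang}.lstm"
    # Assumed to be a text file you write yourself, listing the
    # generated .lstmf files one per line.
    listfile = work / f"{lang}.training_files.txt"
    return [
        # 1. Render the training text to <base>.tif and <base>.box.
        ["text2image", f"--font={font}", f"--text={text_file}",
         f"--outputbase={base}", f"--fonts_dir={fonts_dir}"],
        # 2. Turn the image and box file into <base>.lstmf.
        ["tesseract", f"{base}.tif", str(base), "lstm.train"],
        # 3. Extract the LSTM model from the existing traineddata.
        ["combine_tessdata", "-e", str(best_traineddata), str(lstm)],
        # 4. Fine-tune the extracted model on the new .lstmf files.
        ["lstmtraining",
         "--continue_from", str(lstm),
         "--traineddata", str(best_traineddata),
         "--train_listfile", str(listfile),
         "--model_output", str(work / "output" / lang),
         "--max_iterations", str(max_iterations)],
    ]


if __name__ == "__main__":
    # "Font A" and the paths below are placeholders.
    for cmd in synthetic_pipeline("fas", "Font A", "fas.training_text",
                                  "work", "tessdata_best/fas.traineddata"):
        print(" ".join(cmd))
```

Each command could be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.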
I want to add some new fonts to the fas model from tessdata_best. Which do you prefer: creating ground truth and using tesstrain, or using text2image and lstmtraining? Should fonts be used randomly during training, or should I train each font separately and combine the output files at the end?
> Should fonts be used randomly during training
Yes.
> create groundtruth and use tesstrain
Yes, because it will run lstmtraining for you.
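One way to combine both answers is to render each training line with a randomly chosen font and save it next to a `.gt.txt` transcription, which is the pairing tesstrain expects. The sketch below is only an illustration under stated assumptions: the function names, the font names, the output directory, and the image size values are placeholders, and `plan_ground_truth` only plans the work without touching the disk.

```python
import random
import subprocess
from pathlib import Path


def plan_ground_truth(lines, fonts, out_dir, fonts_dir="/usr/share/fonts", seed=0):
    """Plan one (gt_path, text, command) triple per training line.

    Nothing is written or executed here; see render() for that.
    """
    rng = random.Random(seed)  # fixed seed keeps the font assignment reproducible
    out_dir = Path(out_dir)
    plan = []
    for i, line in enumerate(lines):
        font = rng.choice(fonts)  # a random font per line mixes fonts in one run
        base = out_dir / f"line_{i:05d}"
        gt_path = out_dir / f"line_{i:05d}.gt.txt"
        cmd = [
            "text2image",
            f"--font={font}",
            f"--text={gt_path}",      # the transcription doubles as the input text
            f"--outputbase={base}",   # text2image writes <base>.tif and <base>.box
            f"--fonts_dir={fonts_dir}",
            "--max_pages=1",
            "--strip_unrenderable_words",
            "--xsize=3600",           # assumed page size for a single line
            "--ysize=480",
        ]
        plan.append((gt_path, line, cmd))
    return plan


def render(plan):
    """Write each transcription and run text2image on it."""
    for gt_path, line, cmd in plan:
        gt_path.parent.mkdir(parents=True, exist_ok=True)
        gt_path.write_text(line + "\n", encoding="utf-8")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder fonts and lines; this only prints the planned commands.
    demo = plan_ground_truth(["first line", "second line"],
                             ["Font A", "Font B"],
                             "data/mymodel-ground-truth")
    for _, _, cmd in demo:
        print(" ".join(cmd))
```

After calling `render(plan)`, the directory should hold image and `.gt.txt` pairs. With the directory named `data/MODEL_NAME-ground-truth`, running `make training MODEL_NAME=... START_MODEL=fas TESSDATA=...` in tesstrain would then run lstmtraining for you.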
Hi. I want to train new fonts and character images for the fin language, including characters with noise and at an angle. How can I use these files: desired_characters, fin.numbers, fin.punc, fin.singles_text, fin.training_text, fin.unicharambigs, fin.unicharset, fin.wordlist, okfonts.txt
to get .traineddata files like those in tessdata_best? Should I use tesstrain ( https://github.com/tesseract-ocr/tesstrain ) or use text2image to create box files and then train ( https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html )?