http://www.danvk.org/2015/01/11/training-an-ocropus-ocr-model.html This article is also helpful.
I don't understand your question. Please describe more clearly what you did and what didn't work.
I am working on cursive handwritten text recognition. So far ocropus works fine for printed text, but cursive handwriting is not recognized well with the default model, i.e. en-default.gz, so we need to build a model for handwriting. Do you have any ideas on this?
Create ground truth with your italics text, see https://github.com/tmbdev/ocropy/wiki/Working-with-Ground-Truth , and then train with it. See also the links at https://github.com/tmbdev/ocropy/wiki .
The IAM database comes with ground truth for each text line.
Example of a line image and its corresponding ground truth:
Image: https://imgur.com/cVPg0Qo
Text: A MOVE to stop Mr. Gaitskell from
Text filename: a01-000u-00.gt.txt
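Before starting a long training run it can help to confirm that every line image has a transcript next to it. The following is a minimal sketch, not part of ocropy: the function name `find_unpaired_lines` is hypothetical, and it assumes the transcript shares the image name up to the first dot and ends in `.gt.txt`, as in the a01-000u-00.gt.txt example.

```python
from pathlib import Path


def find_unpaired_lines(train_dir):
    """Return names of line images in train_dir without a usable transcript."""
    missing = []
    for image in sorted(Path(train_dir).glob("*.png")):
        # Assumption: the transcript shares the image name up to the first dot,
        # e.g. a01-000u-00.png or a01-000u-00.bin.png -> a01-000u-00.gt.txt
        base = image.name.split(".", 1)[0]
        truth = image.with_name(base + ".gt.txt")
        # A missing or empty transcript gives the trainer nothing to learn from.
        if not truth.is_file() or not truth.read_text(encoding="utf-8").strip():
            missing.append(image.name)
    return missing
```

Calling `find_unpaired_lines("../data/IAM_database/traindata")` should return an empty list if every image in that directory is paired with a non-empty transcript.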
This is what I tried in order to train on the IAM database:
python ocropus-rtrain --load models/en-default.pyrnn.gz -o ../train_models/IAM_full/my_models ../data/IAM_database/traindata/*.png --ntrain 500000
I loaded the default English model and trained on top of it, as suggested in the section "Training with the default model" in http://www.danvk.org/2015/01/11/training-an-ocropus-ocr-model.html
Here are my observations:
1.) Ocropy is tailored for printed documents, for handwritten text see also: https://github.com/tmbdev/ocropy/wiki/FAQ#can-ocropus-be-used-for-handwritten-text-recognition
2.) Training in ocropy is not that fast, but you can also look at the C++ implementation https://github.com/tmbdev/clstm
I want to create a new trained model like 'en-default.pyrnn.gz'. How do I do this? I have my training data as PNG files, so please help me create a model. Environment: