Closed by SaddamBInSyed 4 years ago
To use a different model, specify extra_cmdline_params="-l osd" (assuming osd.traineddata is the new model you created).
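A minimal sketch of how that parameter could be passed from Python, assuming the model file is already in a tessdata folder that Tesseract can find. The helper names lang_params and read_with_model are hypothetical and exist only for this illustration; read_mrz and its extra_cmdline_params argument come from PassportEye.

```python
def lang_params(model_name: str) -> str:
    """Build the Tesseract language flag for a model (hypothetical helper).

    Accepts either "osd" or "osd.traineddata" and returns "-l osd".
    """
    suffix = ".traineddata"
    if model_name.endswith(suffix):
        model_name = model_name[: -len(suffix)]
    return f"-l {model_name}"


def read_with_model(image_path: str, model_name: str):
    """Run the PassportEye pipeline with a specific Tesseract model.

    The import is done lazily so that the helper above can be used
    even where PassportEye is not installed.
    """
    from passporteye import read_mrz

    # Returns an MRZ object, or None when no MRZ could be extracted.
    return read_mrz(image_path, extra_cmdline_params=lang_params(model_name))
```

For example, read_with_model("id_card.jpg", "osd") would run the pipeline with "-l osd"; the image file name here is a placeholder.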
As for improving accuracy: besides training a dedicated Tesseract model (although, I must admit, I do not know of examples where anyone obtained statistically significant benefits from custom models), you could make sure the input images are as clear as possible.
One common issue that the current implementation handles very poorly is a document lying on some kind of patterned background (e.g. a table).
You can try running the mrz script with the --save_roi parameter on the badly recognized examples and examine the regions extracted by the pipeline. If the region is correct (i.e. includes the actual MRZ in the correct orientation), tuning Tesseract is the way to go. If the region is usually incorrect, then the problem lies in the image preprocessing.
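The same check can be sketched from Python for a batch of images. This assumes read_mrz(..., save_roi=True) exposes the extracted region as mrz.aux["roi"], and that the region converts cleanly with scikit-image's img_as_ubyte; the helper names and the roi_debug output folder are hypothetical.

```python
from pathlib import Path


def roi_output_path(image_path, out_dir="roi_debug"):
    """Hypothetical naming scheme: roi_debug/<image stem>_roi.png."""
    return Path(out_dir) / f"{Path(image_path).stem}_roi.png"


def dump_roi(image_path, out_dir="roi_debug"):
    """Save the region the pipeline extracted, for manual inspection.

    Returns the written path, or None if the pipeline found no MRZ region.
    """
    # Lazy imports: only needed when the pipeline is actually run.
    from passporteye import read_mrz
    from skimage import img_as_ubyte, io

    mrz = read_mrz(str(image_path), save_roi=True)
    if mrz is None:
        # No region at all: the problem is in the image preprocessing.
        return None

    target = roi_output_path(image_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: the ROI is a grayscale array that img_as_ubyte accepts.
    io.imsave(str(target), img_as_ubyte(mrz.aux["roi"]))
    return target
```

Looking through the saved files shows whether the failures come from a wrong region or from the OCR step on a correct region.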
If you discover a useful way to process images that you think should be added to the current PassportEye pipeline, let me know!
Hi @konstantint
Thanks for your work. I am using this library to extract MRZ values from national ID cards. I am not satisfied with the performance I am currently getting from the MRZPipeline() class, so I would like to ask
I have installed Tesseract 4.0 (tesseract-ocr-setup-4.00.00dev.exe) and I get the files below in the tessdata folder
Please advise.