help for bad result on test data

JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Apache License 2.0

23.5k stars, 3.08k forks

I would guess your dataset is not diverse enough, so the model overfits to the custom data you are feeding it. To deal with this you need a large, diverse dataset, which you can build with synthetic data generation, for example. I wrote an article on Towards AI about that, if you are interested: https://pub.towardsai.net/how-to-make-a-synthesized-dataset-to-fine-tune-your-ocr-3573f1a7e08b. I would also try training less on your data (for example, by lowering the learning rate or fine-tuning for fewer epochs/iterations), as this can help prevent overfitting.
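To make "fine-tuning for fewer iterations" a bit more concrete, here is a minimal sketch of one common way to pick a stopping point: evaluate on held-out data at regular intervals and keep the checkpoint from just before validation loss stops improving (early stopping). This is not part of EasyOCR's training code; the helper `best_stopping_point`, its `patience` and `min_delta` parameters, and the loss values are all hypothetical and only illustrate the idea.

```python
from typing import Iterable, Optional


def best_stopping_point(val_losses: Iterable[float], patience: int = 3,
                        min_delta: float = 0.0) -> Optional[int]:
    """Return the index of the best validation loss seen before
    `patience` evaluations in a row fail to improve on it.

    Hypothetical helper for illustration; not an EasyOCR function.
    """
    best_idx = None
    best_loss = float("inf")
    stale = 0  # evaluations since the last improvement
    for idx, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_idx, best_loss, stale = idx, loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # validation loss has stopped improving
    return best_idx


# Made-up validation losses, one per evaluation interval (assumption).
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.40]
stop_at = best_stopping_point(losses, patience=3)
print(f"Keep the checkpoint from evaluation {stop_at}")
```

In this made-up run the loss rises for three evaluations after index 3, so the sketch keeps that checkpoint and ignores anything later. A rising validation loss while training loss keeps falling is the usual sign of the overfitting described above.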


help for bad result on test data #941