clovaai / deep-text-recognition-benchmark

Text recognition (optical character recognition) with deep learning methods, ICCV 2019
Apache License 2.0

Fine Tuning model for national ID cards #301

Closed: amroghoneim closed this issue 2 years ago

amroghoneim commented 3 years ago

Hello, I am aiming to fine-tune this model for my use case, which is text extraction from national ID cards. I am somewhat new to this, so any advice would be appreciated. The IDs are in Arabic. How should I go about fine-tuning the model? From what I understand, the model will train only on the training data I provide, but I want to start from a model that already gives good predictions for Arabic and fine-tune it for the use case above. My plan is to crop the words and numbers from my IDs and use each cropped word or number as a training sample. Is this the best approach, especially for Arabic?
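For the word-crop approach described above, here is a minimal sketch of the labeling step, assuming the crops have already been saved as image files. It writes the tab-separated `{imagepath}\t{label}` ground-truth file that the repository's README describes as input to `create_lmdb_dataset.py`. The helper name `write_gt_file`, the file paths, and the sample labels are all hypothetical and only illustrate the layout.

```python
import os
import tempfile


def write_gt_file(samples, gt_path):
    """Write one 'imagepath<TAB>label' line per sample; return how many were written."""
    written = 0
    # UTF-8 so that Arabic labels are stored unchanged.
    with open(gt_path, "w", encoding="utf-8") as f:
        for image_path, label in samples:
            label = label.strip()
            # Skip empty labels and labels that would break the one-line-per-sample layout.
            if not label or "\t" in label or "\n" in label:
                continue
            f.write(f"{image_path}\t{label}\n")
            written += 1
    return written


# Hypothetical crops: one word or number per image, paired with its transcription.
samples = [
    ("crops/id001_name.png", "محمد"),
    ("crops/id001_number.png", "٢٩٠٠١٠١"),
    ("crops/id002_blank.png", "   "),  # empty after stripping, so it is skipped
]

gt_path = os.path.join(tempfile.mkdtemp(), "gt.txt")
count = write_gt_file(samples, gt_path)
print(f"wrote {count} labels to {gt_path}")
```

Every character that appears in the labels would also need to be part of the character set passed to training, otherwise those samples cannot be decoded.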

amroghoneim commented 3 years ago

Any help would be appreciated.

nikkhilAvira commented 2 years ago

Hi, did you end up figuring this out?

amroghoneim commented 2 years ago

@nikkhilAvira Hey, apologies for the late reply. I put the project on hold due to other priorities. I started by using the text generator library recommended by EasyOCR to generate data similar to what I need, and used that data to train a custom model (I stopped there, before getting good results). In the training scripts provided (also within EasyOCR), I remember that the config file gave the option to freeze the backbone of the model, so you can benefit from the pretrained backbone while training only the final layers.
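The backbone-freezing idea can be sketched in plain PyTorch as follows. This is not the EasyOCR training script or this repository's model: `TinyRecognizer`, its `backbone` and `head` attributes, and the dummy batch are hypothetical stand-ins, chosen only to show how `requires_grad` keeps pretrained feature layers fixed while the final layer is trained.

```python
import torch
import torch.nn as nn


class TinyRecognizer(nn.Module):
    """Hypothetical stand-in for a recognizer: a conv backbone plus a linear head."""

    def __init__(self, num_classes: int = 40):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 16)),  # collapse height, keep 16 horizontal steps
        )
        self.head = nn.Linear(8, num_classes)

    def forward(self, x):
        feats = self.backbone(x)                   # (B, 8, 1, 16)
        feats = feats.squeeze(2).permute(0, 2, 1)  # (B, 16, 8)
        return self.head(feats)                    # (B, 16, num_classes)


def freeze_backbone(model):
    """Disable gradients for the backbone and return the parameters left trainable."""
    for param in model.backbone.parameters():
        param.requires_grad = False
    return [p for p in model.parameters() if p.requires_grad]


model = TinyRecognizer()
# In a real run, pretrained weights would be loaded here before freezing.
trainable = freeze_backbone(model)
optimizer = torch.optim.Adam(trainable, lr=1e-4)

# One training step on random data, just to show that only the head receives gradients.
images = torch.randn(2, 1, 32, 100)
targets = torch.randint(0, 40, (2, 16))
logits = model(images)
loss = nn.functional.cross_entropy(logits.reshape(-1, 40), targets.reshape(-1))
loss.backward()
optimizer.step()
```

Passing only the trainable parameters to the optimizer keeps the frozen weights out of the update entirely. Whether to freeze everything except the final layer or to unfreeze more of the network later is a judgment call that depends on how much ID-card data is available.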

Mahmuod1 commented 2 years ago

Hello Amr, could you share the dataset you generated with the text generator library recommended by EasyOCR?