JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
https://www.jaided.ai
Apache License 2.0
24.74k stars 3.18k forks

Fine-Tune on Korean handwritten dataset #986

Open khawar-islam opened 1 year ago

khawar-islam commented 1 year ago

Dear @JaidedTeam

I would like to fine-tune EasyOCR on handwritten Korean. I am assuming that the pre-trained model has already been trained on Korean and English vocabulary, so I only want to improve its accuracy on handwritten Korean. How do I achieve this? I know how to train custom models, but because the English datasets are so large, I don't want to train on Korean and English from scratch. I already have 10M Korean handwritten images.

Regards, Khawar

MdotO commented 1 year ago

Just my two cents, if it's of any use. I assume you don't need to train the detector model, as it should already be very accurate. You could freeze the feature extraction layer, and maybe even the sequence modeling layer, of the recognition model and then fine-tune only on the Korean dataset with a relatively small learning rate. Modern ML models have plenty of methods to reduce multivariate shift, so this approach can work IMO (it did lead to a relatively better Thai language model on my side).

khawar-islam commented 1 year ago

Dear @MdotO, thanks for your answer. Yes, you are right: I am working on the recognition model and am currently fine-tuning the Korean recognition model with approximately 6M images. I changed several parameters in the .yaml file; if you can recommend better values, please do. At the moment, Best_accuracy at [214000/900000] is 6.667 and the model is no longer improving. I already got 6.667 at [30000/900000].

YAML File

batch_size: 128 #32
workers: 16
num_iter: 900000
lr: 1.
# Model Architecture
Transformation: 'None'
FeatureExtraction: 'VGG'
SequenceModeling: 'BiLSTM'
Prediction: 'CTC'
num_fiducial: 20
input_channel: 1
output_channel: 256
hidden_size: 256
decode: 'greedy'
new_prediction: True
freeze_FeatureFxtraction: False
freeze_SequenceModeling: False
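As a rough sense of scale for the settings above, the sketch below estimates how many passes over the data a given iteration count corresponds to. It assumes each iteration consumes exactly one batch of `batch_size` images and that the dataset holds about 6M images (the approximate figure given in this comment); `epochs_seen` is a hypothetical helper, not part of the EasyOCR trainer.

```python
# Values taken from the posted YAML; dataset_size is the approximate
# "6M images" figure from the comment, so the results are estimates only.
batch_size = 128
num_iter = 900_000
dataset_size = 6_000_000


def epochs_seen(iteration: int, batch_size: int, dataset_size: int) -> float:
    """Approximate number of passes over the dataset after `iteration` steps.

    Assumes one batch of `batch_size` images is consumed per iteration.
    """
    return iteration * batch_size / dataset_size


# Iterations mentioned in the thread, plus the configured maximum.
for iteration in (30_000, 214_000, num_iter):
    print(f"iteration {iteration}: ~{epochs_seen(iteration, batch_size, dataset_size):.2f} epochs")
```

Under these assumptions, the best accuracy was reached before one full pass over the data, and the full 900000 iterations would amount to roughly 19 passes.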

MdotO commented 1 year ago

Is 6.67 the accuracy or some loss value? It looks like a loss, since an accuracy of 6.67% would be extremely low. There may be many things to try that don't involve the YAML (I am not sure of those details), but even for the parameters themselves there are a few things to try out!

  1. I believe the lr is too high: it should be around 0.001 (you can increase or decrease it tenfold from there).
  2. You can freeze the two layers by setting both freeze_FeatureFxtraction and freeze_SequenceModeling to True, or at least the feature extraction one.
  3. However, for 2 to work, you need to use JaidedAI's Korean-trained weights as the initial weights by specifying saved_model: 'path_to_korean_lang_weights' in the YAML file; otherwise there is no use in freezing (in fact, it will have adverse effects).
  4. You may have a look at JaidedAI's sample "filtered" YAML file to see all expected fields, especially the lang char field, as it determines the language characters you want to detect. (Not specifying it will, I guess, lead to automatic determination, but it's good to specify it yourself IMO.) Good luck!
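The first three suggestions above can be sketched as a small Python helper that rewrites the relevant config entries. This is only an illustration under assumptions: `apply_finetune_suggestions` is a hypothetical function, the config is modeled as a plain dict, the key spellings (including `freeze_FeatureFxtraction`) are copied from the YAML posted in this thread, and the weights path is the placeholder from suggestion 3.

```python
from copy import deepcopy

# The relevant entries from the YAML posted earlier in this thread.
posted_config = {
    "lr": 1.0,
    "freeze_FeatureFxtraction": False,
    "freeze_SequenceModeling": False,
}


def apply_finetune_suggestions(config, weights_path, lr=0.001, freeze_sequence=True):
    """Return a copy of `config` with suggestions 1 to 3 applied.

    Hypothetical helper for illustration; not part of EasyOCR.
    """
    if not weights_path:
        # Suggestion 3: freezing layers is pointless without pretrained weights.
        raise ValueError("freezing layers requires pretrained weights (saved_model)")
    updated = deepcopy(config)
    updated["lr"] = lr                                   # suggestion 1
    updated["freeze_FeatureFxtraction"] = True           # suggestion 2
    updated["freeze_SequenceModeling"] = freeze_sequence # suggestion 2 (optional)
    updated["saved_model"] = weights_path                # suggestion 3
    return updated


finetune_config = apply_finetune_suggestions(posted_config, "path_to_korean_lang_weights")
for key, value in finetune_config.items():
    print(f"{key}: {value}")
```

Passing `freeze_sequence=False` corresponds to freezing only the feature extraction layer, which is the minimum suggested in point 2.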

khawar-islam commented 1 year ago

Thank you for your suggestions. 6.67 is the best accuracy, as you can see below.

[131000/900000] Train loss: 0.00137, Valid loss: 8.52225, Elapsed_time: 175221.94042
Current_accuracy : 3.333, Current_norm_ED  : 0.5855
Best_accuracy    : 6.667, Best_norm_ED     : 0.6071

  1. Yes, I changed lr from .1 to 0.001.
  2. Yes, I have frozen them now. My FeatureExtraction is VGG because I am using the korean_g2.pth model.
  3. Yes, I provide saved_model: /media/cvpr/CM_24/EasyOCR/easyocr/model/korean_g2.pth
  4. Yes, I have changed some fields in the .yml file and have now started fine-tuning again.

Once again, thanks for your suggestions. Now let's wait for the results.
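The log lines quoted in this comment can be read programmatically to compare training and validation loss. The sketch below is a minimal example, assuming the log keeps the `name: value` layout shown above; `parse_metrics` is a hypothetical helper, not part of the EasyOCR trainer.

```python
import re

# Log excerpt copied from this comment.
log_text = """\
[131000/900000] Train loss: 0.00137, Valid loss: 8.52225, Elapsed_time: 175221.94042
Current_accuracy : 3.333, Current_norm_ED  : 0.5855
Best_accuracy    : 6.667, Best_norm_ED     : 0.6071
"""


def parse_metrics(text: str) -> dict:
    """Collect every 'name: number' pair in the log text into a dict of floats."""
    pattern = re.compile(r"([A-Za-z_ ]+?)\s*:\s*([0-9]+(?:\.[0-9]+)?)")
    return {name.strip(): float(value) for name, value in pattern.findall(text)}


metrics = parse_metrics(log_text)
loss_ratio = metrics["Valid loss"] / metrics["Train loss"]
print(f"train loss: {metrics['Train loss']}, valid loss: {metrics['Valid loss']}")
print(f"valid/train loss ratio: {loss_ratio:.0f}")
```

A validation loss that is thousands of times larger than the training loss usually points to overfitting or to a mismatch between the training and validation data (for example, characters or labels that differ between the two sets), so that gap may be worth checking alongside the parameter changes.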

akiyomov commented 11 months ago

Hello, how was the result?

khawar-islam commented 11 months ago

@akiyomov not that bad