Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Hi Everyone,
I'm training the layout model on a custom dataset that has only two classes: Tables and Text.
I downloaded the CDLA pretrained model to fine-tune on my data.
The dataset has 1,200 documents in a non-Latin, non-Chinese language.
So far the model has not exceeded 20% accuracy.
What might be the issue?
Do I need to train for more epochs? I initially set it to 100.
Do I need to add more data?
What can be done to help the model generalize?