Open cvarun94 opened 2 years ago
Text recognition models usually cannot reach satisfactory performance without sufficient data. A common practice is to train the model on the ST (SynthText) and MJ (MJSynth) datasets and then fine-tune it on the training set of the specific task. You can use our pretrained models as your starting point.
Hi! Thanks for the tip. How can I use the pretrained model to fine-tune it on my custom data? I couldn't find any resources for it. Also, what is the purpose of the "repeat" parameter? I didn't fully understand it.
Hi @cvarun94, you can refer to this Unofficial Notebook. A note about the `repeat` parameter is also included in the Notebook.
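As a rough sketch of the fine-tuning setup asked about above: assuming the model is trained from an OpenMMLab-style Python config file (an assumption, since the thread does not show the config), pretrained weights are usually picked up by pointing the `load_from` field at the downloaded checkpoint. The path below is hypothetical.

```python
# Fine-tuning config fragment (sketch, assuming an OpenMMLab-style config).
# Hypothetical path: replace it with the location of the downloaded
# pretrained checkpoint.
load_from = 'checkpoints/abinet_pretrained.pth'
```

The rest of the config (dataset paths, schedule) would still need to point at the custom data; the Notebook mentioned above covers those parts.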
@balandongiv thank you so much. Honestly, that unofficial notebook was very helpful.
Also, I have my own custom dataset: cropped images of handwritten dates in different formats, such as "12-January-2022", "12-01-2022", "12-Jan-2022", "12/01/2022" and so on. In such cases, will I have to provide my own custom dictionary of characters? Where can I find resources on training with custom data and on the related configs that need to be changed?
Thanks
I am not an expert, but based on this discussion, I would definitely use the default 'DICT90'. By the way, please ignore the section about `dict` in the Notebook. I used that particular cell to ask a question in this forum (I have disabled those lines to avoid confusion).
The unofficial notebook should help you understand how to train on your custom dataset.
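Whichever dictionary is used, it can help to check that it covers every character in the labels before training. Below is a minimal sketch of such a check. The helper name `uncovered_characters` and the candidate dictionary are hypothetical and are not the contents of 'DICT90'. The sample labels follow the date formats from the question, with a leading space added deliberately to one of them to show how stray whitespace shows up.

```python
import string
from collections import Counter


def uncovered_characters(labels, dictionary):
    """Count characters that occur in the labels but not in the dictionary."""
    allowed = set(dictionary)
    missing = Counter()
    for text in labels:
        for ch in text:
            if ch not in allowed:
                missing[ch] += 1
    return missing


# Sample labels in the date formats from the question. The leading space in
# the second label is deliberate, to illustrate an uncovered character.
labels = ["12-January-2022", " 12-01-2022", "12-Jan-2022", "12/01/2022"]

# Hypothetical candidate dictionary: digits, ASCII letters, two separators.
candidate = string.digits + string.ascii_letters + "-/"

missing = uncovered_characters(labels, candidate)
print(dict(missing))  # prints {' ': 1}
```

An empty result means every label character is in the dictionary. Anything reported here either needs to be added to the dictionary or cleaned out of the annotation file.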
Hi, I am using ABINet to train on the RRC Focussed Text 2013 challenge dataset (download link: https://rrc.cvc.uab.es/?ch=2&com=downloads). After several tweaks to the config files I was able to train it, but the loss always hovers around 8 and the predictions are poor.
Any help, tips, or suggestions would be appreciated. I am attaching the config file and a sample prediction for reference.
Thanks
Prediction--
config file-- varun_abinet_config_30062022.txt