anavc94 opened this issue 4 years ago
Hello :)
These are open questions, and they probably all depend on your dataset, so they are hard to answer...
If I were you, I would start with an experiment using char74k. Training from scratch should work; I have trained on char74k before and it worked well. However, I am not sure whether that is better than fine-tuning from the pretrained model (I have not tested/compared the two).
Remember to change the image size from 32x100 to something like 32x32 for character recognition.
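For illustration only, here is a small Python/PIL sketch of what that change means on the data side: square character crops instead of the 32x100 word shape. In practice the repo's dataloader does the resizing for you once you set the image height/width options; the helper name and the padding choice below are just an example.

```python
from PIL import Image

def to_square_crop(path, size=32):
    """Resize a character crop to size x size, padding to preserve the aspect ratio."""
    img = Image.open(path).convert("L")
    scale = size / max(img.size)
    resized = img.resize((max(1, round(img.width * scale)),
                          max(1, round(img.height * scale))), Image.BICUBIC)
    # Paste the resized character onto a black square canvas, centered.
    canvas = Image.new("L", (size, size), color=0)
    canvas.paste(resized, ((size - resized.width) // 2, (size - resized.height) // 2))
    return canvas
```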
I wish you good luck. Best.
Hi @ku21fan !
Thank you for your suggestion! In fact, I was preparing a training run with the char74k dataset + ICDAR2003 + some images from my own dataset (about 100). Maybe I will add some more datasets in the future, such as ICDAR2005 or MNIST. Since you have not compared training from scratch vs. fine-tuning the pretrained model, I will report my results here so that everybody can have a reference, but it is good to know that you have been able to train from scratch with char74k :)
Regards! Ana
@anavc94, Hi. Did you try to predict words with character-level detection and recognition, i.e. learning to predict words while training only on a character dataset? Did it work? I tried it in my own language with a character dataset, and it learned to identify words with about 80% accuracy, but it did not learn to predict sequences of words. I wonder if you had any other results or interesting insights.
@ilyak93, Hi. I am trying character-level detection. If your code works with 80% accuracy, it would help me if you could share the code or the logic. Thank you.
I have the same issue: using the existing pretrained model (TPS-ResNet-BiLSTM-Attn.pth), some numeric characters cannot be recognized properly. I want to fine-tune (FT) it with my own data. Is there any source code for this? Please help me. Thank you.
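In case it helps, here is a minimal PyTorch sketch of the usual fine-tuning pattern: load the released checkpoint, keep only the tensors whose shapes still match your model (the prediction layer usually changes when the character set changes), and continue training with a small learning rate. `build_recognizer` and `my_characters` are placeholders for however you construct the model and character set; if I remember correctly, the repo's own `train.py` covers the same case through its `--saved_model` (and `--FT`) options.

```python
import torch

# Hypothetical helpers: build_recognizer / my_characters stand in for however you
# normally construct the TPS-ResNet-BiLSTM-Attn model and your character list.
# Adjust num_classes if your label converter adds special tokens (e.g. for Attn).
model = build_recognizer(num_classes=len(my_characters))

# Load the released checkpoint; the weights were saved from a DataParallel model,
# so keys may carry a "module." prefix that needs stripping.
state = torch.load("TPS-ResNet-BiLSTM-Attn.pth", map_location="cpu")
state = {k.replace("module.", "", 1): v for k, v in state.items()}

# Keep only tensors whose shapes still match (this drops the old prediction layer
# if your character set differs), then continue training with a small learning rate.
own = model.state_dict()
own.update({k: v for k, v in state.items() if k in own and v.shape == own[k].shape})
model.load_state_dict(own)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```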
Hi,
I am using the CRAFT + this library pipeline to locate and do character-level recognition using a pretrained model you provided (TPS-ResNet-BiLSTM-Attn-case-sensitive.pth). However, I am stuck at a point where the recognizer does not seem to give better results, and I want to train my own model with my own data. My data would consist of a set of images of extremely low resolution (about 20x40 or so), as they are character (numbers/letters) images, not words. Luckily, all the characters I want to recognize look similar, i.e. similar fonts and sizes. These are examples of images I would like to recognize [example crops omitted], which should be read as "3", "U", "7".
The approach I think I am going to use is fine-tuning the pretrained model TPS-ResNet-BiLSTM-Attn-case-sensitive.pth:
-> In this case, I wonder how many images per kind of character I should use? I really need this info.
-> If I only train on some digits (for example, "1" and "7", as they are often confused), can recognition of the rest of my letters/digits get worse?
-> Would it be good to use my own data plus a public character recognition dataset such as http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/ ? I think that would be easier for me, as I don't have that much data (a rough sketch of what I mean is below).
-> Or maybe it is better to train from scratch with my own data + char74k?
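Regarding mixing my data with char74k, this is roughly what I have in mind (plain PyTorch, assuming both sets are stored one-folder-per-character; the paths are hypothetical), oversampling my small set so it is not drowned out by the public data:

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, WeightedRandomSampler
from torchvision import datasets, transforms

tf = transforms.Compose([transforms.Grayscale(),
                         transforms.Resize((32, 32)),
                         transforms.ToTensor()])

# Both folders must contain the same class subfolders so ImageFolder assigns
# the same label index to each character in both datasets.
mine = datasets.ImageFolder("data/my_chars", transform=tf)     # hypothetical path
public = datasets.ImageFolder("data/chars74k", transform=tf)   # hypothetical path

combined = ConcatDataset([mine, public])

# Weight samples so roughly half of each batch comes from my own (much smaller) set.
weights = [1.0 / len(mine)] * len(mine) + [1.0 / len(public)] * len(public)
sampler = WeightedRandomSampler(weights, num_samples=len(combined), replacement=True)
loader = DataLoader(combined, batch_size=64, sampler=sampler)
```

If I end up using this repo's training script instead, I believe its --select_data / --batch_ratio options do something similar once both sets are converted to its lmdb format.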
Hope someone can guide me a bit. By the way, congratulations on both repositories, they are pretty good!
Ana