ViTAE-Transformer / DeepSolo

The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting"

Chinese text recognition scene #3

Closed. milely closed this issue 1 year ago

milely commented 1 year ago

Hello, and thanks for the great work. I want to ask whether the DeepSolo framework is suitable for Chinese text recognition. Compared to English text in natural scenes, Chinese text has many more character categories and longer text lines. Moreover, the visual features of Chinese characters are more complex than those of Latin characters. Have you tried Chinese text scenes? If I want to use DeepSolo for them, do you have any suggestions?

ymy-k commented 1 year ago

Hi, it's a good question. Both Chinese spotting and multilingual spotting are in my plan, and I'm working on them.

milely commented 1 year ago

Thank you for your reply. I synthesized a batch of Chinese data yesterday and tried it out. The current DeepSolo framework achieves good results on this synthetic Chinese dataset. Looking forward to your multilingual version.

ymy-k commented 1 year ago

Good news. Thanks.

ymy-k commented 1 year ago

Hi, I recently evaluated DeepSolo on ReCTS. It achieves 78.3% 1-NED (voc_size: 5462, batch size 8, pretrained on SynthChinese130K+LSVT+ArT+ReCTS for 400K iterations, the same datasets as SwinTS, ABCNet-v2, ABINet++...), which is 1.8% better than the previous SOTA, ABINet++, with far fewer training iterations. The result has been made public on the official leaderboard.
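
For readers unfamiliar with the metric, here is a minimal sketch of how a 1-NED score (one minus the mean normalized edit distance) can be computed. It assumes the commonly used definition in which each Levenshtein distance is divided by the length of the longer of the two strings and then averaged over all samples. The function names `edit_distance` and `one_minus_ned` are made up for illustration; this is not the DeepSolo evaluation code or the official ReCTS script, which may differ in details such as text normalization.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def one_minus_ned(preds: Sequence[str], gts: Sequence[str]) -> float:
    """1 - mean(edit_distance / max(len(pred), len(gt))) over paired samples.

    Assumption: a pair of two empty strings contributes a distance of 0.
    """
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not gts:
        raise ValueError("at least one sample is required")
    total = 0.0
    for pred, gt in zip(preds, gts):
        longest = max(len(pred), len(gt))
        if longest > 0:
            total += edit_distance(pred, gt) / longest
    return 1.0 - total / len(gts)


if __name__ == "__main__":
    # Hypothetical predictions and labels, for illustration only.
    predictions = ["中文识剐", "abc"]
    ground_truths = ["中文识别", "abd"]
    print(f"1-NED: {one_minus_ned(predictions, ground_truths):.4f}")
```

Because the distance is computed per character, a prediction with a single wrong character in a long Chinese text line still receives most of the credit, unlike exact-match word accuracy.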

Gorgerbin commented 1 year ago

Can you release the model trained on the Chinese dataset, please?

ymy-k commented 1 year ago

Yes, it will be released in the next few weeks.