mlpc-ucsd / TESTR

(CVPR 2022) Text Spotting Transformers

How to change character set? #12

Open jeong-tae opened 1 year ago

jeong-tae commented 1 year ago

Hi, I'd like to train on different language datasets, such as Chinese, Korean, and Japanese, so I need to change the character set from the default setting.

Some Detectron-based models expose a character-set configuration, but I can't find one here. Can you guide me on how to change the character set?

jeong-tae commented 1 year ago

For Chinese characters, I changed VOC_SIZE to the length of chn_cls_list (the character list that circulates among many Chinese spotting repositories), and set the BATEXT VOC_SIZE and CUSTOM_DICT to that vocabulary size and the chn_cls list. It works.

I will try other languages too and let you know if they work. Hope this helps.
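
Roughly, the overrides look like the sketch below. The exact key names (especially whether TESTR reads VOC_SIZE and CUSTOM_DICT from MODEL.TRANSFORMER) may differ from what I remember, so verify them against adet/config/defaults.py in your checkout:

```yaml
# Sketch only -- key names and the 5462 value (the length of chn_cls_list)
# are assumptions; verify against adet/config/defaults.py before using.
MODEL:
  TRANSFORMER:
    VOC_SIZE: 5462              # default is 96 for the Latin character set
    CUSTOM_DICT: "chn_cls_list" # pickled list mapping class index -> character code
  BATEXT:
    VOC_SIZE: 5462              # keep in sync if your data pipeline reads BATEXT settings
    CUSTOM_DICT: "chn_cls_list"
```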

learningsteady0J0 commented 1 year ago

@jeong-tae

Thank you very much for sharing

Did you use the pre-trained polygonal model learned from the SynText data (English only)?

I'm trying to train on Korean. Currently the control-point loss is very high, around 40-50; is there any tip to lower it?

jeong-tae commented 1 year ago

I trained the model from scratch for the Chinese dataset. I think a model pre-trained on syntext_data (English only) may not train well on a Korean dataset, because the recognizers for English and Korean do not share the same label space. (For example, 'A', 'B', 'C' might be classes 1, 2, 3 in the English vocabulary but 10, 11, 12 in the Korean one.)

If training runs long enough, it may eventually fit Korean, but I am not sure.

milely commented 1 year ago

@jeong-tae Thanks for sharing your experience. Can the model achieve good results when switching to a Chinese dataset with a large number of categories? Could you share your results if possible? Thanks in advance.

jeong-tae commented 1 year ago

@milely On a Chinese character set of size ~5700 (not sure of the exact number), it works very well. I submitted the results to ICDAR ReCTS and it was ranked... maybe 7th? or 10th? Anyway, it works well.

milely commented 1 year ago

@jeong-tae Thank you very much for sharing, I will also try other languages.

ninoogo2 commented 1 year ago

@jeong-tae

You're a god. Thank you so much.

I'm also experimenting with Chinese, but I keep getting errors. Could you share the code you used?

I'll analyze the code on my own. May there be infinite glory in your future.

jeong-tae commented 1 year ago

@ninoogo2 Sorry, I can't share my code, but you can easily modify yours to make it work. Just set your character set as described above.

zx1239856 commented 1 year ago

Thanks for everyone's interest. @jeong-tae's approach is correct.

I'd like to add that you can refer to AdelaiDet (which contains the ABCNet and ABCNet v2 implementations) for training on non-Latin datasets, e.g., Chinese.

Link: https://github.com/aim-uofa/AdelaiDet/blob/master/configs/BAText/ReCTS/v2_chn_attn_R_50.yaml#L17-L18

A larger VOC_SIZE (5000+) is used with a custom dictionary for inference and evaluation.
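
Concretely, the two linked lines boil down to roughly the following (quoted from memory, so check the link for the exact values; 5462 is the length of the chn_cls_list dictionary shipped with ABCNet):

```yaml
# From AdelaiDet's ReCTS/v2_chn_attn_R_50.yaml (approximate -- see the link above)
MODEL:
  BATEXT:
    VOC_SIZE: 5462              # number of character classes in the custom dictionary
    CUSTOM_DICT: "chn_cls_list" # pickled list of character codes used at inference/eval
```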

Pretraining can be leveraged to enhance performance. You can use a mix of the ChnSyn, ReCTS, and LSVT datasets as in ABCNet (https://github.com/aim-uofa/AdelaiDet/blob/master/configs/BAText/Pretrain/Base-Chn-Pretrain.yaml) and then finetune on ReCTS; see the sketch below. Since the annotations provided by ABCNet are Bezier curves, they are compatible with the Bezier variant of our model if you don't want to convert annotations.
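
As a rough illustration of that recipe (the dataset names below are hypothetical placeholders; the real registered names are in the linked Base-Chn-Pretrain.yaml):

```yaml
# Pretraining sketch -- dataset names are placeholders, not the actual registered names.
DATASETS:
  TRAIN: ("chn_syntext_train", "rects_train", "lsvt_train")  # mixed Chinese pretraining data
  TEST: ("rects_val",)
# For finetuning: point DATASETS.TRAIN at ReCTS only and load the pretrained
# checkpoint through MODEL.WEIGHTS.
```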

jeong-tae commented 1 year ago

@learningsteady0J0

I am training with a Korean dataset and I can't reproduce the issue you mentioned. All the training losses are higher than for other languages, which I think is because I trained from scratch without pretraining. The control-point loss is a little high, but not 40-50...

That said, the loss is large enough that I have so far failed to train well on the Korean set.

learningsteady0J0 commented 1 year ago

@jeong-tae I really appreciate your attention! I think I improved the performance a little by continuing to experiment. However, there is still a problem: the precision in detection is relatively low. Is there a strategic way to raise it?

[screenshot: evaluation results showing relatively low detection precision]

jeong-tae commented 1 year ago

@learningsteady0J0 Are you trying to reproduce the ICDAR15 result? If you follow the experimental setup faithfully, it should give a good result.

If you trained on a Korean set and then evaluated on ICDAR15... I don't know. The two may have different distributions, so you can't tune precisely.

I found an error in my Korean set and fixed it. It seems it will work well now.

Zalways commented 1 year ago

> For Chinese characters, I changed VOC_SIZE to the length of chn_cls_list (the character list that circulates among many Chinese spotting repositories), and set the BATEXT VOC_SIZE and CUSTOM_DICT to that vocabulary size and the chn_cls list. It works.
>
> I will try other languages too and let you know if they work. Hope this helps.

You mentioned setting the BATEXT VOC_SIZE and CUSTOM_DICT to the voca_size and chn_cls, but I can't find these parameters in the default config files. Could you help me with a Chinese config file? [screenshot: config snippet] I just added these settings to the config file; is there anything else I need to add? Looking forward to your reply! Thanks!

jeong-tae commented 1 year ago

@Zalways It's been a while since I did this. Hmm... I think that's all you need. If you set the dataset paths correctly for training, it will work.
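
Putting the whole thread together, a minimal sketch of a Chinese config (key names are assumptions, so check adet/config/defaults.py; replace the placeholder base-config and dataset names with whatever you actually use and register):

```yaml
_BASE_: "Base-TESTR.yaml"           # placeholder -- use whichever base config you train from
MODEL:
  TRANSFORMER:
    VOC_SIZE: 5462                  # len(chn_cls_list); assumed key name
    CUSTOM_DICT: "chn_cls_list"     # pickled index -> character-code list
DATASETS:
  TRAIN: ("my_chinese_train",)      # placeholder -- register your dataset in adet/data first
  TEST: ("my_chinese_val",)
```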