Gorgerbin closed this issue 1 year ago
@ymy-k Thanks a lot. Now I know I can refer to the ABCNet model using "chn_cls_list.txt", but I'm not sure whether I can use the pretrained ViTAEv2-S model. It doesn't seem like a good fit because the voc_size doesn't match.
It's pretrained on English data, so it's not a good choice here. The voc_size doesn't match, and the linear layer for character classification is not usable.
@ymy-k So kind of you. BTW, when will the Chinese model be available?
Maybe this week, I will update the Chinese model first.
Thank you, I hope it will be released soon.
Hi, the code and models for ReCTS have been updated.
Thank you so much!!
Hi. The major difference lies in the character classes. (1) Prepare a character list for the new language; this gives you the total number of character classes. In the config file, set MODEL.TRANSFORMER.VOC_SIZE to that number. (2) Prepare your data. In the json file, 'rec' is the list of character indices converted from the text transcript. (3) Remember to change the character list in the evaluation code and the visualization code as well, so that both decode with the new classes.
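Steps (1) and (2) can be sketched roughly as below. This is only an illustration, not the repo's own conversion script: the helper names `encode_transcript` and `count_classes` are hypothetical, and mapping out-of-list characters to one extra "unknown" index placed right after the listed characters is an assumption. Any padding of 'rec' to a fixed length is not shown; check the provided json files for the convention actually used.

```python
def encode_transcript(text, chars, include_unknown=True):
    """Convert a text transcript into a list of character indices.

    chars: the character list for the new language, one entry per class.
    Assumption: characters missing from `chars` map to a single extra
    "unknown" index equal to len(chars); with include_unknown=False
    they are simply dropped.
    """
    index_of = {ch: i for i, ch in enumerate(chars)}
    unknown_index = len(chars)
    rec = []
    for ch in text:
        if ch in index_of:
            rec.append(index_of[ch])
        elif include_unknown:
            rec.append(unknown_index)
    return rec


def count_classes(chars, include_unknown=True):
    """Number of character classes, i.e. the value to put in VOC_SIZE."""
    return len(chars) + (1 if include_unknown else 0)


# Toy character list; a real one would be read from your own list file.
chars = ["a", "b", "c"]
rec = encode_transcript("cab?", chars)
num_classes = count_classes(chars)
print(rec, num_classes)
```

Whether you count the "unknown" class or not, keep the same choice in the config, the data and the decoding code.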
Note: In the CTC decoding part of the evaluation and visualization code, the condition "if c < self.voc_size - 1" is used (such as here and here). This is because voc_size additionally counts an "unknown" class, which does not appear in the character list and can be ignored during inference. If your classes do not include the "unknown" class, for example a new English set with 36 classes instead of 37, then "if c < self.voc_size" is the correct condition. Remember to check this for each new dataset.
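To make the note concrete, here is a minimal greedy CTC decoding sketch built around that condition. It is not the repo's evaluation code: the function name `ctc_greedy_decode` and the `has_unknown` flag are hypothetical, and treating every index at or above the threshold as "skip and reset the repeat check" is an assumption.

```python
def ctc_greedy_decode(indices, chars, voc_size, has_unknown=True):
    """Collapse repeated indices and skip non-character classes.

    has_unknown=True  -> voc_size counts an extra "unknown" class,
                         so valid characters satisfy c < voc_size - 1.
    has_unknown=False -> every class is a real character,
                         so valid characters satisfy c < voc_size.
    """
    threshold = voc_size - 1 if has_unknown else voc_size
    out = []
    last = None
    for c in indices:
        c = int(c)
        if c < threshold:
            # Emit a character only when it differs from the previous one.
            if c != last:
                out.append(chars[c])
                last = c
        else:
            # Unknown/blank class: emit nothing and allow a repeat afterwards.
            last = None
    return "".join(out)


chars = ["a", "b", "c"]
pred = [0, 0, 4, 0, 1, 3, 1, 2]

# 3 listed characters + 1 "unknown" class -> voc_size = 4
with_unknown = ctc_greedy_decode(pred, chars, voc_size=4, has_unknown=True)
# 3 listed characters, no "unknown" class -> voc_size = 3
without_unknown = ctc_greedy_decode(pred, chars, voc_size=3, has_unknown=False)
print(with_unknown, without_unknown)
```

In both calls the threshold equals the length of the character list, which is the thing to verify for a new dataset: if the threshold is larger than the list, `chars[c]` can be indexed out of range.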