Open zw76859420 opened 8 months ago
FunASR's current finetuning pipeline does not support directly modifying the subword vocabulary to add out-of-vocabulary (OOV) tokens for finetuning. If you need this, the following modifications are required:
1) Modify tokens.txt. It is recommended to extend the existing vocabulary with the new tokens.
2) Modify model.pb. After adding modeling units, the output layer of the model must be expanded to match, and the connections for the newly added modeling units are initialized randomly.
3) Finetune the model. Because OOV tokens have been added, the training data must cover them sufficiently for the model to recognize them after training.
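Steps 1 and 2 can be sketched in a toy form. This is only an illustration and not FunASR's own tooling: it assumes the vocabulary is a plain list with one token per entry, that new tokens can simply be appended at the end (check whether your model reserves particular positions for special tokens), and that the output layer is a weight matrix with one row per token. Plain Python lists stand in for the tensors stored in the real checkpoint, and the helper names `extend_vocab` and `expand_output_layer` are hypothetical.

```python
import random


def extend_vocab(tokens, new_tokens):
    """Append tokens that are not already present.

    Existing tokens keep their indices, so the trained rows of the
    output layer still line up with the vocabulary.
    Returns the extended list and the number of tokens added.
    """
    seen = set(tokens)
    extended = list(tokens)
    for tok in new_tokens:
        if tok not in seen:
            extended.append(tok)
            seen.add(tok)
    return extended, len(extended) - len(tokens)


def expand_output_layer(weight, bias, num_new, scale=0.01, seed=0):
    """Add `num_new` randomly initialized rows to an output layer.

    weight: list of rows, shape [vocab_size][hidden_dim]
    bias:   list of length vocab_size
    The trained rows are copied unchanged; only the new rows are random.
    The Gaussian init and its scale are assumptions for this sketch.
    """
    rng = random.Random(seed)
    hidden_dim = len(weight[0])
    new_weight = [row[:] for row in weight]
    new_bias = list(bias)
    for _ in range(num_new):
        new_weight.append([rng.gauss(0.0, scale) for _ in range(hidden_dim)])
        new_bias.append(0.0)
    return new_weight, new_bias


# Toy vocabulary and a toy 5 x 4 output layer (illustrative values only).
tokens = ["<blank>", "<s>", "</s>", "明", "天"]
weight = [[0.1 * i] * 4 for i in range(len(tokens))]
bias = [0.0] * len(tokens)

# "天" is already in the vocabulary, so only "眀" is added.
tokens2, n_added = extend_vocab(tokens, ["眀", "天"])
weight2, bias2 = expand_output_layer(weight, bias, n_added)
```

Because the new rows start from random values, they carry no information until step 3: the finetuning data has to contain the new tokens often enough for those rows to be trained.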
Hope it will be helpful!
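Many thanks to Dr. Liang for the prompt answer. Understood! Looking forward to FunASR getting better and better and building a complete ASR ecosystem.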
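A question for the experts: we have several OOV words that we added to the dictionary for training. The steps we followed are: 1) Add to the dictionary (tokens.txt): 眀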