wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit
https://wenet-e2e.github.io/wenet/
Apache License 2.0

How to add new words during fine-tuning? #2467

Open srdfjy opened 2 months ago

srdfjy commented 2 months ago

Hi, a pre-trained model's unit.txt contains 1000 words. When fine-tuning from this pre-trained model, my data has 10 new words that are not in unit.txt. Is it feasible to append these 10 new words to the end of unit.txt and assign them new numbers?
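To make the question concrete, the appending step could be sketched as below. This assumes unit.txt holds one `token id` pair per line, separated by whitespace; the helper name `append_units` is hypothetical and not part of WeNet.

```python
import tempfile
from pathlib import Path


def append_units(unit_path, new_words):
    """Append unseen words to a `token id` file; existing ids stay unchanged."""
    path = Path(unit_path)
    table = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        token, idx = line.rsplit(maxsplit=1)
        table[token] = int(idx)

    # New ids start right after the largest existing id.
    next_id = max(table.values(), default=-1) + 1
    for word in new_words:
        if word not in table:  # ignore words that are already listed
            table[word] = next_id
            next_id += 1

    # dicts keep insertion order, so old entries come first, new ones last.
    path.write_text(
        "".join(f"{token} {idx}\n" for token, idx in table.items()),
        encoding="utf-8",
    )
    return table


# Demo on a throwaway file (assumed format, made-up tokens).
with tempfile.TemporaryDirectory() as tmp:
    unit_file = Path(tmp) / "unit.txt"
    unit_file.write_text("<blank> 0\n<unk> 1\nhello 2\n", encoding="utf-8")
    table = append_units(unit_file, ["world", "hello", "wenet"])
    content = unit_file.read_text(encoding="utf-8")
```

In the demo, `hello` keeps its id and the two unseen words get the next free ids, so the pre-trained output rows would still line up with the old entries.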

fclearner commented 2 months ago

Freeze all modules except the output layers (the CTC output and the attention decoder output), add the new words to your unit.txt, change the output size to match the new vocabulary, and then fine-tune the model.
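A minimal PyTorch sketch of these steps, not WeNet's actual code: `ToyASR`, `ctc_out`, `decoder_out`, `expand_output_layer` and `prepare_for_finetune` are hypothetical names, and the real parameter names have to be looked up in your checkpoint. It widens each output layer from 1000 to 1010 units, copies the pre-trained rows into the first 1000 positions, and leaves only the output layers trainable.

```python
import torch
from torch import nn


def expand_output_layer(old: nn.Linear, new_size: int) -> nn.Linear:
    """Return a wider Linear whose first rows are copied from `old`."""
    new = nn.Linear(old.in_features, new_size, bias=old.bias is not None)
    with torch.no_grad():
        new.weight[: old.out_features].copy_(old.weight)
        if old.bias is not None:
            new.bias[: old.out_features].copy_(old.bias)
    return new  # rows for the new words keep their random initialisation


class ToyASR(nn.Module):
    """Hypothetical stand-in for a pre-trained model, not a WeNet class."""

    def __init__(self, dim: int = 8, vocab: int = 1000):
        super().__init__()
        self.encoder = nn.Linear(dim, dim)
        self.ctc_out = nn.Linear(dim, vocab)      # CTC output layer
        self.decoder_out = nn.Linear(dim, vocab)  # attention decoder output


def prepare_for_finetune(model: ToyASR, new_vocab: int) -> ToyASR:
    # 1) Change the output size, keeping the pre-trained weights.
    model.ctc_out = expand_output_layer(model.ctc_out, new_vocab)
    model.decoder_out = expand_output_layer(model.decoder_out, new_vocab)
    # 2) Freeze everything except the two output layers.
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(("ctc_out.", "decoder_out."))
    return model


model = prepare_for_finetune(ToyASR(), new_vocab=1010)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

After this, only the parameters listed in `trainable` should be passed to the optimizer. A real attention decoder may also contain a token embedding sized by the vocabulary, which would need the same kind of resizing.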

srdfjy commented 1 month ago

Thanks @fclearner, I will try out what you suggested later.