nabil6391 opened 1 year ago
Hi, we didn't try cross-language finetuning before, but we can work this out together.
First of all, have you managed to create your own dataset using Lhotse?
Not yet. I am trying to follow the yesno and librispeech recipes. As far as I understand, I have to create the manifest files first; then I can run compute_fbank, prepare_lang, and compile_hlg.
https://github.com/k2-fsa/icefall/blob/master/egs/yesno/ASR/prepare.sh https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/prepare.sh
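For the "create the manifest files" step, here is a minimal sketch of what the two Lhotse manifests look like on disk. It uses only the standard library, and the file names, utterance IDs, and durations are hypothetical; in practice you would build `Recording` and `SupervisionSegment` objects with Lhotse and call `.to_file(...)`, but the resulting `jsonl.gz` files have the same shape as below.

```python
import gzip
import json

SAMPLING_RATE = 16000  # assumed; use your corpus's actual rate

# Hypothetical utterance list: (utterance id, wav path, duration in seconds, transcript).
utterances = [
    ("utt-0001", "data/wavs/utt-0001.wav", 3.2, "transcript one"),
    ("utt-0002", "data/wavs/utt-0002.wav", 2.7, "transcript two"),
]

def write_jsonl_gz(path, rows):
    # Lhotse stores manifests as gzip-compressed JSON Lines (one object per line).
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")

# Recording manifest: where the audio lives and its basic properties.
recordings = [
    {
        "id": utt_id,
        "sources": [{"type": "file", "channels": [0], "source": wav}],
        "sampling_rate": SAMPLING_RATE,
        "num_samples": int(dur * SAMPLING_RATE),
        "duration": dur,
    }
    for utt_id, wav, dur, _ in utterances
]

# Supervision manifest: which span of which recording has which transcript.
supervisions = [
    {
        "id": utt_id,
        "recording_id": utt_id,
        "start": 0.0,
        "duration": dur,
        "channel": 0,
        "text": text,
        "language": "Bengali",
    }
    for utt_id, _, dur, text in utterances
]

write_jsonl_gz("recordings.jsonl.gz", recordings)
write_jsonl_gz("supervisions.jsonl.gz", supervisions)
```

From these two manifests, the later stages (compute_fbank and friends) build `CutSet`s that pair audio features with supervisions.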
> As far as I understand, I have to create the manifest files first
Yes. This should be the first step. BTW, how many hours of transcribed speech do you have? Are they open-sourced?
Mine is a very small dataset. For starters, I am considering the Common Voice Bengali dataset, which is larger than mine. I believe there are other, bigger datasets as well, but I have to check.
OK. Once you have finished preparing the dataset, you can start training from scratch (as a baseline to finetune from). If you encounter any problems, feel free to ask here.
Amazing work indeed. I have already trained one language (Transducer7_streaming + BPE) and would also like to finetune to a lower-resource language with a different alphabet but the same number of tokens.
What steps should I take?
@joazoa I assume you are using a different output vocabulary? In that case you can only initialize the encoder; the decoder and joiner need to be trained from a random initialization.
@marcoyang1998 How do I initialize the encoder from the checkpoint, and the decoder and joiner from a random initialization?
I think what @marcoyang1998 meant was that if you have a different output vocabulary, you would need to randomly initialize the decoder and joiner, but the encoder can be initialized from your final checkpoint. You can do something like `model.encoder.load_state_dict(ckpt["model"], strict=False)`.
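One caveat with that snippet, at least for checkpoints that store the whole model's `state_dict`: the keys in `ckpt["model"]` usually carry an `encoder.` prefix, so calling `model.encoder.load_state_dict(..., strict=False)` can silently match nothing. A small helper (the name is mine, not from icefall) that selects and re-keys the encoder entries:

```python
def submodule_state_dict(full_state, prefix="encoder."):
    """From a full-model state_dict, keep only the entries under `prefix`
    and strip the prefix, so the sub-module's own load_state_dict sees
    key names it recognises."""
    return {
        key[len(prefix):]: value
        for key, value in full_state.items()
        if key.startswith(prefix)
    }
```

Usage would then be `model.encoder.load_state_dict(submodule_state_dict(ckpt["model"]), strict=False)`; it is worth inspecting the returned missing/unexpected key lists to confirm the weights were actually loaded.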
Thanks for replying @desh2608 , I am still quite new to k2-icefall.
Should I modify and put this part in the train.py file?
Yes. The train.py script is plain PyTorch, so you can modify it to your requirements as you would any other PyTorch training code.
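To make the "modify train.py" step concrete, here is a toy sketch in plain PyTorch. The `Model` class and its layer shapes are illustrative stand-ins, not icefall's actual encoder/decoder/joiner; the point is the pattern of initializing only the encoder from a pretrained checkpoint while leaving the decoder and joiner at their random initialization.

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    """Toy transducer-shaped model; the real modules live in the recipe."""
    def __init__(self, vocab_size):
        super().__init__()
        self.encoder = nn.Linear(80, 256)          # stand-in acoustic encoder
        self.decoder = nn.Embedding(vocab_size, 256)  # stand-in label decoder
        self.joiner = nn.Linear(256, vocab_size)   # stand-in joiner

# "Pretrained" checkpoint saved from a full model: keys carry module prefixes
# such as "encoder.weight", "decoder.weight", "joiner.weight".
src = Model(vocab_size=500)
ckpt = {"model": src.state_dict()}

# New-language model with the same vocab size; decoder/joiner stay random.
model = Model(vocab_size=500)

# Keep only the encoder entries and strip the "encoder." prefix.
encoder_sd = {
    k[len("encoder."):]: v
    for k, v in ckpt["model"].items()
    if k.startswith("encoder.")
}
missing, unexpected = model.encoder.load_state_dict(encoder_sd, strict=False)
```

In the recipe's train.py, this loading would go right after the model is constructed and before the optimizer is created, so that training then proceeds as usual on the new-language data.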
Hi guys, amazing work with the icefall recipes. I am quite new to using them and am having a hard time creating a custom dataset with Lhotse for my language (Bengali).
I have seen @marcoyang1998 adding some finetuning scripts for LibriSpeech to GigaSpeech and WenetSpeech to Aishell, but I am a bit confused about how to finetune for another language.
Please help me if possible. Thanks, and great work!