mailong25 / self-supervised-speech-recognition

speech to text with self-supervised learning based on wav2vec 2.0 framework

Is it OK to apply SpecAugment and add noise to the audio before training? #41

Closed vigneshgig closed 3 years ago

vigneshgig commented 3 years ago

Hi @mailong25, thanks for your great work, especially for combining the language model into this ASR system. I have one question: will adding noise to the audio affect performance? I searched for this kind of augmentation in your GitHub repository and in the fairseq wav2vec repository, but I was unable to find any reference to noise augmentation or any other augmentation. Why is that?

I have read the wav2vec paper. Put simply, it works like clustering with a contrastive loss. So I feel that even if we add noise to the original audio, it will still cluster the sampled part of the audio with its nearest neighboring embedding.

Thanks
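For reference, noise augmentation of the kind asked about here is usually applied to the raw waveform before it reaches the model. The following is a minimal sketch, not something taken from this repository or from fairseq: the function name `add_noise_at_snr` and its parameters are hypothetical, and it assumes white Gaussian noise mixed in at a chosen signal-to-noise ratio (real setups often mix in recorded background noise instead).

```python
import numpy as np


def add_noise_at_snr(wave, snr_db, rng=None):
    """Return `wave` plus white Gaussian noise at roughly `snr_db` dB SNR.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    wave = np.asarray(wave, dtype=np.float64)

    # Average power of the clean signal.
    signal_power = np.mean(wave ** 2)

    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))

    # Zero-mean Gaussian noise whose variance equals the target noise power.
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise
```

Drawing a fresh `snr_db` for each training example (for instance, uniformly from some range) would give a different noisy copy of the same utterance every epoch.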

mailong25 commented 3 years ago

According to the paper, the authors did apply SpecAugment during fine-tuning.

vigneshgig commented 3 years ago

OK, thanks for the reply. I will check and try to add SpecAugment to wav2vec as a live augmentation. If you already know how to integrate SpecAugment into wav2vec, please let me know. @mailong25

vigneshgig commented 3 years ago

Also, is it OK to apply SpecAugment as a live augmentation, given that the training data keeps changing before the embeddings are mapped and clustered properly? I think it is fine for fine-tuning because there is a transcript to map to. If I am wrong, please let me know.

mailong25 commented 3 years ago

I think it is unnecessary to apply SpecAugment during pre-training because the model is trained on a large amount of unlabeled data, so it cannot easily overfit the data.

vigneshgig commented 3 years ago

Exactly, but I am talking about the fine-tuning process. If you have any ideas about adding SpecAugment as a live augmentation when fine-tuning wav2vec on a labeled dataset, please let me know. Any tips or hints would be helpful. Thanks
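As a rough illustration of what "live" SpecAugment-style augmentation means, the sketch below draws new random time and frequency masks every time it is called, so each epoch sees a differently masked copy of the same labeled example. It is an assumption-laden sketch in NumPy, not this repository's or fairseq's code: `spec_augment` and its parameters are hypothetical names, the input is assumed to be a 2-D `(frames, feature_bins)` array, masked regions are simply zeroed, and the time-warping step of the original SpecAugment is omitted.

```python
import numpy as np


def spec_augment(features, num_time_masks=2, max_time_width=10,
                 num_freq_masks=2, max_freq_width=8, rng=None):
    """Return a copy of `features` with random time and frequency bands zeroed.

    Hypothetical helper for illustration; `features` has shape (frames, bins).
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(features, dtype=np.float64, copy=True)
    n_frames, n_bins = out.shape

    # Time masking: zero a few randomly placed runs of consecutive frames.
    for _ in range(num_time_masks):
        width = int(rng.integers(0, min(max_time_width, n_frames) + 1))
        start = int(rng.integers(0, n_frames - width + 1))
        out[start:start + width, :] = 0.0

    # Frequency masking: zero a few randomly placed runs of feature bins.
    for _ in range(num_freq_masks):
        width = int(rng.integers(0, min(max_freq_width, n_bins) + 1))
        start = int(rng.integers(0, n_bins - width + 1))
        out[:, start:start + width] = 0.0

    return out
```

Calling such a function inside the data loader, rather than once when preparing the dataset, is what would make the augmentation live: the transcript stays fixed while the masked input changes from one pass to the next.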