So, what steps are you facing problems with?
If we use an off-the-shelf Wav2Vec 2.0 pre-trained model, it is easy to fine-tune it on a downstream task. But what I want is to fine-tune the pretext task, without a classification head, on another dataset. I mean I want to fine-tune the base model here.
From the repo, all I can see are instructions to "Train a wav2vec 2.0 base model", i.e. to train from scratch (see the command below).
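For reference, the from-scratch pretraining command in examples/wav2vec/README.md looks roughly like this (all paths are placeholders):

```bash
# Pretrain a wav2vec 2.0 base model from scratch (as in the fairseq wav2vec README)
fairseq-hydra-train \
    task.data=/path/to/manifest \
    --config-dir /path/to/fairseq/examples/wav2vec/config/pretraining \
    --config-name wav2vec2_base_librispeech
```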
I couldn't find any steps on the GitHub repo explaining how to do this. Any directions would be appreciated.
You can use this for reference: https://github.com/mailong25/self-supervised-speech-recognition. It is a nice wrapper around wav2vec 2.0. I think fairseq lacks the kind of data-preparation examples that ESPnet provides.
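That said, fairseq does ship a small manifest script for wav2vec under examples/wav2vec/; something along these lines should produce the train.tsv/valid.tsv files that the pretraining task expects (the audio path, output path, extension, and validation split below are placeholders):

```bash
# Build the TSV manifests (train.tsv / valid.tsv) used by fairseq's wav2vec pretraining task
python examples/wav2vec/wav2vec_manifest.py /path/to/audio \
    --dest /path/to/manifest \
    --ext wav \
    --valid-percent 0.01
```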
My question is about fine-tuning the pretext task of the Wav2Vec 2.0 base model (i.e. continuing the self-supervised pretraining, not training from scratch). Any help would be appreciated.
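One possible way to do this in fairseq, sketched under the assumption that the standard hydra checkpoint options behave as documented: launch the same base pretraining config, but point checkpoint.restore_file at the released (non-fine-tuned) wav2vec 2.0 base checkpoint and reset the optimizer, LR scheduler, dataloader, and meters so training continues on the new manifests. This is not an officially documented recipe; checkpoint names and paths are placeholders.

```bash
# Continue the self-supervised (pretext) training from an existing base checkpoint
# instead of training from scratch; all paths below are placeholders.
fairseq-hydra-train \
    task.data=/path/to/new_manifest \
    checkpoint.restore_file=/path/to/wav2vec2_base_no_finetune.pt \
    checkpoint.reset_optimizer=true \
    checkpoint.reset_lr_scheduler=true \
    checkpoint.reset_dataloader=true \
    checkpoint.reset_meters=true \
    --config-dir /path/to/fairseq/examples/wav2vec/config/pretraining \
    --config-name wav2vec2_base_librispeech
```

fairseq also has a checkpoint.finetune_from_model option that loads only the model weights and resets the training state in one step, which may be a cleaner fit for this kind of continued pretraining.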