jackNhat closed 6 months ago
I haven't personally, but I'd be interested to know if anyone has tried this. Note that the Whisper tokenizer does not contain phoneme tokens, so a new tokenizer would need to be trained and the model's vocabulary size adjusted to match (cf. Wav2Vec2PhonemeCTCTokenizer and https://discuss.huggingface.co/t/adding-custom-vocabularies-on-whisper/29311/2?u=sanchit-gandhi).
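To make the tokenizer point a bit more concrete, here is a minimal sketch of the vocabulary side only, not a tested fine-tuning recipe. It assumes ARPAbet-style, space-separated phoneme transcriptions; the toy transcripts, the special-token names, and the helpers `build_phoneme_vocab` and `encode` are all hypothetical and only there for illustration.

```python
import json

# Hypothetical toy corpus: ARPAbet-style phoneme transcriptions, space-separated.
TRANSCRIPTS = [
    "HH AH L OW",  # "hello"
    "W ER L D",    # "world"
]

# Assumed special tokens; a real Whisper setup has its own set of special tokens.
SPECIAL_TOKENS = ["<pad>", "<s>", "</s>", "<unk>"]


def build_phoneme_vocab(transcripts, special_tokens=SPECIAL_TOKENS):
    """Map special tokens, then every distinct phoneme (sorted), to integer ids."""
    phonemes = sorted({p for line in transcripts for p in line.split()})
    tokens = list(special_tokens) + phonemes
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode(transcript, vocab):
    """Convert a phoneme string to ids, falling back to <unk> for unseen phonemes."""
    unk_id = vocab["<unk>"]
    return [vocab.get(p, unk_id) for p in transcript.split()]


vocab = build_phoneme_vocab(TRANSCRIPTS)

# A token-to-id JSON mapping like this is the kind of vocab file that
# phoneme tokenizers are usually built from.
vocab_json = json.dumps(vocab, indent=2)

# This is the size the model's token embeddings would have to be adjusted to.
new_vocab_size = len(vocab)

print(new_vocab_size)
print(encode("HH AH L OW", vocab))
```

With a vocabulary like this in place, the remaining step on the model side would be something along the lines of `model.resize_token_embeddings(new_vocab_size)` in transformers. The resized embeddings no longer correspond to the pretrained text tokens, so they would have to be learned during fine-tuning.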
Has anyone experimented with fine-tuning Whisper for the phoneme recognition task (English)? If so, please share some of your experiments. Many thanks!