Thanks for the detailed blog post on "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers". After going through the article and creating a fine-tuned model for my own application, I have the following questions; I hope someone can help:
When using the 🤗 `Trainer` with `Seq2SeqTrainingArguments`, which layer(s) are trained?
- only the linear output layer
- the last two layers + the last transformer block
- all layers
Is it possible to specify which layers to train and which to freeze? Some code samples would be appreciated.
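To make the second question concrete, here is a minimal sketch of the kind of control I'm after. This is my own guess at an approach (it is not from the blog post): load the `openai/whisper-small` checkpoint used in the article and freeze the whole encoder by toggling `requires_grad`.

```python
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Freeze every encoder parameter so that only the decoder (and the output
# projection, which is tied to the decoder input embeddings) gets updated.
# Assumption on my part -- not something shown in the blog post.
for param in model.model.encoder.parameters():
    param.requires_grad = False

# Sanity check: how many parameters would actually be trained?
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} / {total:,}")
```

Would something like this interact correctly with `Seq2SeqTrainer`, or does the `Trainer` re-enable gradients somewhere during training?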