
Whisper fine tuning - which layers are trained? #2142

Open · chungvle opened this issue 3 months ago

chungvle commented 3 months ago

Thanks for the detailed blog post "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers". After going through the article and creating a fine-tuned model for my own application, I have the following questions; I hope someone can help:

  1. When using the 🤗 Trainer with Seq2SeqTrainingArguments, which layer(s) are trained?
    • only the linear output layer
    • last two layers + last transformer block
    • all layers
  2. Is it possible to specify which layers to train and which to freeze? Some code samples would be appreciated; I have sketched my best guess below this list.
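
To make question 2 concrete, here is a minimal sketch of what I imagine layer freezing would look like. I am assuming the standard PyTorch `requires_grad` mechanism, and the module paths (`model.model.encoder`, `model.model.decoder.layers`, `model.proj_out`) are my reading of `modeling_whisper.py`, so please correct me if the intended approach is different:

```python
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Option A: freeze the whole encoder. Parameters with
# requires_grad=False are skipped by the optimizer the Trainer builds.
for param in model.model.encoder.parameters():
    param.requires_grad = False

# Option B: freeze everything, then unfreeze only the last decoder
# block and the output projection.
for param in model.parameters():
    param.requires_grad = False
for param in model.model.decoder.layers[-1].parameters():
    param.requires_grad = True
for param in model.proj_out.parameters():
    # Note: proj_out is weight-tied to the decoder input embeddings,
    # so unfreezing it also unfreezes the embedding table.
    param.requires_grad = True

# Sanity check before handing the model to Seq2SeqTrainer.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable:,} / {total:,}")
```

For what it's worth on question 1: right after `from_pretrained`, every parameter I inspect has `requires_grad=True`, which suggests the Trainer updates all layers by default unless something is frozen explicitly, but I would appreciate confirmation.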