YuanGongND / whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
BSD 2-Clause "Simplified" License
318 stars 25 forks

Question about fine-tuning Whisper #30

Open LithiumZhou opened 3 months ago

LithiumZhou commented 3 months ago

Hi Yuan,

I'm sorry to bother you again. I would really like to know how to fine-tune Whisper on AudioSet and ESC-50.

YuanGongND commented 3 months ago

Our code freezes the backbone and trains a few layers on top of its intermediate representations.
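To make the frozen-backbone setup concrete, here is a minimal PyTorch sketch. It is not the code from this repository: `ToyBackbone` and `TaggingHead` are hypothetical stand-ins, the toy encoder is a stack of linear layers rather than Whisper, and the head (a learned weighted average over layers, mean pooling over time, one linear classifier) is just one simple choice assumed for illustration. The 50 output classes match ESC-50; a multi-label dataset such as AudioSet would need a different loss.

```python
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a pretrained encoder; returns every layer's output."""

    def __init__(self, dim=32, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_layers)])

    def forward(self, x):  # x: (batch, time, dim)
        reps = []
        for layer in self.layers:
            x = torch.relu(layer(x))
            reps.append(x)
        return torch.stack(reps, dim=1)  # (batch, n_layers, time, dim)


class TaggingHead(nn.Module):
    """Small trainable head on top of the intermediate representations (assumed design)."""

    def __init__(self, dim=32, n_layers=4, n_classes=50):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))  # one weight per layer
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, reps):  # reps: (batch, n_layers, time, dim)
        w = torch.softmax(self.layer_logits, dim=0).view(1, -1, 1, 1)
        pooled = (w * reps).sum(dim=1).mean(dim=1)  # weighted over layers, mean over time
        return self.classifier(pooled)  # (batch, n_classes)


torch.manual_seed(0)
backbone = ToyBackbone()
head = TaggingHead()

# Freeze the backbone: its parameters receive no gradients and it stays in eval mode.
backbone.requires_grad_(False)
backbone.eval()

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(8, 20, 32)        # random stand-in for a batch of audio features
y = torch.randint(0, 50, (8,))    # random single-label targets, 50 classes

before = [p.clone() for p in backbone.parameters()]  # snapshot for a sanity check

with torch.no_grad():             # no autograd graph is needed through the frozen part
    reps = backbone(x)
logits = head(reps)
loss = nn.functional.cross_entropy(logits, y)

optimizer.zero_grad()
loss.backward()
optimizer.step()

# Sanity check: the backbone weights are identical after the update step.
backbone_unchanged = all(torch.equal(a, b) for a, b in zip(before, backbone.parameters()))
```

Because the backbone weights never change in this setup, whatever the original model could do before training is left intact; only the small head is task-specific.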

In the paper, we did try fine-tuning the entire model, which leads to slightly better results. The tradeoff is that the fine-tuned Whisper loses its ASR ability.

-Yuan