YuanGongND / whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
BSD 2-Clause "Simplified" License

Using custom trained whisper-at model #28

Open: himacks opened this issue 5 months ago

himacks commented 5 months ago

Hi Yuan,

I custom-trained my own model using your training recipe. I followed the instructions and successfully ran run_as_full_train.sh with my custom dataset. However, I'm having trouble loading the model into the whisper-at library and testing it on audio.

Thanks

YuanGongND commented 5 months ago

> I custom-trained my own model using your training recipe. I followed the instructions and successfully ran run_as_full_train.sh with my custom dataset.

I assume your label set is different from AudioSet's (i.e., the 527 classes); is that correct? If so, your last linear layer would be new.
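
Not part of the recipe, but a quick sanity check for this: a minimal sketch in plain PyTorch (the checkpoint path is a placeholder) that prints the 2-D weight shapes in your saved checkpoint, so you can confirm the final linear layer's output dimension matches the size of your label set.

```python
# Minimal sanity check (not part of the recipe): confirm the classifier
# head in a saved checkpoint matches your label count.
# "path/to/your_checkpoint.pth" is a placeholder.
import torch

ckpt = torch.load("path/to/your_checkpoint.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # handle a raw state dict or a wrapped one

# Print every 2-D weight; the last linear layer should show
# (n_classes, hidden_dim) with n_classes equal to your label set size.
for name, tensor in state_dict.items():
    if torch.is_tensor(tensor) and tensor.dim() == 2 and name.endswith("weight"):
        print(name, tuple(tensor.shape))
```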

> However, I'm having trouble loading the model into the whisper-at library and testing it on audio.

You can clone this repo, install it in editable mode with pip install -e path_to_repo, and then edit the code in the repo, including the model path, so that you can replace our model with your own.
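
For reference, a rough sketch of what testing on audio could look like after the editable install. It follows the usage example in this repo's README, assumes you have already pointed the checkpoint path inside the package at your own weights, and uses a placeholder file name (my_audio.wav). With a label set other than AudioSet's 527 classes, you would also need to swap in your own label list for parse_at_label to print meaningful names.

```python
# Rough sketch: test a (custom) whisper-at model after `pip install -e path_to_repo`.
# Assumes the checkpoint path inside the package has been edited to point to
# your own weights; "my_audio.wav" is a placeholder.
import whisper_at as whisper

model = whisper.load_model("large-v1")            # backbone size you trained with
result = model.transcribe("my_audio.wav", at_time_res=10)

print(result["text"])                             # ASR output

# Audio tagging output; by default the labels come from AudioSet's 527 classes,
# so with a custom label set the label file inside the package must be replaced too.
tags = whisper.parse_at_label(result, language='follow_asr', top_k=5, p_threshold=-1)
print(tags)
```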

-Yuan