kkoutini / PaSST

Efficient Training of Audio Transformers with Patchout
Apache License 2.0

OpenMic fine-tuned model? #26

Closed: turian closed this issue 1 year ago

turian commented 1 year ago

Would you mind releasing the OpenMic fine-tuned model, so that OpenMic-style predictions can be made out of the box, without any training?

kkoutini commented 1 year ago

Hi Joseph, you can download the model fine-tuned on OpenMic here: openmic-passt-s-f128-10sec-p16-s10-ap.85.pt

Thanks to the HEAR API, loading and using the model is straightforward:

# Notebook syntax; from a shell, drop the leading "!"
!pip install git+https://github.com/kkoutini/passt_hear21.git

import torch

from hear21passt.base import get_basic_model
from hear21passt.models.passt import get_model as get_model_passt

model = get_basic_model(mode="logits").cuda()
# Replace the transformer with one that has 20 outputs (the OpenMic classes)
model.net = get_model_passt(arch="passt_s_swa_p16_128_ap476", n_classes=20).cuda()

# Load the fine-tuned OpenMic weights
state = torch.load("/path/to/openmic-passt-s-f128-10sec-p16-s10-ap.85.pt")
model.net.load_state_dict(state)
model.eval()  # inference mode: disables dropout

# wave_signal: a batch of raw waveforms on the GPU, shape (batch, samples), sampled at 32 kHz
with torch.no_grad():
    logits = model(wave_signal)
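
The model returns one raw logit per class, so a further step is needed to get "OpenMic-style predictions". Since OpenMic is a multi-label task (several instruments can be present in one clip), a reasonable assumption is to apply a sigmoid to each logit independently and threshold the result. The sketch below is not from the PaSST repository: it uses plain Python so it runs without a GPU or the checkpoint, the logits are made-up numbers, the helper `predict_labels` and the 0.5 threshold are illustrative choices, and the `class_i` names are placeholders for the real instrument names in whatever order was used during fine-tuning.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predict_labels(logits, labels, threshold=0.5):
    """Return (label, probability) pairs at or above the threshold, most likely first."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = [sigmoid(v) for v in logits]
    hits = [(lab, p) for lab, p in zip(labels, probs) if p >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Placeholder names (hypothetical): substitute the 20 OpenMic instrument names
# in the class order the checkpoint was trained with.
labels = [f"class_{i}" for i in range(20)]

# Made-up logits for a single clip, standing in for one row of the model output.
example_logits = [-3.0] * 20
example_logits[4] = 2.0
example_logits[11] = 0.5

predictions = predict_labels(example_logits, labels)
for name, prob in predictions:
    print(f"{name}: {prob:.3f}")
```

With the real model output, the same per-class probabilities can be obtained directly on the tensor with `torch.sigmoid(logits)`; the threshold is a free parameter and may be worth tuning per class.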

turian commented 1 year ago

Thank you!