kkoutini / PaSST

Efficient Training of Audio Transformers with Patchout
Apache License 2.0
287 stars 48 forks

.net and .net_swa parameters in .ckpt file #46

Open Rishabh-S1899 opened 3 months ago

Rishabh-S1899 commented 3 months ago

We have fine-tuned the passt_s_swa_p16_s16_128_ap473 model on the DCASE 2020 dataset for scene classification. We are now trying to use the fine-tuned model by loading its parameters from the .ckpt file via the state dictionary, but the checkpoint contains two sets of parameters: .net and .net_swa. Which set are we supposed to load into the architecture?

kkoutini commented 3 months ago

Hi, .net is the trained model. .net_swa contains the Stochastic Weight Averaging (SWA) of the model's weights during training. In some cases, the average prevents overfitting, so I recommend checking both on your task.
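
To compare the two sets of weights, one option is to split the checkpoint's state dictionary by key prefix and load each subset into the model in turn. The snippet below is only a sketch: the helper `extract_submodule_weights`, the key prefixes `net.` and `net_swa.`, the file name `finetuned.ckpt`, and the `"state_dict"` entry in the checkpoint are assumptions that should be checked against the keys in your own file. A toy dictionary stands in for the real state dict so the example runs without a checkpoint.

```python
from collections import OrderedDict


def extract_submodule_weights(state_dict, prefix):
    """Return the entries whose key starts with `prefix`, with the prefix stripped."""
    return OrderedDict(
        (key[len(prefix):], value)
        for key, value in state_dict.items()
        if key.startswith(prefix)
    )


# Toy stand-in for a checkpoint's state dict (real values would be tensors).
toy_state_dict = {
    "net.head.weight": 1.0,
    "net.head.bias": 2.0,
    "net_swa.head.weight": 1.5,
    "net_swa.head.bias": 2.5,
}

# "net_swa." keys do not match the "net." prefix, so the two sets stay separate.
net_weights = extract_submodule_weights(toy_state_dict, "net.")
swa_weights = extract_submodule_weights(toy_state_dict, "net_swa.")

# With a real checkpoint (assumed layout, verify the keys in your own file):
#   import torch
#   ckpt = torch.load("finetuned.ckpt", map_location="cpu")
#   state_dict = ckpt["state_dict"]
#   model.load_state_dict(extract_submodule_weights(state_dict, "net_swa."))
```

Evaluating the model once with `net_weights` and once with `swa_weights` on a held-out split shows which set works better for the task.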