SteveTanggithub closed this issue 1 year ago
The paper config is indeed the default config used in ex_audioset.py.
Train an ImageNet pre-trained model on AudioSet with the paper config:
python ex_audioset.py --cuda --train --pretrained_name=mn10_im_pytorch
Train a model on AudioSet with the paper config from scratch:
python ex_audioset.py --cuda --train
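If you want to launch both runs from a script, the two commands above can be assembled programmatically. This is only a minimal sketch: `build_train_command` is a hypothetical helper, not part of the repo, and it uses only the flags quoted in this thread (`--cuda`, `--train`, `--pretrained_name`).

```python
import subprocess


def build_train_command(pretrained_name=None, script="ex_audioset.py"):
    """Return the argument list for an AudioSet training run.

    Hypothetical helper for illustration. All other settings are left at
    the script's defaults, which the reply above identifies as the paper config.
    """
    cmd = ["python", script, "--cuda", "--train"]
    if pretrained_name is not None:
        # Start from a pre-trained checkpoint instead of training from scratch.
        cmd.append(f"--pretrained_name={pretrained_name}")
    return cmd


def run_training(pretrained_name=None):
    """Launch the training run and raise if the script exits with an error."""
    subprocess.run(build_train_command(pretrained_name), check=True)
```

Calling `build_train_command("mn10_im_pytorch")` yields the ImageNet pre-trained command, and `build_train_command()` yields the from-scratch one.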
Technically, it is possible to use audio with a different sampling rate. The code in this repo supports 16 kHz and 8 kHz. However, I can't make any statements about the expected performance, as I have only trained models at a 32 kHz sampling rate.
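If your audio is at another rate and you would rather match the 32 kHz setting the models were trained with, you can resample it beforehand. The sketch below only works out the integer up/down factors and the resulting clip length for a rational resampler. The helper names are hypothetical and not part of the repo, and the actual resampling is assumed to be done by a polyphase routine such as `scipy.signal.resample_poly(x, up, down)`.

```python
from fractions import Fraction

TARGET_SR = 32000  # sampling rate the models in this thread were trained with


def resample_factors(orig_sr, target_sr=TARGET_SR):
    """Return the smallest integer (up, down) with orig_sr * up / down == target_sr."""
    ratio = Fraction(target_sr, orig_sr)  # Fraction reduces to lowest terms
    return ratio.numerator, ratio.denominator


def resampled_length(n_samples, orig_sr, target_sr=TARGET_SR):
    """Number of samples after resampling, rounded up to a whole sample."""
    up, down = resample_factors(orig_sr, target_sr)
    return -(-n_samples * up // down)  # integer ceiling division
```

For example, `resample_factors(44100)` gives the factors needed to convert CD-rate audio, and `resampled_length` tells you how long a clip will be afterwards.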
Could you provide the full config details for reproducing the paper results, for example for the models with and without pre-training? By the way, must I use audio resampled to 32 kHz?