error when train with "python ex_audioset.py --cuda --train"

fschmid56 / EfficientAT

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.

MIT License


error when train with "python ex_audioset.py --cuda --train" #9

Closed mmuguang closed 1 year ago

mmuguang commented 1 year ago

fschmid56 commented 1 year ago

Hi! There could be two sources of problems:

1. Check whether the ID is in the official csv files. The keys in the dict fname_to_index correspond to the YouTube IDs.
2. I pushed a fix for a problem I'm aware of: AudioSet is downloaded from YouTube, so there may be a small portion of files that I don't have in my version of the dataset. The IDs of files I couldn't download successfully are not in the dict fname_to_index, and I can't provide teacher predictions for them. In the version I uploaded, if a file ID is not found in fname_to_index, the distillation loss is set to 0 for the corresponding sample.
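The behaviour described in the second point can be sketched roughly as follows. This is only an illustration in plain Python, not the code from the repository: the function name `masked_distillation_losses`, the list-based inputs, and the mean-squared-error loss are all assumptions made for the example. The only detail taken from the comment above is that a file ID missing from fname_to_index contributes a distillation loss of 0.

```python
from typing import Dict, List, Sequence


def masked_distillation_losses(
    file_ids: Sequence[str],
    student_probs: Sequence[Sequence[float]],
    fname_to_index: Dict[str, int],
    teacher_probs: Sequence[Sequence[float]],
) -> List[float]:
    """Per-sample distillation loss, 0.0 where no teacher prediction exists.

    Hypothetical helper: the squared-error loss is a stand-in chosen for
    simplicity, not the loss used in the actual training script.
    """
    losses = []
    for fid, student in zip(file_ids, student_probs):
        idx = fname_to_index.get(fid)
        if idx is None:
            # No teacher prediction for this ID: the sample is masked out.
            losses.append(0.0)
            continue
        teacher = teacher_probs[idx]
        # Mean squared error between student and teacher class probabilities.
        mse = sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)
        losses.append(mse)
    return losses
```

With a mask like this, a sample without teacher predictions still contributes to the label loss but adds nothing to the distillation term.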

mmuguang commented 1 year ago

I just found that fname_to_index has 1893696 clips, but my unbalanced and balanced sets together have 1900000+ clips. I downloaded AudioSet from PANN via this link: https://pan.baidu.com/s/13WnzI1XDSvqXZQTS-Kqujg. I have removed the clips that are not in fname_to_index, and training now runs normally.
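A filtering step of this kind might look like the sketch below. It assumes the clip IDs of the local dataset are available as a list of strings in the same format as the keys of fname_to_index; how the IDs are actually stored depends on the dataset files, and the helper name `split_by_teacher_coverage` is made up for this example.

```python
from typing import Dict, List, Sequence, Tuple


def split_by_teacher_coverage(
    clip_ids: Sequence[str],
    fname_to_index: Dict[str, int],
) -> Tuple[List[str], List[str]]:
    """Split clip IDs into those with and without teacher predictions.

    Hypothetical helper: assumes clip_ids use the same key format as
    fname_to_index (e.g. the same prefix or file-extension convention).
    """
    covered = [cid for cid in clip_ids if cid in fname_to_index]
    missing = [cid for cid in clip_ids if cid not in fname_to_index]
    return covered, missing


# Toy example with made-up IDs:
covered, missing = split_by_teacher_coverage(
    ["id_a", "id_b", "id_c"], {"id_a": 0, "id_c": 1}
)
print(len(covered), "clips kept,", len(missing), "clips dropped")
```

Keeping the list of dropped IDs around makes it easy to check that the number of removed clips matches the difference between the two counts.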

mmuguang commented 1 year ago

Another question: how did you train [mn40_as_no_im_pre_mAP_483.pt]? I used the command "python ex_audioset.py --cuda --train --model_width=4.0", but the mAP is only around 0.42.

fschmid56 commented 1 year ago

The code base should be fine in general. I recently trained several models and it worked out as expected. I'll try to train a model with a width of 4.0 and tell you the exact command.

fschmid56 commented 1 year ago

It looks like it works fine for me. You can follow the experiment that uses this command:

python ex_audioset.py --cuda --train --model_width=4.0 --batch_size=60 --max_lr=0.0004

Here is the experiment on W&B, currently running.