fschmid56 / EfficientAT

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
MIT License
218 stars 41 forks

error when train with "python ex_audioset.py --cuda --train" #9

Closed mmuguang closed 1 year ago

mmuguang commented 1 year ago

image

fschmid56 commented 1 year ago

Hi! There could be two sources of problems:

mmuguang commented 1 year ago

I just found that fname_to_index has 1,893,696 clips, but my unbalanced and balanced sets have more than 1,900,000 clips. I downloaded AudioSet from PANN via this link: https://pan.baidu.com/s/13WnzI1XDSvqXZQTS-Kqujg. I removed the clips that are not in fname_to_index, and training now runs normally.
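For anyone hitting the same mismatch, the filtering step can be sketched roughly as below. This is only an illustration, not the repository's code: it assumes fname_to_index behaves like a dict keyed by clip file name, and the helper name split_by_known_clips and the toy file names are made up for the example.

```python
def split_by_known_clips(clip_names, fname_to_index):
    """Split clip names into (kept, dropped), preserving input order.

    Assumption: fname_to_index is a dict-like mapping from clip file
    name to label index. This helper is hypothetical and not part of
    the EfficientAT code base.
    """
    kept, dropped = [], []
    for name in clip_names:
        # Keep only clips that have an entry in the mapping.
        if name in fname_to_index:
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped


# Toy data standing in for the real mapping and the downloaded clip list.
fname_to_index = {"clip_a.mp3": 0, "clip_b.mp3": 1}
clip_names = ["clip_a.mp3", "clip_x.mp3", "clip_b.mp3"]

kept, dropped = split_by_known_clips(clip_names, fname_to_index)
print(f"kept {len(kept)} clips, dropped {len(dropped)} clips")
```

Printing or logging the dropped names before deleting anything makes it easier to check that only the unexpected clips are removed.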

mmuguang commented 1 year ago

Another question: how did you train [mn40_as_no_im_pre_mAP_483.pt]? I used the command "python ex_audioset.py --cuda --train --model_width=4.0", but the mAP is only around 0.42.

fschmid56 commented 1 year ago

The code base should be fine in general. I recently trained several models and it worked out as expected. I'll try to train a model with a width of 4.0 and tell you the exact command.

fschmid56 commented 1 year ago

It looks like it works fine for me. For example, you can follow the experiment I started with this command:

python ex_audioset.py --cuda --train --model_width=4.0 --batch_size=60 --max_lr=0.0004

Here is the experiment on W&B, currently running.