YuanGongND / ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License
1.13k stars 212 forks

OSError: ./exp/test-esc50-f10-t10-impTrue-aspTrue-b48-lr1e-5/fold1/result.csv not found.(ESC-50 Recipe) #54

Closed poult-lab closed 2 years ago

poult-lab commented 2 years ago

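Hello Mr. Gong, first of all, thank you very much for your work. I ran into a problem when running bash run_esc.sh, so I would like to ask you about it. The last line of run_esc.sh is python ./get_esc_result.py --exp_path ${base_exp_dir}. When the program reaches that last line, line 21 of the program reports that fold1/result.csv was not found. Do I need to create this file myself, or should I use the file you uploaded (ast-master/egs/esc50/exp/test-esc50-f10-t10-pTrue-b48-lr1e-5/result.csv)? The image I uploaded shows the error log. (Image: wrong blog)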

YuanGongND commented 2 years ago

Hi there,

The function of get_esc_result.py is to summarize the results of each fold of the 5-fold cross-validation. So the actual error is not here; it is likely due to an error in data downloading or training.

According to your error message, can you change the base_exp_dir in run_esc.sh (the screenshot says the directory already exists) and re-run it? If there's an error again, please paste the entire error message.

I would also suggest checking if https://github.com/YuanGongND/ast/blob/7b2fe7084b622e540643b0d7d7ab736b5eb7683b/egs/esc50/run_esc.sh#L37 has been properly executed by checking if the audio files are downloaded.

-Yuan

YuanGongND commented 2 years ago

You don't need to create result.csv or anything else, the recipe is self-contained.
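As a quick way to see which folds failed before the summary step runs, the check below lists fold directories that have no result file. It is only a sketch: the helper name missing_fold_results is hypothetical and not part of the AST repository, and it assumes the fold1 to fold5 layout under base_exp_dir that the error message and run_esc.sh suggest.

```python
from pathlib import Path


def missing_fold_results(base_exp_dir, n_folds=5, filename="result.csv"):
    """Return the names of fold directories that lack a result file.

    Hypothetical helper (not part of the AST repo). Assumes results are
    stored as <base_exp_dir>/fold<i>/<filename> for i = 1..n_folds.
    """
    base = Path(base_exp_dir)
    return [
        f"fold{i}"
        for i in range(1, n_folds + 1)
        if not (base / f"fold{i}" / filename).is_file()
    ]
```

Calling missing_fold_results with the experiment directory from the error message returns the folds whose training did not finish; an empty list means every fold wrote its result.csv.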

poult-lab commented 2 years ago

Hello Mr. Gong, I followed your suggestion and have already figured out the problem: the bug comes from training. My GPU is an RTX 3060 (12 GB of memory), and I got this error (image below) when run.py was launched from run_esc.sh, because of the GPU memory limit. I added torch.cuda.empty_cache() at line 178 of ast_models.py to alleviate this "congenital deficiency", but the issue persisted, so I had to reduce batch_size from 48 to 2. After that, the scripts work properly.

But I still have a small request: do you know another way to mitigate this issue? I reduced batch_size drastically and arbitrarily, so I think my approach is a bit crude.

YuanGongND commented 2 years ago

To lower the computational overhead, the simplest way is to change fstride and tstride at https://github.com/YuanGongND/ast/blob/7b2fe7084b622e540643b0d7d7ab736b5eb7683b/egs/esc50/run_esc.sh#L33-L34 to larger values (e.g., 16). Setting them to values larger than 16 is not recommended, though, as the stride would then exceed the patch size. AST is O(n^2), where n is the sequence length, so larger fstride and tstride can lower the complexity significantly with a minor performance drop.
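To make the effect of the strides concrete, here is a rough sketch of how the sequence length changes. It assumes a 16x16 patch with no padding, 128 mel bins, and 512 input frames for ESC-50; those values and the function name num_patches are assumptions for illustration, not taken from this thread.

```python
def num_patches(n_freq, n_frames, fstride, tstride, patch=16):
    """Number of patches from sliding a patch x patch window without padding.

    Illustrative helper (hypothetical name). Each axis follows the usual
    strided-window formula: (size - patch) // stride + 1.
    """
    f_dim = (n_freq - patch) // fstride + 1
    t_dim = (n_frames - patch) // tstride + 1
    return f_dim * t_dim


# Assumed ESC-50 input size: 128 mel bins x 512 frames.
n_stride10 = num_patches(128, 512, fstride=10, tstride=10)
n_stride16 = num_patches(128, 512, fstride=16, tstride=16)

# Self-attention cost grows with the square of the sequence length,
# so the relative cost is the squared ratio of the patch counts.
attention_cost_ratio = (n_stride10 / n_stride16) ** 2

print(n_stride10, n_stride16, round(attention_cost_ratio, 2))
```

Under these assumptions, moving both strides from 10 to 16 cuts the sequence from 600 to 256 patches, which reduces the quadratic attention cost by roughly a factor of 5.5.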

Other solutions include turning off audioset_pretrain and setting model_size='tiny224' at https://github.com/YuanGongND/ast/blob/7b2fe7084b622e540643b0d7d7ab736b5eb7683b/src/run.py#L91.

-Yuan

poult-lab commented 2 years ago

I really appreciate it. Thank you so much, Mr. Gong.

poult-lab commented 2 years ago

Hello Mr. Gong, sorry to bother you again. For the ESC-50 recipe, could I run only fold 4 individually in run_esc.sh? The code is below:

`
fold=4
echo 'now process fold'${fold}

exp_dir=${base_exp_dir}/fold${fold}

tr_data=./data/datafiles/esc_traindata${fold}.json
te_data=./data/datafiles/esc_evaldata${fold}.json

CUDA_CACHE_DISABLE=1 python -W ignore ../../src/run.py --model ${model} --dataset ${dataset} \
--data-train ${tr_data} --data-val ${te_data} --exp-dir $exp_dir \
--label-csv ./data/esc_class_labels_indices.csv --n_class 50 \
--lr $lr --n-epochs ${epoch} --batch-size $batch_size --save_model False \
--freqm $freqm --timem $timem --mixup ${mixup} --bal ${bal} \
--tstride $tstride --fstride $fstride --imagenet_pretrain $imagenetpretrain --audioset_pretrain $audiosetpretrain
`

Does this have any effect on the result? In other words, can we train each fold separately?

YuanGongND commented 2 years ago

Hi there,

I don't see a reason to do so. 5-fold cross-validation is the standard way to report results on ESC-50, and ESC-50 has some harder folds and some easier folds, so I'd recommend using the standard 5-fold CV. Nevertheless, I am not the author of ESC-50; you can consult the authors to see what they think.

-Yuan

poult-lab commented 2 years ago

Thanks for your reply, I think I can fix it. Thank you.