YuanGongND / psla

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
BSD 3-Clause "New" or "Revised" License

prep_fsd.py problems #14

Open aandreysr opened 1 year ago

aandreysr commented 1 year ago

I'm following the step-by-step PSLA instructions here on GitHub. When I run 'python3 prep_fsd.py', it creates the folders FSD50K.dev_audio_16k and FSD50K.eval_audio_16k at the specified dataset path, but it doesn't write any converted audio files into them. Any idea what might be happening?

P.S.: The terminal indicates that the samples were created, but they were not. The data path is defined as fsd_path = './dataset/', and this is the folder structure:

dataset
|
|--FSD50K.dev_audio
|--FSD50K.doc
|--FSD50K.eval_audio
|--FSD50K.ground_truth
|--FSD50K.metadata

YuanGongND commented 1 year ago

Do you have sox installed? If not, please install it.

The script isn't complex, so you can debug it yourself: https://github.com/YuanGongND/psla/blob/76aedd19ad3123be9c5d002809575955683aaade/egs/fsd50k/prep_fsd.py#L22-L36
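A minimal sketch of that kind of debugging step, assuming the conversion shells out to `sox` once per file (the linked lines are not reproduced in this thread, so the exact command is an assumption). The helper names `sox_available` and `convert_to_16k` are hypothetical and not part of the repository. The point is to check that `sox` is on the PATH and to surface its error output rather than discard it.

```python
import shutil
import subprocess
from pathlib import Path


def sox_available() -> bool:
    """Return True if a `sox` executable can be found on PATH."""
    return shutil.which("sox") is not None


def convert_to_16k(src: Path, dst: Path) -> bool:
    """Resample one file to 16 kHz with sox and report whether it worked.

    Assumption: `sox <in> -r 16000 <out>` is the intended conversion.
    """
    if not sox_available():
        print("sox is not installed or not on PATH")
        return False
    result = subprocess.run(
        ["sox", str(src), "-r", "16000", str(dst)],
        capture_output=True,  # keep stderr so failures are visible
        text=True,
    )
    if result.returncode != 0:
        print(f"sox failed for {src}: {result.stderr.strip()}")
    # Only count the file as converted if sox succeeded and the output exists.
    return result.returncode == 0 and dst.is_file()
```

Calling `convert_to_16k` on a single file from FSD50K.dev_audio should show quickly whether sox itself is the problem.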

aandreysr commented 1 year ago

I think I have a better indication of what might be wrong. When I run run.sh, it calls ../../src/run.py and fails with the following error, even though I installed the packages from requirements.txt. I'm using Python 3.7.7 over SSH.

+ export TORCH_HOME=./
+ TORCH_HOME=./
+ att_head=4
+ model=efficientnet
+ psla=True
+ eff_b=2
+ batch_size=24
+ '[' True == True ']'
+ impretrain=True
+ freqm=48
+ timem=192
+ mixup=0.5
+ bal=True
+ lr=5e-4
+ p=mean
+ '[' mean == median ']'
+ trpath=./datafiles/fsd50k_tr_full_type1_2_mean.json
+ epoch=40
+ wa_start=21
+ wa_end=40
+ lrscheduler_start=10
+ exp_dir=./exp/demo-efficientnet-2-5e-4-fsd50k-impretrain-True-fm48-tm192-mix0.5-bal-True-b24-lemean-2
+ mkdir -p ./exp/demo-efficientnet-2-5e-4-fsd50k-impretrain-True-fm48-tm192-mix0.5-bal-True-b24-lemean-2
+ CUDA_CACHE_DISABLE=1
+ python ../../src/run.py --data-train ./datafiles/fsd50k_tr_full_type1_2_mean.json --data-val ./datafiles/fsd50k_val_full.json --data-eval ./datafiles/fsd50k_eval_full.json --exp-dir ./exp/demo-efficientnet-2-5e-4-fsd50k-impretrain-True-fm48-tm192-mix0.5-bal-True-b24-lemean-2 --n-print-steps 1000 --save_model True --num-workers 32 --label-csv ./class_labels_indices.csv --n_class 200 --n-epochs 40 --batch-size 24 --lr 5e-4 --model efficientnet --eff_b 2 --impretrain True --att_head 4 --freqm 48 --timem 192 --mixup 0.5 --bal True --lr_patience 2 --dataset_mean -4.6476 --dataset_std 4.5699 --target_length 3000 --noise False --metrics mAP --warmup True --loss BCE --lrscheduler_start 10 --lrscheduler_decay 0.5 --wa True --wa_start 21 --wa_end 40
Traceback (most recent call last):
  File "../../src/run.py", line 9, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

YuanGongND commented 1 year ago

This is a different problem. You need to install the dependencies; see https://github.com/YuanGongND/psla#getting-started.

The previous issue is not torch related. Have you checked sox?
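A small sketch for checking the dependency side, assuming the `python` invoked by run.sh might not be the interpreter where the requirements were installed (that is a guess, not something the log confirms). The helper `module_available` is a hypothetical name. Running this with the same `python` command that run.sh uses shows which interpreter is active and whether it can see torch.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Shows which Python is actually being run (system vs. virtual environment).
    print("interpreter:", sys.executable)
    for name in ("torch", "torchaudio"):
        status = "found" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```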

aandreysr commented 1 year ago

Regarding the previously mentioned issue: the script now creates the folders, but only a few audio files are converted into FSD50K.dev_audio_16k, and FSD50K.eval_audio_16k remains empty. When I run the network, it displays the error below.

 File "/home/andrey/mestrado/venv-psla/lib/python3.7/site-packages/torchaudio/backend/sox_backend.py", line 35, in load
    raise OSError("{} not found or is a directory".format(filepath))
OSError: ./dataset/FSD50K.dev_audio_16k/35034.wav not found or is a directory
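
One way to see how many files were skipped is to compare the source and converted folders. This is only a sketch: it assumes each converted file keeps the name of its source `.wav`, and `missing_conversions` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def missing_conversions(src_dir, dst_dir):
    """Return sorted names of .wav files in src_dir that are absent or empty in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    missing = []
    for src in sorted(src_dir.glob("*.wav")):
        dst = dst_dir / src.name
        # A zero-byte output usually means the conversion was interrupted or failed.
        if not dst.is_file() or dst.stat().st_size == 0:
            missing.append(src.name)
    return missing


if __name__ == "__main__":
    # Paths follow the folder layout described earlier in this thread.
    for split in ("dev", "eval"):
        names = missing_conversions(
            f"./dataset/FSD50K.{split}_audio",
            f"./dataset/FSD50K.{split}_audio_16k",
        )
        print(f"{split}: {len(names)} files not converted", names[:5])
```

If 35034.wav appears in the list for the dev split, the training error above is a consequence of the incomplete conversion rather than a separate problem.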