JonathanFL commented 1 year ago

Hi Yuan,

I am trying to get the ESC-50 recipe working on my laptop, which has one GPU. This is the output when executing run_esc50.sh:

current #epochs=1, #steps=0
I am process 18304, running on uname_result(system='Windows', node='jonathanspc', release='10', version='10.0.22621', machine='AMD64'): starting (Sun Dec 18 17:21:18 2022)
now train a audio spectrogram transformer model
balanced sampler is not used
now train a audio spectrogram transformer model
balanced sampler is not used
FileExistsError: [WinError 183] Cannot create a file when that file already exists: './exp/test-esc50-f10-t10-impTrue-aspTrue-b48-lr1e-5/fold1/models'

Also, the run method uses os.uname()[1], which does not exist on my Windows PC, so I changed it to platform.uname() (after adding import platform).
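For anyone else hitting this on Windows, here is a minimal sketch of a portable replacement. os.uname() is only available on Unix-like systems, while platform.uname() works on both. The helper name host_name is hypothetical and not part of the repository.

```python
import platform


def host_name() -> str:
    """Return the machine's network name on Windows, Linux, and macOS."""
    # platform.uname() returns a named tuple; its `node` field holds the
    # same value that os.uname()[1] gives on Unix-like systems.
    return platform.uname().node


if __name__ == "__main__":
    print("running on %s" % host_name())
```

Using the node field keeps the log line short; printing the whole platform.uname() result gives the longer uname_result(...) text seen in the output above.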

Can you help with this?

YuanGongND commented 1 year ago

Hi there,

I am not familiar with Windows, but "Cannot create a file when that file already exists: './exp/test-esc50-f10-t10-impTrue-aspTrue-b48-lr1e-5/fold1/models'" means you need to either remove the old experiment directory or change the experiment path specified in https://github.com/YuanGongND/ast/blob/9e3bd9942210680b833b08c39d09f2284ddc4d1d/egs/esc50/run_esc.sh#L48

The reason I don't check whether the folder exists before creating it is that I want an error to be raised, so that a new run cannot overwrite an old one.
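The fail-fast behaviour described above can be sketched as follows. This is an illustration only, not the repository's code: the function create_models_dir and its allow_existing parameter are hypothetical names, and the temporary directory stands in for a real experiment path.

```python
import os
import tempfile


def create_models_dir(exp_dir: str, allow_existing: bool = False) -> str:
    """Create <exp_dir>/models, including any missing parent directories."""
    models_dir = os.path.join(exp_dir, "models")
    # With exist_ok=False (the default), os.makedirs raises FileExistsError
    # if the directory is already there, so an old run is never reused silently.
    os.makedirs(models_dir, exist_ok=allow_existing)
    return models_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        exp_dir = os.path.join(tmp, "exp", "fold1")
        create_models_dir(exp_dir)  # first run: succeeds
        try:
            create_models_dir(exp_dir)  # second run: fails on purpose
        except FileExistsError as err:
            print("refusing to overwrite:", err)
        # Explicit opt-in to reuse an existing directory.
        create_models_dir(exp_dir, allow_existing=True)
```

Note that os.makedirs creates missing parent directories on every platform, so the error in this thread comes from the target directory already existing rather than from a missing parent.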


JonathanFL commented 1 year ago

Hi Yuan,

I don't know why, but first, I had to create the 'exp' folder from the shell script and, afterwards, the 'models' folder. I guess the behaviour of makedirs is different between Windows and your OS.

models_dir = os.path.join(args.exp_dir,"models")
print("Creating models directory: %s" % models_dir)
if not os.path.exists(models_dir):

This still works, because of this part in the shell script

if [ -d $base_exp_dir ]; then
  echo 'exp exist'

But I guess it would not work if not run from the shell script. Anyway, just letting you or anybody else know. Also, I had to modify the shell script for Windows paths, e.g., the python venv path.

On another note, I was unable to train, even with num_worker=0 and batch_size=2, because of the same error as https://github.com/YuanGongND/ast/issues/54#issuecomment-1073397042.

Could an alternative be to set it up in Google Colab for training with my own data? The modified shell script now looks like this:

#SBATCH -p gpu
#SBATCH -x sls-titan-[0-2]
#SBATCH --gres=gpu:4
#SBATCH -c 4
#SBATCH -n 1
#SBATCH --mem=48000
#SBATCH --job-name="ast-esc50"
#SBATCH --output=./log_%j.txt

set -x
# comment this line if not running on sls cluster
#. /data/sls/scratch/share-201907/slstoolchainrc
source ../../venvast/Scripts/activate
export TORCH_HOME=../../pretrained_models

if [ $audiosetpretrain == True ]

python .\\prep_esc50.py

if [ -d $base_exp_dir ]; then
  echo 'exp exist'
mkdir -p $base_exp_dir

  echo 'now process fold'${fold}


  echo 'creating exp dir: '${exp_dir}
  mkdir -p $exp_dir


  CUDA_CACHE_DISABLE=1 python -W ignore ..\\..\\src\\run.py --model ${model} --dataset ${dataset} \
  --data-train ${tr_data} --data-val ${te_data} --exp-dir $exp_dir \
  --label-csv ./data/esc_class_labels_indices.csv --n_class 50 \
  --lr $lr --n-epochs ${epoch} --batch-size $batch_size --save_model False \
  --freqm $freqm --timem $timem --mixup ${mixup} --bal ${bal} \
  --tstride $tstride --fstride $fstride --imagenet_pretrain $imagenetpretrain --audioset_pretrain $audiosetpretrain

python .\\get_esc_result.py --exp_path ${base_exp_dir}

This "works", but again, I might have too few GPU resources on my laptop(?)

YuanGongND commented 1 year ago

> On another note, I was unable to train, even with num_worker=0 and batch_size=2, because of the same error as https://github.com/YuanGongND/ast/issues/54#issuecomment-1073397042.

The model itself takes some GPU memory, so it is possible that the code is not runnable even with batch_size=2. You can use the nvidia-smi command to check the GPU memory usage.
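If it helps, the same check can be scripted. The sketch below assumes nvidia-smi is on the PATH and uses its --query-gpu option; the function names parse_memory_csv and gpu_memory_mib are hypothetical, and the function returns None when the tool is missing or reports values that are not numbers.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple


def parse_memory_csv(text: str) -> List[Tuple[int, int]]:
    """Parse 'used, total' lines (in MiB) into (used, total) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory_mib() -> Optional[List[Tuple[int, int]]]:
    """Return (used, total) memory per GPU, or None if it cannot be read."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    try:
        return parse_memory_csv(result.stdout)
    except ValueError:
        # Some devices report non-numeric values such as "[N/A]".
        return None


if __name__ == "__main__":
    print(gpu_memory_mib())
```

Running it once before training and once while the model is loaded shows roughly how much memory the model takes before any batch is processed.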

> Could an alternative be to set it up in Google Colab for training with my own data?

It is totally possible; in another, unrelated project I have a Colab training script. However, I don't plan to build one for this project, as Colab can only train with small datasets.

JonathanFL commented 1 year ago

Alright. Thank you very much.