Audio parameters from config file not taken into account in the training

liofeu commented 2 weeks ago

Hello Christian

It seems that some audio parameters set in the config file are not taken into account in the training. Instead, the values are the ones set by default at lines 456-463 in audiodataset.py.

For instance, the error below shows that n_fft=4096 (see last line below) while it was set to 1024 in the config file (see first lines of the terminal output).

Ran on macOS 10.15.7 / Python 3.11.9

Best, Lionel

2024-06-17 14:45:21,545 - training animal-spot - INFO - Config Data: {'sequence_len': '500', 'lr_patience_epochs': '8', 'fmin': '500', 'max_pool': '2', 'min_max_norm': '', 'checkpoint_dir': '"/Users/lionel/database/checkpoint_dir"', 'conv_kernel_size': '7', 'noise_dir': '"/Users/lionel/database/noise_dir"', 'data_dir': '"/Users/lionel/database/ANIMAL-SPOT/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS"', 'early_stopping_patience_epochs': '20', 'max_train_epochs': '1', 'lr': '10e-5', 'log_dir': '"/Users/lionel/database/log_dir"', 'filter_broken_audio': '', 'summary_dir': '"/Users/lionel/database/summary_dir"', 'src_dir': '"/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT"', 'lr_decay_factor': '0.5', 'model_dir': '"/Users/lionel/database/model_dir"', 'beta1': '0.5', 'num_workers': '0 # 8', 'fmax': '10000', 'resnet': '18', 'batch_size': '16', 'n_freq_bins': '256', 'start_from_scratch': '', 'cache_dir': '"/Users/lionel/database/cache_dir"', 'augmentation': '', 'freq_compression': 'linear', 'epochs_per_eval': '2', 'num_classes': '2', 'sr': '44100', 'n_fft': '1024', 'debug': '', 'hop_length': '172'}
2024-06-17 14:45:21,545 - training animal-spot - INFO - Training Command: python3 -W ignore::UserWarning "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT"/main.py --sequence_len 500 --lr_patience_epochs 8 --fmin 500 --max_pool 2 --min_max_norm --checkpoint_dir "/Users/lionel/database/checkpoint_dir" --conv_kernel_size 7 --noise_dir "/Users/lionel/database/noise_dir" --data_dir "/Users/lionel/database/ANIMAL-SPOT/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS" --early_stopping_patience_epochs 20 --max_train_epochs 1 --lr 10e-5 --log_dir "/Users/lionel/database/log_dir" --filter_broken_audio --summary_dir "/Users/lionel/database/summary_dir" --lr_decay_factor 0.5 --model_dir "/Users/lionel/database/model_dir" --beta1 0.5 --num_workers 0 # 8 --fmax 10000 --resnet 18 --batch_size 16 --n_freq_bins 256 --start_from_scratch --cache_dir "/Users/lionel/database/cache_dir" --augmentation --freq_compression linear --epochs_per_eval 2 --num_classes 2 --sr 44100 --n_fft 1024 --debug --hop_length 172
2024-06-17 14:45:21,545 - training animal-spot - INFO - Start Training!!!
14:45:24|I|Setting up model
14:45:24|I|Found csv files in /Users/lionel/database/ANIMAL-SPOT/ANIMAL-SPOT_FILES/ANIMAL-SPOT_GUIDE/EXAMPLE_DATA_CORPUS
14:45:24|I|Model predict 2 classes
14:45:24|I|Init dataset train...
14:45:27|I|Init dataset val...
14:45:30|I|Init dataset test...
14:45:33|I|Init summary writer
14:45:33|I|Init model on device 'cpu'
14:45:33|I|No checkpoints found in /Users/lionel/checkpoint_dir
14:45:33|I|Class Distribution: {'noise': 0, 'target': 1}
14:45:33|I|Start training model binary_classifier
14:45:48|W|Traceback (most recent call last):
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/trainer.py", line 158, in fit
    self.train_epoch(
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/trainer.py", line 247, in train_epoch
    for i, (features, label) in enumerate(train_loader):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
            ~~~~~~~~~~~~^^^^^
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/data/audiodataset.py", line 599, in __getitem__
    sample = self.t_spectrogram(file)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/data/transforms.py", line 47, in __call__
    x = t(x)
        ^^^^
  File "/Users/lionel/ANIMAL-SPOT-master/ANIMAL-SPOT/data/transforms.py", line 117, in __call__
    S = torch.stft(
        ^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/functional.py", line 660, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: stft(torch.FloatTensor[1, 3397], n_fft=4096, hop_length=441, win_length=4096, window=torch.FloatTensor{[4096]}, normalized=0, onesided=1, return_complex=0) : expected 0 < n_fft < 3397, but got n_fft=4096

14:45:48|W|Aborting...

ChristianBergler commented 2 weeks ago

Hello,

I cannot reproduce the error. I tried everything on my end (Linux Mint 21.2) with the current version, and when I change values in the config file, they are also updated in the dataOpts dictionary:

13:54:17|D|dataOpts: { "sr": 44100, "preemphases": 0.98, "n_fft": 1024, "hop_length": 172, "n_freq_bins": 256, "fmin": 500, "fmax": 10000, "freq_compression": "linear", "min_level_db": -100, "ref_level_db": 20 }

Please check whether something in your config file is commented out via #, or whether there is a typo or a wrongly set parameter.
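
One way to run this check, as a minimal sketch: the helper below (hypothetical, not part of ANIMAL-SPOT) takes an already-parsed config dictionary, like the one printed in the "Config Data" log line, and reports every value that contains a #. The sample dictionary reuses a few entries from that log line.

```python
def find_inline_comments(config):
    """Return {key: value} for every config value that contains a '#'."""
    return {key: value for key, value in config.items() if "#" in str(value)}


# A few entries copied from the "Config Data" line in the first post.
config = {
    "num_workers": "0 # 8",
    "fmax": "10000",
    "n_fft": "1024",
    "hop_length": "172",
}

suspicious = find_inline_comments(config)
print(suspicious)  # {'num_workers': '0 # 8'}
```

In this sample only the num_workers value carries an inline comment; whether that matters depends on how the value is passed on to the training command.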

liofeu commented 2 weeks ago

> Please check whether something in your config file is commented out via #, or whether there is a typo or a wrongly set parameter.

I only have one n_fft entry, which is set to 1024, and the first line of the terminal output during training confirms that n_fft was read as 1024 (see the terminal output I pasted in my first message).

My colleague cannot reproduce the bug on Windows.

I will try to investigate further later, but my Python skills are limited. I will let you know if I find anything.

ChristianBergler commented 2 weeks ago

If your colleague cannot reproduce the n_fft error and it also works on my end, it is very likely that the issue is on your end. Fingers crossed that you will figure out what went wrong.
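
A possible explanation worth checking, sketched under stated assumptions: the Training Command logged in the first post contains `--num_workers 0 # 8`. If that string is executed through a POSIX shell (an assumption, since the launcher code is not shown in this thread), everything after the # would be treated as a comment, so `--n_fft 1024` and `--hop_length 172` would never reach main.py. The snippet below uses `shlex.split` with `comments=True` to approximate that shell behavior, together with a hypothetical argparse parser whose defaults (4096 and 441) are taken from the RuntimeError message.

```python
import argparse
import shlex

# Shortened version of the logged Training Command (paths and most flags omitted).
command = "main.py --num_workers 0 # 8 --fmax 10000 --n_fft 1024 --hop_length 172"

# comments=True makes shlex drop everything from '#' to the end of the line,
# similar to how a POSIX shell handles an unquoted '#'.
tokens = shlex.split(command, comments=True)
print(tokens)  # ['main.py', '--num_workers', '0']

# Hypothetical parser; the defaults mirror the values seen in the RuntimeError.
parser = argparse.ArgumentParser()
parser.add_argument("--num_workers", type=int, default=0)
parser.add_argument("--fmax", type=int, default=None)
parser.add_argument("--n_fft", type=int, default=4096)
parser.add_argument("--hop_length", type=int, default=441)

# Only the flags before the '#' survive, so n_fft and hop_length fall back to defaults.
args = parser.parse_args(tokens[1:])
print(args.n_fft, args.hop_length)  # 4096 441
```

If this is the cause, removing the inline comment from the num_workers line of the config file should let the remaining flags through. It would also be consistent with the error not appearing on setups whose config file has no inline comment, but it has not been confirmed in this thread.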

ChristianBergler / ANIMAL-SPOT

Audio parameters from config file not taken into account in the training #12