Rikorose / DeepFilterNet

Noise suppression using deep filtering
https://huggingface.co/spaces/hshr/DeepFilterNet2

Running into this error while trying to train on our data. #474

Open macso-vincent-russell opened 7 months ago

macso-vincent-russell commented 7 months ago

$ python df/train.py data-hdf5/dataset.cfg data-hdf5/ base_dir/
...
2023-12-06 02:40:53 | INFO | DF | Start train epoch 2 with batch size 1
thread 'DataLoader Worker 1' panicked at 'assertion failed: k <= self.len()', /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/slice/mod.rs:3420:9
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Aborted (core dumped)

enricguso commented 4 months ago

Same here. Were you able to find the cause?

enricguso commented 4 months ago

In my case it is something related to the RIRs: with p_reverb=0.0 training runs normally, but with p_reverb=1.0 it gets stuck and is killed with the error message above. The trace looks normal:

2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1275 | Sampled RIR .._.._guso_in24_rirs_train_recsourcedirectivityHA_right_recsourcedirectivityHA_right_07966.wav with shape [1, 20305]
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataloader:279 | Worker: Getting sample 270566 with seed 270566
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1219 | get_sample() idx 270566 with seed 270566, snr 5, gain -6
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_Zy0goYEHPHU.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1275 | Sampled RIR .._.._guso_in24_rirs_train_recsourcedirectivityHA_right_recsourcedirectivityHA_right_31149.wav with shape [1, 24812]
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_clean_fullband_read_speech_book_02509_chp_0002_reader_03315_40_seg_1.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_ay2X87w6Dxw.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1275 | Sampled RIR .._.._guso_in24_rirs_train_recsourcedirectivityHA_right_recsourcedirectivityHA_right_11673.wav with shape [1, 98547]
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1275 | Sampled RIR .._.._guso_in24_rirs_train_recsourcedirectivityHA_right_recsourcedirectivityHA_right_55381.wav with shape [1, 6963]
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_GAc5dEFDkac.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_lt7jAlr_Er0.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_SbJmk_6PVWg.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataloader:279 | Worker: Getting sample 272373 with seed 272373
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1219 | get_sample() idx 272373 with seed 272373, snr 5, gain 0
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataloader:279 | Worker: Getting sample 365896 with seed 365896
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1219 | get_sample() idx 365896 with seed 365896, snr 5, gain 6
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_VX2czCvwQG0.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_lZW6oaScJPc.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:augmentations:555 | Augmentation RandClipping (c: 0.3069719)
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_squeak_squeakyChair_Freesound_validated_379901_0.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataloader:279 | Worker: Getting sample 28207 with seed 28207
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1219 | get_sample() idx 28207 with seed 28207, snr 40, gain 6
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_clean_fullband_german_speech_CC_BY_SA_4.0_249hrs_339spk_German_Wikipedia_16k_German_Wikipedia_Schlosspark_Nymphenburg_audio_48kHz_seg_7.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_door_Freesound_validated_406193_0.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1275 | Sampled RIR .._.._guso_in24_rirs_train_recsourcedirectivityHA_right_recsourcedirectivityHA_right_24170.wav with shape [1, 49323]
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1275 | Sampled RIR .._.._guso_in24_rirs_train_recsourcedirectivityHA_right_recsourcedirectivityHA_right_31962.wav with shape [1, 35320]
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_F0IYjZN8ojA.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_TG7zqe3C7yw.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_noise_fullband_RG2sjK0Zsng.wav with codec PCM
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataloader:279 | Worker: Getting sample 17304 with seed 17304
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1219 | get_sample() idx 17304 with seed 17304, snr 0, gain 0
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataloader:279 | Worker: Getting sample 112016 with seed 112016
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1219 | get_sample() idx 112016 with seed 112016, snr 20, gain 0
2024-02-12 13:33:55 | TRACE    | libdfdata.torch_dataloader:df:dataset:1071 | Loaded sample .._.._DNS-Challenge_datasets_fullband_clean_fullband_read_speech_book_04432_chp_0002_reader_10614_109_seg_2.wav with codec PCM

I cannot find anything unusual. I assume the problem comes from the next RIR the dataloader is trying to load. I have also checked all my RIRs one by one in Python, loading them with soundfile and running the following tests in numpy:

Any ideas on what could be causing this?
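The checks referred to above were not preserved in this thread. As a rough illustration of what a per-file RIR check might look like, here is a minimal sketch that uses only Python's standard library (`wave` and `struct`) instead of soundfile and numpy. It assumes the RIRs are 16-bit PCM WAV files. The function name `check_rir` and the `max_frames` threshold are hypothetical and are not part of DeepFilterNet.

```python
import struct
import wave


def check_rir(path, max_frames=None):
    """Return a list of problems found in a 16-bit PCM WAV file (empty if none).

    `max_frames` is a hypothetical upper bound on the RIR length in frames.
    """
    with wave.open(path, "rb") as wav:
        n_frames = wav.getnframes()
        sample_width = wav.getsampwidth()
        raw = wav.readframes(n_frames)

    if n_frames == 0 or not raw:
        return ["empty file"]
    if sample_width != 2:
        # Other bit depths need a different decoder; flag them instead of guessing.
        return [f"unsupported sample width: {sample_width} bytes"]

    # Decode interleaved little-endian int16 samples.
    count = len(raw) // sample_width
    samples = struct.unpack(f"<{count}h", raw[: count * sample_width])

    problems = []
    if max(abs(s) for s in samples) == 0:
        problems.append("all samples are zero")
    if max_frames is not None and n_frames > max_frames:
        problems.append(f"{n_frames} frames exceeds limit of {max_frames}")
    return problems
```

A check like this only inspects each file in isolation, so it can rule out empty or silent RIRs but cannot by itself reveal a problem that depends on how a RIR is combined with a particular speech sample.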

enricguso commented 4 months ago

Also, as pointed out by the OP, the bug might appear in an epoch > 0, so it has to be related to particular combinations of speech and RIRs.
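One way to narrow down such combinations is to pull the RIR shapes and sample indices out of the TRACE output just before the crash. The sketch below parses the two message formats visible in the log above ("Sampled RIR ... with shape [c, n]" and "get_sample() idx ... with seed ..."). The function name `summarize_trace` and the `tail` parameter are hypothetical, and because several workers write to the log at once, the result only lists candidates and does not pair a specific RIR with a specific sample.

```python
import re

# Patterns follow the TRACE message formats shown in this thread.
RIR_RE = re.compile(
    r"Sampled RIR (?P<name>\S+) with shape \[(?P<ch>\d+), (?P<n>\d+)\]"
)
SAMPLE_RE = re.compile(r"get_sample\(\) idx (?P<idx>\d+) with seed (?P<seed>\d+)")


def summarize_trace(lines, tail=50):
    """Collect RIR shapes and sample indices from the last `tail` log lines."""
    rirs, samples = [], []
    for line in list(lines)[-tail:]:
        m = RIR_RE.search(line)
        if m:
            rirs.append((m["name"], int(m["ch"]), int(m["n"])))
            continue
        m = SAMPLE_RE.search(line)
        if m:
            samples.append((int(m["idx"]), int(m["seed"])))
    # Longest RIRs first, since unusually long ones are easy to spot this way.
    rirs.sort(key=lambda r: r[2], reverse=True)
    return rirs, samples
```

The sample indices and seeds collected this way could then be used to try to reproduce the failure on a smaller set of candidates, assuming the dataloader is deterministic for a given seed.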

enricguso commented 3 months ago

@macso-vincent-russell were you able to find the cause?

github-actions[bot] commented 1 week ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

enricguso commented 1 week ago

Small update: the issue still persists when training with the Meta SoundSpaces RIR dataset, using the default DFN3 recipe but with p_reverb changed to 1.0:

2024-07-03 11:52:51 | INFO | DF | Start train epoch 0 with batch size 16
2024-07-03 11:53:39 | INFO | DF | [0] [ 0/27346] | loss: 10.75046 | t_sample: 4.41516 | t_batch: 4.42887 | lr: 1.000E-04 | wd: 1.000E-12
thread 'DataLoader Worker 11' panicked at 'assertion failed: k <= self.len()', /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/slice/mod.rs:3420:9
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Aborted (core dumped)
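None of the logs in this thread include a backtrace, so the call site of the failing assertion stays unknown. The panic message itself suggests setting `RUST_BACKTRACE=1`. A minimal sketch of doing that from Python follows; the wrapper name `run_with_backtrace` is hypothetical, and the example command in the comment is the one from the original report, without the "..." part.

```python
import os
import subprocess


def run_with_backtrace(cmd, level="1"):
    """Run `cmd` with RUST_BACKTRACE set and return the CompletedProcess.

    `level` can be "1" for a short backtrace or "full" for a verbose one.
    """
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, RUST_BACKTRACE=level)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Example (not executed here):
#   result = run_with_backtrace(
#       ["python", "df/train.py", "data-hdf5/dataset.cfg", "data-hdf5/", "base_dir/"]
#   )
#   print(result.stderr)  # the panic message and backtrace are written to stderr
```

The same effect can be had by prefixing the command with the variable in the shell; the wrapper is only convenient when the run is already driven from a script.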