yxlu-0102 / MP-SENet

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
MIT License

Error dividing by zero if noisy_audio is silence #33

Open leapway opened 1 month ago

leapway commented 1 month ago

If the noisy audio is pure silence during training (every sample is zero, so np.max is 0), the loss becomes NaN. Here is one way to fix it in dataset.py:

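import numpy as np
...
def __getitem__(self, index):
    filename = self.audio_indexes[index]
    if self._cache_ref_count == 0:
        clean_audio, _ = librosa.load(os.path.join(self.clean_wavs_dir, filename + '.wav'), sr=self.sampling_rate)
        noisy_audio, _ = librosa.load(os.path.join(self.noisy_wavs_dir, filename + '.wav'), sr=self.sampling_rate)

        # Pure silence (every sample exactly 0) triggers the division by zero,
        # so add very faint Gaussian noise to keep the signal non-zero.
        # np.abs ensures a clip with only negative samples is not treated as silent.
        if np.max(np.abs(noisy_audio)) == 0.0:
            noise = np.random.normal(0, 0.00001, len(noisy_audio))
            # Cast back so the array keeps the dtype returned by librosa.load.
            noisy_audio = (noisy_audio + noise).astype(noisy_audio.dtype)
...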
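
For anyone who wants to check the idea outside the training loop, here is a minimal standalone sketch. It assumes the NaN comes from an energy-based normalization of the form sqrt(N / sum(x^2)). That formula is an assumption for illustration and is not confirmed anywhere in this thread. The helper names add_noise_floor and energy_norm_factor are hypothetical and are not part of the repository.

```python
import numpy as np


def add_noise_floor(audio, std=1e-5, rng=None):
    """Return audio unchanged unless every sample is zero.

    For an all-zero float array, return a copy with faint Gaussian noise
    added so that its energy is no longer exactly zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    if np.any(audio):
        return audio
    noise = rng.normal(0.0, std, size=audio.shape).astype(audio.dtype)
    return audio + noise


def energy_norm_factor(audio):
    # Hypothetical normalization (an assumption, see the text above):
    # the denominator is zero when the input is pure silence.
    return np.sqrt(len(audio) / np.sum(audio ** 2.0))


silent = np.zeros(16000, dtype=np.float32)

# Without the guard the factor is not finite, which would propagate to the loss.
with np.errstate(divide="ignore"):
    print(np.isfinite(energy_norm_factor(silent)))

# With the guard the factor is finite again.
patched = add_noise_floor(silent, rng=np.random.default_rng(0))
print(np.isfinite(energy_norm_factor(patched)))
```

The sketch expects floating-point audio such as the arrays returned by librosa.load. An alternative design is to skip silent files when building the file list, which avoids feeding synthetic noise to the model at all.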
yxlu-0102 commented 1 month ago

Thanks a million!