Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain. In it, we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
The code there is a sinc low pass filter. The ideal sinc low pass filter has an infinite impulse response, so this is a finite impulse response approximation with windowing to reduce artifacts. See https://en.wikipedia.org/wiki/Sinc_filter for more info.
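To make the idea concrete, here is a minimal NumPy sketch of a windowed-sinc low-pass filter. It is not the code from `dsp.LowPassFilters`; the function name `windowed_sinc_lowpass`, the parameter `half_width`, the choice of a Hann window, and the normalization to unit gain at DC are all assumptions made for illustration, and the repository may differ in any of them. The cutoff is expressed as a fraction of the sampling rate, so it must lie between 0 and 0.5.

```python
import numpy as np


def windowed_sinc_lowpass(cutoff, half_width):
    """Return FIR taps approximating an ideal low-pass filter.

    cutoff: cutoff frequency as a fraction of the sampling rate (0 < cutoff < 0.5).
    half_width: number of taps on each side of the center (hypothetical parameter).
    """
    # Sample positions, centered on zero so the filter is symmetric.
    n = np.arange(-half_width, half_width + 1)
    # Ideal low-pass impulse response, truncated to a finite length.
    # np.sinc(x) is the normalized sinc, sin(pi * x) / (pi * x).
    ideal = 2 * cutoff * np.sinc(2 * cutoff * n)
    # Tapering with a window reduces the ripples caused by the truncation.
    window = np.hanning(2 * half_width + 1)
    taps = ideal * window
    # Normalize so a constant (0 Hz) input passes through unchanged.
    return taps / taps.sum()


def apply_lowpass(signal, taps):
    """Filter a 1-D signal, keeping the output the same length as the input."""
    return np.convolve(signal, taps, mode="same")
```

For example, `apply_lowpass(x, windowed_sinc_lowpass(0.1, 32))` keeps the content of `x` below roughly one tenth of the sampling rate and strongly attenuates the rest. A longer filter (larger `half_width`) gives a sharper transition around the cutoff at the cost of more computation.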
Hello, sorry to disturb you. I read the paper and the code, but I am still confused about BandMask and the DataAugment in denoiser.augment.py.
Here, I don't understand the dsp.LowPassFilters function. Why is it calculated this way?
I hope to hear from you.