Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
When I try to train the model with the DNS dataset, I get the following error:
Traceback (most recent call last):
  File "train.py", line 105, in main
    _main(args)
  File "train.py", line 99, in _main
    run(args)
  File "train.py", line 53, in run
    args.dset.train, length=length, stride=stride, pad=args.pad, **kwargs)
  File "/home/duonglh7/Downloads/Duong/denoiser/denoiser/data.py", line 93, in __init__
    assert len(self.clean_set) == len(self.noisy_set)
AssertionError
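
The assertion on line 93 of data.py fails when the clean set and the noisy set report different lengths. Below is a minimal sketch for checking that before training. It assumes each set is described by a JSON manifest holding a list of [file_path, num_samples] pairs. That format and the helper names `load_manifest`, `summarize` and `compare_manifests` are assumptions for illustration only; they are not taken from the traceback.

```python
import json


def load_manifest(path):
    """Load a manifest assumed to be a JSON list of [file_path, num_samples] pairs."""
    with open(path) as f:
        return [(str(p), int(n)) for p, n in json.load(f)]


def summarize(entries):
    """Count the files and the total number of samples in one manifest."""
    return {"files": len(entries), "samples": sum(n for _, n in entries)}


def compare_manifests(clean, noisy):
    """Summarize both manifests and report whether the two summaries agree."""
    clean_summary = summarize(clean)
    noisy_summary = summarize(noisy)
    return {
        "clean": clean_summary,
        "noisy": noisy_summary,
        "match": clean_summary == noisy_summary,
    }
```

With hypothetical manifest files named clean.json and noisy.json, calling `compare_manifests(load_manifest("clean.json"), load_manifest("noisy.json"))` shows whether the file counts or the total sample counts differ. A mismatch in either would be one plausible reason for the two set lengths to disagree.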