Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
Hi,
I'm using your pre-trained model to enhance audio on a GPU. In particular, I run:
!python3 -m denoiser.enhance --noisy_dir=noisy/700to1300/ --out_dir=cleaned/noisy --device="cuda" --sample_rate 22050
I set --device="cuda" but I got this error:
INFO:denoiser.pretrained:Loading pre-trained real time H=48 model trained on DNS.
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/drive/My Drive/denoiser/denoiser/enhance.py", line 156, in <module>
    enhance(args, local_out_dir=args.out_dir)
  File "/content/drive/My Drive/denoiser/denoiser/enhance.py", line 143, in enhance
    estimate = get_estimate(model, noisy_signals, args)
  File "/content/drive/My Drive/denoiser/denoiser/enhance.py", line 68, in get_estimate
    estimate = model(noisy)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/drive/My Drive/denoiser/denoiser/demucs.py", line 176, in forward
    x = encode(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 298, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 295, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 4.11 GiB (GPU 0; 15.90 GiB total capacity; 12.99 GiB already allocated; 2.00 GiB free; 13.00 GiB reserved in total by PyTorch)
I'm running on Google Colab Pro and have no other sessions open, so I don't know why the GPU runs out of memory.
I also set the batch size to 1 and num_workers=1.
Please help me!
Thanks!
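One thing that may be worth checking, as an assumption the traceback does not confirm: if the files in the noisy directory are very long, a single forward pass over a whole file can need several GiB for the convolution activations even with a batch size of 1. A minimal sketch of one possible workaround is to split each waveform into fixed-length chunks and enhance them one at a time. The helper `chunk_bounds` and the 10-second chunk length below are hypothetical and not part of the denoiser package; the sketch only computes the chunk boundaries and leaves the model call out.

```python
def chunk_bounds(num_samples, chunk_len, overlap=0):
    """Return (start, end) sample indices that cover [0, num_samples).

    Hypothetical helper, not part of denoiser. Consecutive chunks share
    `overlap` samples so the seams can be cross-faded afterwards.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")

    step = chunk_len - overlap
    bounds = []
    start = 0
    while start < num_samples:
        end = min(start + chunk_len, num_samples)
        bounds.append((start, end))
        if end == num_samples:
            break  # the last chunk reached the end of the signal
        start += step
    return bounds


# Example: a 30-minute recording at the 22050 Hz rate used in the command,
# cut into 10-second chunks (both durations are assumptions).
sample_rate = 22050
total_samples = 30 * 60 * sample_rate
bounds = chunk_bounds(total_samples, chunk_len=10 * sample_rate)
print(len(bounds))  # 180
```

Each `(start, end)` pair would then be used to slice the waveform, and the slices would be passed through the model one at a time and concatenated. Because the model is causal but still uses context, chunk edges may sound slightly different from a single full pass, which is why an overlap option is included.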