facebookresearch / denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, both stationary and non-stationary, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

Question: file input and output? #122

Open buzzdev opened 2 years ago

buzzdev commented 2 years ago

Is there any way to let this read the input from an audio file and write the output to another file? Thank you

qalabeabbas49 commented 2 years ago

Yes, you can use the following command. It enhances/denoises all the files in noisy_dir and writes the enhanced files to out_dir.

python -m denoiser.enhance --model_path=<path to the model> --noisy_dir=<path to the dir with the noisy files> --out_dir=<path to store enhanced files>

nguyenvulong commented 1 year ago

I was able to run it without --model_path. It worked well. Thank you. This should be on the README.md.

It seems the pretrained model is loaded from somewhere by default. If anyone knows where it comes from, please let me know.

❯ python -m denoiser.enhance --noisy_dir=./noisy --out_dir=./nonoisy
INFO:denoiser.pretrained:Loading pre-trained real time H=48 model trained on DNS.
INFO:__main__:Generate enhanced files | 1/1 | 50.8 it/sec
Waiting for pending jobs...
INFO:__main__:Generate enhanced files | 1/1 | 1.7 it/sec
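
For anyone who wants to call this from a script rather than typing the command by hand, here is a minimal sketch that wraps the two invocations shown in this thread. It only uses the `--noisy_dir`, `--out_dir`, and `--model_path` flags mentioned above; the helper names `build_enhance_command` and `enhance_directory` are made up for illustration and are not part of the denoiser package. It assumes denoiser is installed in the same Python environment that runs the script.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def build_enhance_command(noisy_dir: str, out_dir: str,
                          model_path: Optional[str] = None) -> List[str]:
    """Build the argument list for `python -m denoiser.enhance`.

    Hypothetical helper: it only assembles the flags quoted in this thread.
    """
    cmd = [
        sys.executable, "-m", "denoiser.enhance",
        f"--noisy_dir={noisy_dir}",
        f"--out_dir={out_dir}",
    ]
    # As reported above, the tool also runs without --model_path
    # (it then loads a pre-trained model), so the flag is optional here.
    if model_path is not None:
        cmd.append(f"--model_path={model_path}")
    return cmd


def enhance_directory(noisy_dir: str, out_dir: str,
                      model_path: Optional[str] = None) -> None:
    """Run the enhancement on every file in noisy_dir (hypothetical helper)."""
    # Creating the output directory up front is harmless if the tool
    # would also create it on its own.
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(build_enhance_command(noisy_dir, out_dir, model_path),
                   check=True)
```

Calling `enhance_directory("./noisy", "./nonoisy")` should be equivalent to the command in the log above. To process a single file, one option is to place it alone in its own input directory, since the command works on directories.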