asteroid-team / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
MIT License

Implement NoiseGate #108

Open iver56 opened 2 years ago

iver56 commented 2 years ago

Like https://en.wikipedia.org/wiki/Noise_gate

Because some voice-over-IP applications apply noise gating

shahules786 commented 2 years ago

Hey @iver56, May I take a look at this?

iver56 commented 2 years ago

Yes

iver56 commented 2 years ago

I thought about it for a little bit and came to realize that there are two forms of noise gating:

One is simple, almost like a compressor, except that it switches the audio on and off (maybe with a short fade) based on loudness. E.g. if nobody is speaking, it mutes the sound. When someone starts speaking it unmutes, and it mutes again when the speech has ended. Example of this: https://www.gvst.co.uk/ggate.htm
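
A minimal sketch of this first form, in plain Python for readability. It is not an existing torch-audiomentations API: the function name `simple_noise_gate` and the parameters `threshold_db`, `window_ms` and `fade_ms` are hypothetical, and the block-wise RMS measure and linear fade are assumptions about how such a gate could work. A real transform would presumably vectorize this over batched tensors.

```python
import math


def simple_noise_gate(samples, sample_rate, threshold_db=-40.0,
                      window_ms=10.0, fade_ms=5.0):
    """Mute blocks whose RMS level is below threshold_db (relative to 1.0).

    Hypothetical sketch: all names and defaults are illustrative only.
    """
    window = max(1, int(sample_rate * window_ms / 1000))
    # Gain change per sample, so that a full fade takes about fade_ms
    fade_step = 1.0 / max(1, int(sample_rate * fade_ms / 1000))
    threshold = 10.0 ** (threshold_db / 20.0)  # dB -> linear amplitude

    out = []
    gain = 0.0  # the gate starts closed
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in block) / len(block))
        target = 1.0 if rms >= threshold else 0.0  # open or close the gate
        for x in block:
            # Move the gain linearly toward the target to avoid clicks
            if gain < target:
                gain = min(target, gain + fade_step)
            elif gain > target:
                gain = max(target, gain - fade_step)
            out.append(x * gain)
    return out
```

The gate decision is made once per block, while the gain is updated per sample, so the short fade stays smooth even when the gate toggles.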

The other one is like a simple form of denoising: it doesn't simply turn the audio on or off, but instead divides it into many frequency bands (as in a spectrogram) and treats them independently. It has a noise threshold and removes anything below that threshold. Example of this: the old noise reduction feature in Audacity: https://manual.audacityteam.org/man/noise_reduction.html
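
A rough sketch of this second form, again in plain Python. This is not how Audacity implements it and not an existing torch-audiomentations API: the names `spectral_noise_gate`, `frame_size` and `threshold` are hypothetical, and it assumes one fixed threshold for all bins plus non-overlapping rectangular frames with a naive DFT. A practical version would more likely use an overlapping, windowed STFT.

```python
import cmath


def dft(frame):
    """Naive O(n^2) discrete Fourier transform of a list of real samples."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_noise_gate(samples, frame_size=64, threshold=0.05):
    """Zero every frequency bin whose normalized magnitude is below threshold.

    Hypothetical sketch: each frame is gated independently, bin by bin.
    """
    out = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        n = len(frame)
        spectrum = dft(frame)
        # Keep bins at or above the threshold, remove everything quieter
        gated = [c if abs(c) / n >= threshold else 0j for c in spectrum]
        out.extend(idft(gated))
    return out
```

Unlike the on/off gate, this keeps loud frequency components and drops quiet ones within the same frame, so low-level noise is removed even while someone is speaking.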

Both can be useful as data augmentation

So maybe those two transforms should have different names