Open viveksj opened 4 years ago
Hi Vivek,
For 1 and 2, that is because this algorithm is designed for stationary noise (relatively constant noise).
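To illustrate why the stationary assumption matters, here is a rough sketch of a stationary spectral gate. It is a simplification for illustration, not the library's actual code; `stationary_gate_mask` and its arguments are hypothetical names. The threshold for each frequency band is computed once from noise statistics and reused for every time frame, so noise whose level or spectrum changes over time (as telephony disturbance often does) will not match it.

```python
import numpy as np

def stationary_gate_mask(spec_db, noise_db, n_std=1.5):
    """Boolean mask keeping bins louder than a fixed per-frequency threshold.

    spec_db:  (freq, time) magnitude spectrogram of the signal, in dB
    noise_db: (freq, time) magnitude spectrogram of a noise-only clip, in dB
    (Hypothetical helper; names and shapes are assumptions for this sketch.)
    """
    # One threshold per frequency band, computed once from the noise clip.
    thresh = noise_db.mean(axis=1) + n_std * noise_db.std(axis=1)
    # The same threshold is applied at every time frame, which is only
    # reasonable when the noise stays roughly constant over time.
    return spec_db > thresh[:, None]

# Synthetic data: constant-level noise plus one loud burst in band 1.
rng = np.random.default_rng(0)
noise_db = rng.normal(-40.0, 2.0, size=(4, 200))
spec_db = rng.normal(-40.0, 2.0, size=(4, 50))
spec_db[1, 10:20] = -10.0
mask = stationary_gate_mask(spec_db, noise_db)
```

Raising `n_std` gates more of the noise but also more of the quiet parts of the signal.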
For 3, you could renormalize after noise reduction. If you set the threshold too low, you're probably going to remove more signal with the noise as well.
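A minimal sketch of renormalizing, assuming that "renormalize" here means rescaling the output so its peak matches the input's peak. `match_peak` is a hypothetical helper, not part of noisereduce.

```python
import numpy as np

def match_peak(reduced, original, eps=1e-12):
    """Scale `reduced` so its peak absolute amplitude equals that of `original`."""
    peak_in = np.max(np.abs(original))
    peak_out = np.max(np.abs(reduced))
    if peak_out < eps:
        # Output is (near) silent; scaling would only amplify rounding noise.
        return reduced.copy()
    return reduced * (peak_in / peak_out)

# Example: an output that came back at a quarter of the input level.
original = np.array([0.0, 0.5, -1.0])
reduced = original * 0.25
restored = match_peak(reduced, original)
```

Peak matching is the simplest choice; matching RMS level instead would follow the same pattern.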
For the floating point error, there should probably be a better warning when you pass 16-bit rather than 32-bit input to noisereduce. I should add that in a future release.
Just saw this, thank you.
I wasn't aware of the stationary-noise assumption; that probably won't work for telephony disturbance.
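For context, `scipy.io.wavfile.read` returns 16-bit PCM files as `int16` samples, whereas `librosa.load` returns `float32` samples in the range [-1, 1]. A rough sketch of converting integer samples by hand, in case you want to keep using `wavfile.read` (`to_float32` is a hypothetical helper, not something noisereduce provides):

```python
import numpy as np

def to_float32(samples):
    """Convert signed-integer PCM samples to float32 in [-1, 1)."""
    samples = np.asarray(samples)
    if np.issubdtype(samples.dtype, np.signedinteger):
        # Divide by the largest magnitude of the type (32768 for int16).
        scale = float(-np.iinfo(samples.dtype).min)
        return samples.astype(np.float32) / scale
    if np.issubdtype(samples.dtype, np.floating):
        return samples.astype(np.float32)
    # Unsigned formats (e.g. 8-bit WAV) need an offset as well; not handled here.
    raise TypeError(f"unsupported sample dtype: {samples.dtype}")

pcm = np.array([-32768, 0, 16384], dtype=np.int16)
audio = to_float32(pcm)
```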
I'll try 3 again.
Hi @viveksj, did you happen to find any solution that works for telephony disturbance?
Is there a way to dynamically clean up audio? Should I use something like a neural network trained to clean noisy audio dynamically?
That's one way of doing it. Stay tuned... I'm also planning on writing a non-stationary noise reduction version of this, which works well in preliminary results, when I get some spare time.
Tim, your knowledge is amazing and a project like this is very interesting! I will take a moment to study it better. Just to be clear in my mind, would I be able to use this filter in real time (not necessarily recording)?
When applying the stationary noise reduction to an offline (non-real-time) mono audio signal, I'm losing speech clarity.
Higher values of n_std_thresh_stationary started dampening the speech (a kind of muffling effect, if you will), costing some crispness while reducing the stationary noise better (as expected). I've played around with the other parameters applicable to the stationary case and have finally settled on a combination of values for my use case with an acceptable tradeoff between loss of speech crispness and removal of noise.
Is there any adjustable parameter I could have missed that would make noise removal more aggressive over the noise frequency bands while staying less aggressive, or smoothing gently, over neighboring bands?
One small note: I did not see any reduction in volume.
I tested this on multiple audio files (16 kHz telephony audio) with prop_decrease values of 0, 0.1, 0.5, 0.75, and 1.
My 3 observations were:
Could you help me identify whether there's something wrong with how I'm using it, or whether any improvements can be made?
(Also, I got a floating-point-expected error, so instead of wavfile.read() I used librosa.load(filepath, sr=None).)