xiph / rnnoise

Recurrent neural network for audio noise reduction
BSD 3-Clause "New" or "Revised" License

16kHz conversion #37

Open enting0608 opened 6 years ago

enting0608 commented 6 years ago

Thanks for the great project! I got a lot of inspiration from it.

Now I'm trying to convert the 48 kHz sampling-rate code base to 16 kHz.

I changed some parameters as follows.

  1. INPUT_FEATURE_LEN : 42 -> 38
  2. OUTPUT_GAIN_LEN : 22 -> 18. It was changed in accordance with the eband5ms table:

     static const opus_int16 eband5ms[] = {
     /*0  200 400 600 800  1k 1.2 1.4 1.6  2k 2.4 2.8 3.2  4k 4.8 5.6 6.8  8k 9.6 12k 15.6 20k*/
       0,  1,  2,  3,  4,  5,  6,  7,  8, 10, 12, 14, 16, 20, 24, 28, 34, 40, 48, 60, 78, 100
     };
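The 22 -> 18 and 42 -> 38 figures can be checked with a small sketch. It assumes each eband5ms entry is a band edge in units of 200 Hz (as the table's comment row suggests) and that the feature count follows `NB_BANDS + 3*NB_DELTA_CEPS + 2` with `NB_DELTA_CEPS` equal to 6, which is my reading of denoise.c and not something stated in this thread. The helper names `bands_for_rate` and `features_for_bands` are hypothetical, and `short` stands in for `opus_int16` to keep the block self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Band edges from the eband5ms table, assumed to be in units of 200 Hz. */
static const short eband5ms[] = {
  0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40, 48, 60, 78, 100
};
#define EBAND_COUNT (sizeof(eband5ms) / sizeof(eband5ms[0]))
#define EBAND_UNIT_HZ 200
#define NB_DELTA_CEPS 6 /* assumption: value used in denoise.c */

/* Hypothetical helper: number of band edges at or below the Nyquist frequency. */
static int bands_for_rate(int sample_rate_hz) {
  assert(sample_rate_hz > 0);
  int nyquist_hz = sample_rate_hz / 2;
  int count = 0;
  for (size_t i = 0; i < EBAND_COUNT; i++) {
    if (eband5ms[i] * EBAND_UNIT_HZ <= nyquist_hz) count++;
  }
  return count;
}

/* Hypothetical helper: feature vector length, assuming the layout
   NB_BANDS + 3*NB_DELTA_CEPS + 2. */
static int features_for_bands(int nb_bands) {
  assert(nb_bands > 0);
  return nb_bands + 3 * NB_DELTA_CEPS + 2;
}
```

Under these assumptions, `bands_for_rate(16000)` keeps the edges up to 8 kHz and `features_for_bands` of that result reproduces the 38 above. This only checks the arithmetic; it does not show that a model trained with these sizes will work.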

I also changed the pitch-related constants:

#define PITCH_MIN_PERIOD 60 -> 20

#define PITCH_MAX_PERIOD 768 -> 256

#define PITCH_FRAME_SIZE 960 -> 320

Is this correct? I retrained the neural network after changing the parameters above, but it does not seem to work (the only effect is that the audio's gain is reduced).
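The three pitch values are each the 48 kHz constant divided by 3, which keeps the same durations at 16 kHz (60 samples at 48 kHz and 20 samples at 16 kHz are both 1.25 ms, i.e. an 800 Hz pitch). A minimal sketch of that scaling follows; `scale_samples` is a hypothetical helper, not part of RNNoise, and it says nothing about whether other parts of the pitch code assume 48 kHz.

```c
#include <assert.h>

/* Hypothetical helper: convert a sample count defined at 48 kHz to the
   count covering the same duration at target_rate_hz. */
static int scale_samples(int samples_at_48k, int target_rate_hz) {
  assert(samples_at_48k >= 0 && target_rate_hz > 0);
  long scaled = (long)samples_at_48k * target_rate_hz;
  /* Refuse constants that do not map to a whole number of samples. */
  assert(scaled % 48000 == 0);
  return (int)(scaled / 48000);
}
```

Applied to the constants above with a 16000 Hz target, it gives 20, 256 and 320, so the numbers are at least consistent with a pure time-preserving rescale.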

Is there anyone who can answer my question? Thanks.

jmvalin commented 6 years ago

If you make any change to the way the features are computed (which is the case if you change the number of bands), then you need to retrain all the weights. If you just want to use the existing code (and weights) as is, the simplest would just be to resample from 16 kHz to 48 kHz and then back at the end.
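The resample-and-back route could be sketched roughly as below. This is only an illustration under loud assumptions: it uses naive linear-interpolation upsampling and plain decimation by 3, where a real implementation would want a proper band-limited resampler, and it holds the last sample at each frame boundary instead of carrying state across frames. The callback type `denoise48k_fn` and all function names are hypothetical; `passthrough_48k` is a stand-in where real code would wrap the library's `rnnoise_process_frame`.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_16K 160 /* 10 ms at 16 kHz */
#define FRAME_48K 480 /* 10 ms at 48 kHz */

/* Hypothetical callback: denoises one 480-sample frame at 48 kHz. */
typedef void (*denoise48k_fn)(void *state, float *out, const float *in);

/* Naive 3x upsampling by linear interpolation (no anti-imaging filter). */
static void upsample3(const float *in, size_t n, float *out) {
  for (size_t i = 0; i < n; i++) {
    float cur = in[i];
    float next = (i + 1 < n) ? in[i + 1] : cur; /* hold at frame end */
    out[3 * i] = cur;
    out[3 * i + 1] = cur + (next - cur) / 3.0f;
    out[3 * i + 2] = cur + (next - cur) * 2.0f / 3.0f;
  }
}

/* Naive decimation: keep every third sample (no anti-aliasing filter). */
static void decimate3(const float *in, size_t n_out, float *out) {
  for (size_t i = 0; i < n_out; i++) out[i] = in[3 * i];
}

/* Process one 16 kHz frame through a 48 kHz denoiser. */
static void denoise_frame_16k(denoise48k_fn denoise, void *state,
                              float *out16, const float *in16) {
  assert(denoise != NULL);
  float up[FRAME_48K], den[FRAME_48K];
  upsample3(in16, FRAME_16K, up);
  denoise(state, den, up);
  decimate3(den, FRAME_16K, out16);
}

/* Stand-in denoiser that copies input to output. */
static void passthrough_48k(void *state, float *out, const float *in) {
  (void)state;
  for (size_t i = 0; i < FRAME_48K; i++) out[i] = in[i];
}
```

The point of the wrapper is that the 48 kHz code and weights stay untouched; only the two rate conversions around it are new.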

enting0608 commented 6 years ago

Thanks for your quick answer! I am changing the features, so I need to retrain all the weights. I did that, but the result was not good (the only effect is that the audio's gain is reduced). My real question is: when changing the sampling rate from 48 kHz to 16 kHz, is it correct to make the following changes in denoise.c?

#define PITCH_MIN_PERIOD 60 -> 20

#define PITCH_MAX_PERIOD 768 -> 256

#define PITCH_FRAME_SIZE 960 -> 320

I'm sorry to bother you. I'm not familiar with the pitch estimation code from Opus, so I wonder whether my approach is okay. Thanks.

nerv3890 commented 5 years ago

> Thanks for your quick answer! I am changing the features, so I need to retrain all the weights. I did that, but the result was not good (the only effect is that the audio's gain is reduced). My real question is: when changing the sampling rate from 48 kHz to 16 kHz, is it correct to make the following changes in denoise.c?
>
> #define PITCH_MIN_PERIOD 60 -> 20
>
> #define PITCH_MAX_PERIOD 768 -> 256
>
> #define PITCH_FRAME_SIZE 960 -> 320
>
> I'm sorry to bother you. I'm not familiar with the pitch estimation code from Opus, so I wonder whether my approach is okay. Thanks.

Hello enting0608. You said you tried to convert the 48 kHz code base to 16 kHz and changed some parameters, but the result was not good. I am trying to do the same thing and have a few questions. I would really appreciate it if you could discuss them with me.

  1. Do I have to change FRAME_SIZE? The default FRAME_SIZE is 480. Do I have to change it to 160? Or, if I just want to use 16 kHz files as input and keep the same delay, is there no need to change FRAME_SIZE?

  2. Have you finished this sampling-rate conversion? If so, which parameters exactly should I change?

Thank you!

alokprasad commented 5 years ago

@enting0608 @nerv3890 Were you able to train RNNoise at 16 kHz and attain the same quality as at 48 kHz? What changes are required?

alokprasad commented 5 years ago

> If you make any change to the way the features are computed (which is the case if you change the number of bands), then you need to retrain all the weights. If you just want to use the existing code (and weights) as is, the simplest would just be to resample from 16 kHz to 48 kHz and then back at the end.

@jmvalin What changes would the code need if I want to use it with 16 kHz input directly (I don't want to resample from 16 kHz to 48 kHz)? I am asking about the code changes for both training and inference at 16 kHz.

alokprasad commented 5 years ago

There seems to be some work on this by Gregor in github.com/GregorR/rnnoise-nu, in this commit: https://github.com/GregorR/rnnoise-nu/commit/53f34de7d95af80c0c9101c791db47a05ec36196

wegylexy commented 4 years ago

I look forward to something ready-to-use at 8kHz for denoising human speech.

nafeesmahbub commented 2 years ago

Any update on how to tune the hyperparameters for 8 kHz or 16 kHz?