xiph / rnnoise

Recurrent neural network for audio noise reduction
BSD 3-Clause "New" or "Revised" License

5000000*87, what does (87-42) mean? #152

Open maggie0830 opened 3 years ago

maggie0830 commented 3 years ago

As we know, denoise.c extracts 42 features, but when we run "denoise_training speech.pcm noise.pcm" we get a 5000000*87 feature matrix. The 42 features are included in that matrix, so what do the remaining (87-42) columns mean? Thanks.

maggie0830 commented 3 years ago

42 extracted features + 22 expected gains + 22 noise log-spectrum values + 1 VAD = 87, which is why we get a 5000000*87 matrix.
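For anyone who wants to check this breakdown against their own output, here is a minimal Python sketch that slices each 87-value frame into those four groups. It assumes the columns appear in the order listed above and that the file holds raw little-endian float32 values. The names `NB_FEATURES`, `NB_BANDS`, `split_frame` and `unpack_frames` are chosen for illustration only and are not taken from the training scripts.

```python
import struct

NB_FEATURES = 42  # input features per frame (assumed name)
NB_BANDS = 22     # bands used for the gains and the noise spectrum (assumed name)
FRAME_WIDTH = NB_FEATURES + 2 * NB_BANDS + 1  # 42 + 22 + 22 + 1 = 87


def split_frame(row):
    """Split one 87-value frame into (features, gains, noise_log_spectrum, vad)."""
    if len(row) != FRAME_WIDTH:
        raise ValueError(f"expected {FRAME_WIDTH} values, got {len(row)}")
    features = row[:NB_FEATURES]
    gains = row[NB_FEATURES:NB_FEATURES + NB_BANDS]
    noise = row[NB_FEATURES + NB_BANDS:NB_FEATURES + 2 * NB_BANDS]
    vad = row[-1]
    return features, gains, noise, vad


def unpack_frames(raw):
    """Decode raw float32 bytes into a list of 87-value frames.

    Assumes little-endian float32; trailing bytes that do not fill
    a whole frame are ignored.
    """
    frame_bytes = FRAME_WIDTH * 4
    usable = len(raw) - len(raw) % frame_bytes
    return [
        list(struct.unpack_from(f"<{FRAME_WIDTH}f", raw, offset))
        for offset in range(0, usable, frame_bytes)
    ]
```

Under the same float32 assumption, a 5000000*87 matrix takes 5000000 * 87 * 4 = 1,740,000,000 bytes, so comparing that figure with the size of the output file is a quick sanity check. For a file that large it would be better to read it in chunks than to pass the whole content to `unpack_frames` at once.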

mysteryjeans commented 3 years ago

There isn't much I can find on training. I want to train for 8 kHz narrowband, and I'm interested in training on the MS-SNSD dataset combined with the various noise datasets shared in the demo.

Can you help with the following questions?

  1. Do I need to downsample all the datasets to 8 kHz, or do I just need to downsample them to match the 16 kHz samples in MS-SNSD?
  2. Should the signal.raw mentioned in TRAINING-README contain noise, or should I use the clean audio files provided in the MS-SNSD clean_train folder as is?
  3. denoise_training takes only one signal.raw and one noise.raw, so how can I run it on multiple files, given that training.f32 is overwritten every time? Do I need to combine all the clean audio files into one file and all the noise files into another?

RXAldreezee commented 3 years ago

Hi, what is the function of the denoise_training file? I can't seem to open it to check what it does. Thanks.