xiph / rnnoise

Recurrent neural network for audio noise reduction
BSD 3-Clause "New" or "Revised" License

How to train with large dataset #196

Open Bach1502 opened 2 years ago

Bach1502 commented 2 years ago

Hello, I believe this is a fairly simple question, but since I'm very new to ML in general, it still baffles me. I just followed the training instructions and have successfully trained my model on one pair of data (a clean speech.wav and a noise.wav). Now I want to ask how I can repeat this process for a larger dataset. I currently have a set of data with 300 files for these two categories, and I don't think repeating this process 300 times is the way to go.

Thanks.

Zadagu commented 2 years ago

Just concatenate the audio files. But be aware that the input format is not .wav; it's plain PCM without any header.

Bach1502 commented 2 years ago

Thank you, I will try it and see if it works.

ZihCode commented 1 year ago

I want to know how to concatenate the audio files. Did you use any useful tools, or did you just copy the RAW files and paste them into one file? How can I get one long RAW file? I would be very grateful if you could help me.

Zadagu commented 1 year ago

I wrote a Python script to concatenate the files. For reading audio files I used the soundfile package, and I resampled where needed using scipy.
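
For reference, a minimal sketch of such a script might look like the one below. It assumes the training tools expect headerless 16-bit mono PCM at 48 kHz (check the training README for the exact format your version uses); the file patterns, output names, and the `concatenate_to_raw` helper are only illustrative.

```python
# Minimal sketch (not the original script): concatenate a folder of audio
# files into one headerless 16-bit PCM stream. Assumes 48 kHz mono is the
# format the rnnoise training tools expect -- verify against the README.
import glob

import numpy as np
import soundfile as sf
from scipy.signal import resample_poly

TARGET_RATE = 48000  # assumed target sample rate


def concatenate_to_raw(pattern, out_path):
    with open(out_path, "wb") as out:
        for path in sorted(glob.glob(pattern)):
            audio, rate = sf.read(path, dtype="float32")
            if audio.ndim > 1:            # downmix multi-channel to mono
                audio = audio.mean(axis=1)
            if rate != TARGET_RATE:       # resample to the target rate
                audio = resample_poly(audio, TARGET_RATE, rate)
            # scale to 16-bit signed integers and append without any header
            pcm = np.clip(audio, -1.0, 1.0) * 32767
            pcm.astype("<i2").tofile(out)


concatenate_to_raw("clean/*.wav", "clean_all.raw")
concatenate_to_raw("noise/*.wav", "noise_all.raw")
```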

Zadagu commented 1 year ago

Sorry, but I think your behavior in the GitHub issues is somewhat inappropriate. You posted the very same question three times across multiple issues:

https://github.com/xiph/rnnoise/issues/208
https://github.com/xiph/rnnoise/issues/201#issuecomment-1209169089
https://github.com/xiph/rnnoise/issues/196

You can answer your question yourself by reading the rnnoise paper and newer speech enhancement papers. They all report how much data they use.