xiph / rnnoise

Recurrent neural network for audio noise reduction
BSD 3-Clause "New" or "Revised" License

Significant delay in real-time mic input #33

Open loretoparisi opened 6 years ago

loretoparisi commented 6 years ago

@JanX2 @jmvalin When doing real-time denoising from the mic, there is a significant delay of 1-2 seconds. How can I improve this with the current pre-trained model?

jmvalin commented 6 years ago

The code itself causes exactly 10 ms delay. Anything else you see comes from your implementation (most likely how you access the soundcard).

loretoparisi commented 6 years ago

@jmvalin thanks for the details. In this case I'm using the AudioContext in the browser, with your model converted via emscripten.

mbebenita commented 6 years ago

@loretoparisi the online demo purposely inserts a delay to make it easier to listen to the denoised output.

loretoparisi commented 6 years ago

@mbebenita ah, I see! That makes sense then! So it's something on the output buffer?

mbebenita commented 6 years ago

@loretoparisi you can probably tweak the bufferSize in https://people.xiph.org/~jm/demo/rnnoise/record.js

RNNoise is pretty fast; we added the delay in the demo because otherwise it was impossible to hear your own speech being denoised as you were talking.

venkat-kittu commented 5 years ago

@loretoparisi @mbebenita @jmvalin I have tried real-time denoising in Python using the sounddevice library, which gives me a 2.5-second delay. In this setup I am running rnnoise as a subprocess from the Python code.

I am very new to JavaScript; can you please tell me the steps to do this in JavaScript using emscripten?

Thanks in advance

loretoparisi commented 5 years ago

@venkat-kittu I have asked the authors here: https://github.com/xiph/rnnoise/issues/32 Basically they have a compiled emscripten build in the repository, but there are no docs on how to generate it.

venkat-kittu commented 5 years ago

Then how can I do this live in JavaScript or Python?

wegylexy commented 4 years ago

> @jmvalin thanks for the details. I'm trying in this case the AudioContext in the browser and your model converted via emscripten.

When using ScriptProcessorNode, its buffer size needs to be at least 512 (the minimum allowed is 256) to avoid glitches, because it runs on a different thread than the AudioContext. Since 512 > 480, the intrinsic latency becomes 512 + 512 == 1024 samples, i.e. 21.333 ms at 48 kHz. To reduce the latency by 8 ms, use AudioWorkletProcessor, which runs on the same thread as the AudioContext. Currently, its buffer size is fixed at 128. Since the smallest multiple of 128 not less than 480 is 512 again, the intrinsic latency becomes 512 + 128 == 640 samples, i.e. 13.333 ms.

Here is an example in WASM that uses AudioWorkletProcessor when available and falls back to ScriptProcessorNode: https://github.com/wegylexy/rnnoise_wasm/tree/master/src

Demo: https://rnnoise.timtim.hk/demo/

vishaldhull09 commented 3 years ago

@loretoparisi any idea how to convert the rnnoise model for use in the browser? I saw noise.js in their demo, but how are they initializing the weights in it? I want to fine-tune this model and use it there.

wegylexy commented 3 years ago

@vishaldhull09 See https://github.com/wegylexy/rnnoise_wasm . Set your weights in https://github.com/xiph/rnnoise/blob/master/src/rnn_data.c .

vishaldhull09 commented 3 years ago

> @vishaldhull09 See https://github.com/wegylexy/rnnoise_wasm . Set your weights in https://github.com/xiph/rnnoise/blob/master/src/rnn_data.c .

thanks @wegylexy, sorry, I'm a little new to Node and JS, so I have this doubt: as far as I understood, the browser demo uses https://jmvalin.ca/demo/rnnoise/noise.js to load RNNoise. Any idea how this file is created from the saved model? I don't want to use node.js.

wegylexy commented 3 years ago

@vishaldhull09 It has nothing to do with node.js. That's just a module that can be loaded in node.js as well as in the browser. If you prefer C over JS, compile to WASM like I do: https://github.com/wegylexy/rnnoise_wasm/blob/master/src/worklet.c .