steveseguin / vdo.ninja

VDO.Ninja is a powerful tool that lets you bring remote video feeds into OBS or other studio software via WebRTC.
https://vdo.ninja

PCM + tuned AEC? #492

Open briggeml opened 3 years ago

briggeml commented 3 years ago

Is it possible now (or, if not, is it planned) to use a custom audio codec, ideally uncompressed PCM? How about more precise AEC control, something beyond &aec, &autogain, and &denoise?

steveseguin commented 3 years ago

When you mention more precise AEC, what do you mean? I've begun work on a custom Chromium browser that I hope will let me do things that are not possible otherwise.

You can take a look at the current AEC code and see if there are things you'd want control over: https://chromium.googlesource.com/external/webrtc/+/3f08dc656dc22edf658a8393b5b03a46b23aa4e8/webrtc/modules/audio_processing/aec/echo_cancellation.cc

Browsers support the following audio codecs (https://developer.mozilla.org/en-US/docs/Web/Media/Formats/WebRTC_codecs): Opus, G.711 PCM (A-law), G.711 PCM (µ-law), G.722, iLBC, iSAC

I can try to add support for one of these if you can review them and decide which one. Please note that PCM over WebRTC might have limits that prevent it from being any good. Is there a reason Opus isn't working for you?

briggeml commented 3 years ago

> When you mention more precise AEC, what do you mean? I've begun work on a custom Chromium browser that I hope will let me do things that are not possible otherwise.

I am talking about the ability to set more parameters that affect the end result. Right now, WebRTC echo cancellation works reasonably well only with cardioid mics aimed in a narrow direction and a rather specific mic arrangement (in the case of music performance). In any case, the difference compared with hardware AEC solutions is huge.

> You can take a look at the current AEC code and see if there are things you'd want control over: https://chromium.googlesource.com/external/webrtc/+/3f08dc656dc22edf658a8393b5b03a46b23aa4e8/webrtc/modules/audio_processing/aec/echo_cancellation.cc

It's hard to say from the code alone; I'm not a C expert. At the least, we could try changing some of the static int variables or some of the "defines" in aec_core.h. It would also be nice to evaluate AEC quality by more than ear, perhaps by displaying metrics such as ERL, ERLE, and A_NLP.
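To make the metrics idea concrete: ERL and ERLE are conventionally expressed as power ratios in dB. The sketch below is only an illustration of those definitions, not code from WebRTC or VDO.Ninja; the function names are hypothetical, and it assumes you already have time-aligned sample blocks captured while only the far end is talking.

```typescript
// Mean power (mean of squared samples) of one block of audio.
function meanPower(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  return samples.reduce((acc, s) => acc + s * s, 0) / samples.length;
}

// Ratio of two block powers in dB; NaN when either block is silent.
function powerRatioDb(numerator: Float32Array, denominator: Float32Array): number {
  const pNum = meanPower(numerator);
  const pDen = meanPower(denominator);
  if (pNum === 0 || pDen === 0) return NaN;
  return (10 * Math.log(pNum / pDen)) / Math.LN10;
}

// ERL: far-end (loudspeaker) signal power over the echo power picked up by the mic.
function erlDb(farEnd: Float32Array, micInput: Float32Array): number {
  return powerRatioDb(farEnd, micInput);
}

// ERLE: echo power at the mic over the residual power after the canceller.
function erleDb(micInput: Float32Array, aecOutput: Float32Array): number {
  return powerRatioDb(micInput, aecOutput);
}
```

Higher ERLE means more echo was removed. The numbers are only meaningful during far-end single talk, since any near-end speech or music counts as "residual" in this simple form.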

> Browsers support the following audio codecs (https://developer.mozilla.org/en-US/docs/Web/Media/Formats/WebRTC_codecs): Opus, G.711 PCM (A-law), G.711 PCM (µ-law), G.722, iLBC, iSAC

Unfortunately, as I understand it, there is no support for LPCM. So is a custom Chromium build needed?

> I can try to add support for one of these if you can review them and decide which one. Please note that PCM over WebRTC might have limits that prevent it from being any good. Is there a reason Opus isn't working for you?

The quality of the Opus codec is often insufficient for WebRTC sessions such as music lessons, and for subsequent post-production work on the recordings. If PCM is too bandwidth-intensive, another lossless codec (FLAC or WavPack?) could be used. There is a paper on adapting WavPack for real-time audio applications: https://www.ibr.cs.tu-bs.de/users/kurtisi/public/papers/WavPack_icme08.pdf
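For a sense of scale, the raw payload bitrate of uncompressed PCM is just sample rate × bit depth × channels. The helper below is a back-of-the-envelope sketch with a hypothetical name; it ignores RTP/UDP/IP overhead, and the 128 kbit/s Opus figure in the usage note is only an assumed comparison point, not a VDO.Ninja default.

```typescript
// Raw PCM payload bitrate in kbit/s (1 kbit = 1000 bits), excluding packet overhead.
function pcmBitrateKbps(sampleRateHz: number, bitsPerSample: number, channels: number): number {
  return (sampleRateHz * bitsPerSample * channels) / 1000;
}

// How many times larger a PCM stream is than a compressed stream at a given bitrate.
function bitrateRatio(pcmKbps: number, compressedKbps: number): number {
  return pcmKbps / compressedKbps;
}
```

Stereo 16-bit PCM at 48 kHz comes to `pcmBitrateKbps(48000, 16, 2)`, which is 1536 kbit/s, or 12 times an assumed 128 kbit/s Opus stream. Stereo 24-bit at 48 kHz comes to 2304 kbit/s.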

steveseguin commented 3 years ago

I've created another ticket for this in the Electron Capture app: https://github.com/steveseguin/electroncapture/issues/36

This is likely going to be lower priority if I can't do it via JavaScript and need to do it at a lower level, in C++. There are just more important things to worry about there first, like hardware encoding.

guest271314 commented 3 years ago

If I understand the requirement correctly, a MediaStreamTrack of kind audio can be connected to an AudioWorklet node, or a MediaStreamTrackProcessor can be used on Chromium, to get the raw PCM of a live stream.
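Once raw samples are available, they usually arrive as 32-bit floats: an AudioWorkletProcessor, for example, receives each channel as a Float32Array with samples nominally in [-1, 1]. A minimal sketch of turning one such block into 16-bit PCM follows. The function name is hypothetical and the capture wiring itself is left out, so this shows only the sample conversion step.

```typescript
// Convert float samples (nominally in [-1, 1]) to signed 16-bit PCM.
// Out-of-range values are clamped rather than allowed to wrap around.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  samples.forEach((s, i) => {
    const clamped = Math.max(-1, Math.min(1, s));
    // Negative values scale to -32768, positive values to +32767.
    out[i] = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  });
  return out;
}
```

Keeping the samples as 32-bit floats avoids this quantization step entirely, at twice the bandwidth of 16-bit PCM.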

steveseguin commented 3 years ago

I found this repo recently, https://github.com/steveseguin/microphone-stream, by a certain someone. ahem. :) I also found another way to stream PCM: instead of streaming chunked audio/video to disk like I currently do, I can stream it over data channels in chunks. It's not the lowest-latency option, though.

Thank you for the comment
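As a rough illustration of the chunked data-channel idea, the sketch below splits a PCM byte buffer into messages small enough to send one at a time and rebuilds the buffer on the other side. Everything here is an assumption for illustration: the function names, the 4-byte big-endian sequence header, and the payload size are not taken from VDO.Ninja, and the actual send call on the data channel is left out.

```typescript
const HEADER_BYTES = 4; // hypothetical header: one 32-bit sequence number

// Read the sequence number from the start of a message.
function seqOf(message: Uint8Array): number {
  return new DataView(message.buffer, message.byteOffset, message.byteLength).getUint32(0, false);
}

// Split PCM bytes into messages of at most maxPayloadBytes of audio each,
// prefixing every message with its sequence number.
function chunkPcm(pcm: Uint8Array, maxPayloadBytes: number): Uint8Array[] {
  if (maxPayloadBytes <= 0) throw new Error("maxPayloadBytes must be positive");
  const messages: Uint8Array[] = [];
  for (let offset = 0, seq = 0; offset < pcm.length; offset += maxPayloadBytes, seq++) {
    const payload = pcm.subarray(offset, Math.min(offset + maxPayloadBytes, pcm.length));
    const message = new Uint8Array(HEADER_BYTES + payload.length);
    new DataView(message.buffer).setUint32(0, seq, false); // big-endian
    message.set(payload, HEADER_BYTES);
    messages.push(message);
  }
  return messages;
}

// Rebuild the PCM bytes from messages that may have arrived out of order.
function reassemblePcm(messages: Uint8Array[]): Uint8Array {
  const sorted = messages.slice().sort((a, b) => seqOf(a) - seqOf(b));
  const total = sorted.reduce((n, m) => n + m.length - HEADER_BYTES, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  sorted.forEach((m) => {
    out.set(m.subarray(HEADER_BYTES), offset);
    offset += m.length - HEADER_BYTES;
  });
  return out;
}
```

Smaller chunks reduce the delay before the first audio can be played but add per-message overhead, which is one reason this approach trades latency for simplicity.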