webrtc / samples

WebRTC Web demos and samples
https://webrtc.github.io/samples
BSD 3-Clause "New" or "Revised" License

Echo cancellation doesn't work #1243

Open zli18 opened 4 years ago

zli18 commented 4 years ago

Browser affected

Chrome on macOS: Version 78.0.3904.108 (Official Build) (64-bit). Actually, I haven't got it to work in any browser, on either Mac or PC.

Description

I am trying to use this sample for echo cancellation, but it doesn't work. https://webrtc.github.io/samples/src/content/getusermedia/record/

Steps to reproduce

  1. On any laptop, unplug any headphones so that audio plays from the speakers. (I am using a MBP.)
  2. Open https://webrtc.github.io/samples/src/content/getusermedia/record/ in Chrome, and check the "echo cancellation" box.
  3. Start playing any YouTube video in another Chrome tab. Make sure the audio is played from the speakers.
  4. Start recording video using this sample, accepting the permission prompts for camera and microphone.

Expected results

I expect the YouTube audio to be significantly reduced in the recorded video when "echo cancellation" is checked.

Actual results

The audio from YouTube is still recorded and played back at its original volume, regardless of whether "echo cancellation" is checked. I haven't made it work in any browser, on either Mac or PC.

mattemoore commented 4 years ago

+1 on this.

ahmadhassanch commented 4 years ago

+1 on this

rikzin commented 4 years ago

Echo is the reverberation between the microphone input and the speakers, whereas YouTube is an audio source.

Acoustic Echo Cancellation is an algorithm that looks for single-frequency runaway, where the amplitude overdrives the input feedback loop, known as the Larsen effect.

ptesavol commented 4 years ago

> 3. Start playing any YouTube in another Chrome tab. Make sure the audio is played from speaker.

I can confirm this problem; it would be nice to find a solution.

The problem is there even if the video plays in the same tab in a video tag, so it is not caused by the YouTube video running in another tab.

I came across this problem in my own code when recording audio with MediaRecorder while playing video/audio using MSE in the same tab. Could this have to do with the video being played back using MSE? One would assume YouTube uses MSE as well.

stephenlb commented 4 years ago

Also confirmed, using the echoCancellation: true constraint. This is labeled as supported in Chrome, etc. However, it does not reduce echo.
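
For anyone reproducing this, here is a minimal sketch of requesting the constraint and reading back what the browser reports. `MediaStreamTrack.getSettings()` only tells you which setting was applied to the track; as the rest of this thread shows, it says nothing about which playback sources are actually cancelled. The helper names `buildAudioConstraints`, `reportedEchoCancellation` and `openMicrophone` are made up for illustration.

```javascript
// Build getUserMedia constraints with echo cancellation switched on or off.
function buildAudioConstraints(echoCancellation) {
  return { audio: { echoCancellation }, video: false };
}

// Read back what the browser says it applied to an audio track.
// Returns true, false, or undefined if the setting is not reported.
function reportedEchoCancellation(track) {
  const settings = track.getSettings();
  return settings.echoCancellation;
}

// Browser-only usage (needs a page served over HTTPS or from localhost).
async function openMicrophone(echoCancellation) {
  const stream = await navigator.mediaDevices.getUserMedia(
    buildAudioConstraints(echoCancellation)
  );
  const [track] = stream.getAudioTracks();
  console.log('echoCancellation reported as', reportedEchoCancellation(track));
  return stream;
}
```

Calling `openMicrophone(true)` and `openMicrophone(false)` in turn makes it easy to compare recordings while the reported setting is known.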

rikzin commented 4 years ago

Why is it supposed to mute another source's audio?

It was not developed as a feature for subtractive noise cancellation of YouTube.

Echo Cancellation is for the feedback loop between the microphone input and speaker output.

stephenlb commented 4 years ago

> Why is it supposed to mute another source audio? It is not developed as subtractive noise cancellation from YouTube as a feature. Echo Cancellation is for the feedback loop between the microphone input and speaker output.

Agreed. This seems like how it should work, right? Audio from the speakers is sent back through the microphone and echoes on the recipient's device. In a voice conversation, each party hears an echo of their own voice. "Echo Cancellation" should remove the echo.

chrbsg commented 4 years ago

Chrome (both desktop and Android) does not support echo cancellation of any non-WebRTC audio (see e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=687574). The only audio that is cancelled is audio received on the RTCPeerConnection audio track.

Safari (both iOS and macOS) and Android Firefox do cancel non-WebRTC audio. Desktop Firefox does not cancel non-WebRTC audio (at least on my machine).

Update: apparently Firefox does echo cancellation if the microphone and playback are part of the same audio graph at the native sample rate of the audio output. The implicit conversion done by an AudioContext does not count for this purpose, so getting echo cancellation to work fully requires explicitly converting playback samples in JavaScript to match the sample rate of the audio output device before playback, e.g. by compiling Xiph's resample.c to WASM and using that.
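
To make the resampling step concrete, here is a minimal sketch of converting a block of PCM samples to the output device's sample rate before playback. It uses plain linear interpolation, which is a deliberate simplification and not the Xiph resampler mentioned above (that one uses a far better filter). `resampleLinear` is a hypothetical helper; in a browser the target rate would typically be read from an `AudioContext`'s `sampleRate` property.

```javascript
// Resample mono PCM from `fromRate` to `toRate` using linear interpolation.
// Simplification: there is no low-pass filter, so downsampling can alias.
function resampleLinear(input, fromRate, toRate) {
  if (fromRate === toRate) return Float32Array.from(input);
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1); // clamp at the last sample
    const frac = pos - i0;
    output[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return output;
}
```

For example, `resampleLinear(samples, 24000, 48000)` doubles the number of samples by inserting interpolated values between the originals.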

stephenlb commented 4 years ago

Excellent. @chrbsg :+1: Thank you

stephenlb commented 4 years ago

Just thinking that the browser should either not report "echoCancellation" as true or provide a way to distinguish the scope of what it cancels.

GerryWilko commented 4 years ago

Does anybody have a solution to this issue at the moment? It kind of makes WebRTC a bit pointless as a concept if we don't have a proper way to remove echo. I am experiencing severe spikes of screeching when testing our solution with two MacBooks.

Firefox, Safari and Chrome seem to have the same issue.

chrbsg commented 4 years ago

@GerryWilko The original bug report here was about cancelling echo from audio being played in other tabs (Youtube). If that is your use case, then there is no working, cross-platform solution. But echo cancellation works ok for cancelling the audio in calls (in general - there are some Android phones where the hardware or drivers didn't implement it properly). Try comparing https://appr.tc/?debug=loopback&audio=echoCancellation=true and https://appr.tc/?debug=loopback&audio=echoCancellation=false.

Just a thought, but if your test devices are in the same room, it's likely that there will be feedback between them. This won't be cancelled, and is one of the weaknesses of modern video conferencing (imagine several people around the same table entering the same chat at the same time).

GerryWilko commented 4 years ago

@chrbsg thanks for that. I have been looking into our echo cancellation issue, and it appears it was perhaps related to the audio sample rate. It seems we could hack in a fix for the significant screeching we were experiencing by limiting the Opus bitrate to 8000 through manipulating the SDP sent and received by the clients.

Not ideal, as the audio quality suffers, but it solves it for now. My next task is to begin experimenting with the sample rates and work out how I can properly resolve the issue.

I think this was exacerbated by both sides using the same MacBook, which both had very high sample rates selected by default.

@chrbsg apologies if I dropped into the wrong issue. I was struggling to find much out there about this issue, and my initial thought was that it was related to echo cancellation.

Disclaimer: I'm new to much of this stuff so apologies if I am using the wrong terms for things :)
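
The SDP workaround described above could look roughly like the sketch below. The comment does not say which SDP attribute was changed; this assumes the Opus `maxaveragebitrate` format parameter (defined in RFC 7587) and a value in bits per second. `limitOpusBitrate` is a hypothetical helper that only handles the first Opus payload type it finds.

```javascript
// Add or replace maxaveragebitrate on the Opus fmtp line of an SDP blob.
function limitOpusBitrate(sdp, maxAverageBitrate) {
  const lines = sdp.split('\r\n');
  // Find the payload type that the rtpmap line assigns to Opus.
  const rtpmap = lines.find((line) => /^a=rtpmap:\d+ opus\//i.test(line));
  if (!rtpmap) return sdp; // no Opus offered; leave the SDP untouched
  const payloadType = rtpmap.match(/^a=rtpmap:(\d+)/)[1];
  const fmtpPrefix = `a=fmtp:${payloadType} `;
  const param = `maxaveragebitrate=${maxAverageBitrate}`;

  const fmtpIndex = lines.findIndex((line) => line.startsWith(fmtpPrefix));
  if (fmtpIndex === -1) {
    // No fmtp line for Opus yet: add one right after the rtpmap line.
    lines.splice(lines.indexOf(rtpmap) + 1, 0, fmtpPrefix + param);
  } else if (/maxaveragebitrate=\d+/.test(lines[fmtpIndex])) {
    // Parameter already present: overwrite its value.
    lines[fmtpIndex] = lines[fmtpIndex].replace(/maxaveragebitrate=\d+/, param);
  } else {
    lines[fmtpIndex] += ';' + param;
  }
  return lines.join('\r\n');
}
```

The result would be applied to the description's `sdp` string before it is passed on to the peer connection. As noted in the comment, this trades audio quality for stability, so it is a stopgap rather than a fix.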

FullstackJack commented 3 years ago

Since browsers don't seem to implement echoCancellation as cancelling audio coming out of the speakers, does anyone know of source code, white papers, libraries, etc. that explain how to cancel the audio from the speakers (i.e. YouTube) in the JavaScript layer? Is this even possible? In the latest versions of WebRTC, we can get audio from the desktop when requesting displays; perhaps this audio track could be used as the source for cancellation?

It should be noted what Google says about their use of echoCancellation: "An echo canceller tries to remove any sound played out on the speakers from the audio signal that's picked up by the microphone." Does it remove the speaker signal from the mic input or does it remove the mic signal from the speaker? This is so confusing.

https://developers.google.com/web/updates/2017/12/disabling-hardware-noise-suppression
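
On the question of how cancellation works in principle: as the Google quote says, the speaker signal is what gets removed from the microphone input. The textbook building block is an adaptive filter that learns the speaker-to-microphone path from a reference copy of the played signal and subtracts its estimate from the mic signal. The sketch below is a toy normalized LMS (NLMS) filter, not what any browser ships. It assumes you already have a sample-aligned reference signal, which is the hard part in practice, and `nlmsCancel` and its default parameters are illustrative choices only.

```javascript
// Toy NLMS echo canceller: estimates the echo of `reference` contained in
// `mic` and returns the residual (mic minus the estimated echo).
function nlmsCancel(mic, reference, taps = 32, mu = 0.5) {
  const weights = new Float64Array(taps); // estimated echo path
  const history = new Float64Array(taps); // most recent reference samples
  const residual = new Float64Array(mic.length);
  const eps = 1e-8; // avoids division by zero during silence

  for (let n = 0; n < mic.length; n++) {
    // Shift the newest reference sample into the history buffer.
    history.copyWithin(1, 0, taps - 1);
    history[0] = reference[n];

    // Predict the echo and measure the reference power.
    let estimate = 0;
    let power = eps;
    for (let k = 0; k < taps; k++) {
      estimate += weights[k] * history[k];
      power += history[k] * history[k];
    }

    // What is left after subtracting the predicted echo.
    const error = mic[n] - estimate;
    residual[n] = error;

    // Normalized LMS weight update.
    const gain = (mu * error) / power;
    for (let k = 0; k < taps; k++) {
      weights[k] += gain * history[k];
    }
  }
  return residual;
}
```

A production canceller such as libwebrtc's AEC3 also has to handle delay estimation, double talk and non-linear speaker distortion, none of which this sketch attempts.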

chrbsg commented 3 years ago

@FullstackJack it is not possible to read the audio that is being played by other arbitrary tabs in the browser. It would be a security risk if JavaScript code could snoop on a voice call in a completely different tab.

gutmann-dev commented 3 years ago

> @chrbsg thanks for that. I have been looking into our echo cancellation issue and it appears it was perhaps related to the audio sample rate. It seems that the significant screeching we were experiencing could be fixed with a hack: limiting the Opus bitrate to 8000 by manipulating the SDP sent and received by the clients.

Thanks! This is a great comment.

rikzin commented 3 years ago

Echo Cancellation is a function to remove the Larsen effect. Look it up.

There are DSP algorithms that account for delay and reflections in a single source and use phase cancellation.

When there are multiple sound sources (too many tabs playing), the audio is mixed. It is not an echo, but rather a multitude of sounds. If a user is stupid, they won't know to close YouTube, and won't know how to press pause either. If a user is smart, they will play something in other tabs because they want to, so screen sharing and showing a presentation can be a desired feature. Therefore this is not a bug.

The use case of two laptops in a video conference that are not acoustically isolated from one another will not be solved unless AEC is used. Acoustic Echo Cancellation algorithms vary in quality and effectiveness.

In my extensive testing of conference microphones, Digital Signal Processing accompanies the sound sources, and it is possible that certain devices apply the technique in multiple places, such as in a Polycom phone on a SIP trunk and in a sound card.

Reducing the bitrate degrades signal quality, so that is not a workaround.

https://www.qsc.com/search/?tx_qscsearch_search%5Bquery%5D=Aec

Read all 28 hits of that search on the AEC functions of QSC, and your programmatic brain will be informed as to how this problem is addressed at the paid, professional level.

Learn about beamforming arrays of multiple mics, and stop using laptops that don't have AEC features in the same room. Read some marketing jargon about the Nureva™ HDL300 Microphone Mist.

Just shame the offender, and make them plug in headphones, turn down the volume, or leave. You can't code your way out of this pervasive problem; educating users and coders is like trying to boil the ocean.

keichenblat commented 3 years ago

What about echo cancellation of sounds played by <audio> (or even <video>) elements from the very same page where WebRTC is running?

For example: I have an

keval101 commented 3 years ago

Just set video.volume = 0 when you access the camera and again when you start recording. Thank me later! It works for me.
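
A sketch of that suggestion: silence the local preview element so the microphone does not re-record its own playback. Setting `muted` alongside `volume` is an addition here, not something the comment says, and `silencePreview` and the `#preview` element id are hypothetical. This only helps when the sound you want to drop is the preview itself; it does nothing for audio you actually need to hear.

```javascript
// Silence a media element used as a local preview.
function silencePreview(video) {
  video.volume = 0;   // as suggested in the comment above
  video.muted = true; // assumption: also mute, so a later volume change stays silent
  return video;
}

// Browser-only usage, assuming a <video id="preview"> element on the page.
async function startPreview() {
  const video = document.querySelector('#preview');
  silencePreview(video);
  video.srcObject = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
  });
  await video.play();
  return video.srcObject;
}
```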

Llorx commented 2 years ago

> just add video.volume = 0 when access camera and also on start recording, thanks me later ! it works for me

How do you hear the audio if you set the volume to 0?

peterzanetti commented 2 years ago

Here is another valid scenario: I am trying to do transcription (voice-to-text) during a WebRTC call. Works fine except for the fact that there is no AEC outside of the call itself. Despite using getUserMedia() with audio constraints for echoCancellation and noiseSuppression, it has no effect on the transcription, so the remote party's voice coming through the speakers is picked up and transcribed as having come from the recipient instead of the sender (actually both at the same time).

has-n commented 2 years ago

> Update: apparently Firefox does echo cancellation if the microphone and playback are part of the same audio graph at the native frequency of the audio output - the implicit conversion done by an AudioContext does not count for this purpose - so to get echo cancellation working fully requires explicitly converting playback samples to match the sample rate of the audio output device in Javascript before playback, e.g. by compiling Xiph's resample.c to WASM and using that.

@chrbsg sorry for necroing an old thread but wanted to pick your brain. Do you think the WASM method would work for AEC instead of resampling?

chrbsg commented 2 years ago

@has-n Do you mean implementing AEC in WASM? I'm not convinced it would do anything more than the existing desktop Chrome AEC. Your code has no access to audio produced by other tabs, so the microphone will still pick it up and send it to the other side (which was the original problem that this issue was created for - playing YouTube in another tab, and expecting the audio to be cancelled after being picked up by the microphone). Also there's the potential issue that AEC is computationally intensive, at least libwebrtc AEC3 is.

The WASM remixing was specifically recommended by Paul Adenot of Mozilla, to work around the fact that Firefox's AEC only works if the input and output audio are part of the same audio graph, using a single sample rate:

Effectively what I think is happening is that you're ending up setting up two audio graphs inside Firefox by asking for a specific sample-rate on one side and doing a getUserMedia on the other side, and never connecting them. If you try to connect them, there will probably be an error. We're NOT mixing those unrelated graphs yet, so the AEC has a silent reverse stream and it fails completely.

This is being fixed as we speak in multiple ways:

  • We're implementing adaptive resampling and drift compensation to be able to connect unrelated graphs. This will have latency and performance implications. It takes a long time because it's non trivial at the latency figures and quality we need to not regress: Firefox is being used in production by music professionals and we can't justify increasing audio latencies to bring new features.
  • We're investigating using an OS-provided reverse stream (bug is old but the priority has been increased internally because everybody is using webrtc these days) on OSes where it's possible (available and partly implemented on Windows and Linux Desktop).

That said, and without trying to minimize the fact that you're clearly facing a Firefox limitation, it's more efficient (and always will be) to decode and resample the incoming Opus stream in a web worker and to play this out using an AudioWorklet (regardless of our changes). If that's of interest to you, I have production-ready JavaScript code that implements a lot of what you need for this (very liberally licensed, that shouldn't be a problem).

If the default sample-rate of the audio output device you're using is 48000Hz, it should work as is. This might be why you hear it working sometimes. You can check the default sample-rate on about:support, in the Media section.

Is it possible to resample a continuous stream with something like an OfflineAudioContext? The examples I've found only show resampling of fixed length audio buffers.

No, get yourself a resampler and use that, from a worker. AudioBufferSourceNode and OfflineAudioContext are well suited for non-live audio. You want to do playback using an AudioWorkletNode. I'd say, compile this to WASM (this is the resampler we use inside Firefox).

I'd be very interested in seeing your code for the opus decode Web Worker / AudioWorklet - we do want to get decoding done on another thread.

My code is about playback of audio content from a worker, and doesn't decode Opus per se (it seems you have this part done). https://github.com/padenot/ringbuf.js has lots of docs and examples. It works in Chrome stable, and will work in Firefox 78 (released on the 30th of June), but you can try it in Nightly (we're re-enabling SharedArrayBuffer). With this, you can have one end of the ring buffer in the AudioWorklet (basically copy/paste the example; it dequeues audio and plays it out), and the writing end in a worker that does Fetch calls, decodes the Opus, and resamples to the native audio rate.

If you need something working today, you can write a second path for when SharedArrayBuffer is not supported, using postMessage.

(The original problem that I had was to play an Opus stream with precise control over timing, which meant I couldn't just use the browser to play an RTP audio track. I solved this by compiling libopus to WASM, sending the Opus over a WebRTC datachannel, using libopus to decode the Opus frames, and playing out the resulting PCM using Web Audio. The problem was that Chrome does AEC on the WebRTC RTP tracks and not Web Audio. This was solved by adapting Alex Ciarlillo's loopback hack. This is inefficient, especially for Firefox, where the above solution would be better, but it works as a last resort on both Chrome and Firefox.)
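
The worker-to-AudioWorklet handoff described in the quoted advice can be illustrated with a small single-producer, single-consumer ring buffer over shared memory. This is a from-scratch sketch and not the ringbuf.js API; `SampleRing` and its method names are hypothetical. In a real page the two `SharedArrayBuffer`s would be created once and posted to both the worker and the worklet, and browsers only expose `SharedArrayBuffer` on cross-origin isolated pages.

```javascript
// Single-producer, single-consumer ring buffer of float samples.
// One slot is kept empty so that "full" and "empty" can be told apart.
class SampleRing {
  constructor(capacity) {
    this.size = capacity + 1;
    // indices[0] is the read position, indices[1] is the write position.
    this.indices = new Uint32Array(new SharedArrayBuffer(8));
    this.samples = new Float32Array(new SharedArrayBuffer(this.size * 4));
  }

  availableRead() {
    const read = Atomics.load(this.indices, 0);
    const write = Atomics.load(this.indices, 1);
    return (write - read + this.size) % this.size;
  }

  availableWrite() {
    return this.size - 1 - this.availableRead();
  }

  // Producer side (e.g. the decoding worker). Returns samples actually written.
  push(input) {
    const count = Math.min(input.length, this.availableWrite());
    let write = Atomics.load(this.indices, 1);
    for (let i = 0; i < count; i++) {
      this.samples[write] = input[i];
      write = (write + 1) % this.size;
    }
    Atomics.store(this.indices, 1, write); // publish after the data is in place
    return count;
  }

  // Consumer side (e.g. an AudioWorkletProcessor's process() callback).
  // Returns samples read; the rest of `output` is zero-filled so that
  // underruns play as silence.
  pop(output) {
    const count = Math.min(output.length, this.availableRead());
    let read = Atomics.load(this.indices, 0);
    for (let i = 0; i < count; i++) {
      output[i] = this.samples[read];
      read = (read + 1) % this.size;
    }
    output.fill(0, count);
    Atomics.store(this.indices, 0, read);
    return count;
  }
}
```

Neither side blocks or allocates, which matters because the audio callback has a hard real-time deadline.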

rafalsk commented 2 years ago

> @chrbsg sorry for necroing an old thread but wanted to pick your brain. Do you think the WASM method would work for AEC instead of resampling?

(..) noise cancellation for WebRTC streams used to work fine for us, but it no longer does in recent versions of Chromium.

phsultan commented 2 years ago

This Chrome flag seems to address the issue of extending the scope of audio sources for AEC: chrome://flags/#chrome-wide-echo-cancellation

Run WebRTC capture audio processing in the audio process instead of the renderer processes, thereby cancelling echoes from more audio sources. – Mac, Windows, Linux, Lacros

Testing a similar use case to yours, @peterzanetti: a WebRTC call with the audio from participants being processed by Chrome's Web Speech API (where I assume AEC is activated). Without the flag enabled, audio from remote participants is fed back to the local Web Speech API, which results in two transcripts for the same audio from two participants. This is fixed if the flag is enabled.

peterzanetti commented 2 years ago

That's very interesting, I'm going to test this. Strange that this flag even exists, and isn't enabled by default. I can't imagine the value of it being disabled.

peterzanetti commented 2 years ago

Tested this today and unfortunately it did not work for me.

theicfire commented 2 years ago

FWIW, using chrome://flags/#chrome-wide-echo-cancellation does work for me. I'm on macOS. This Chromium issue seems related: https://bugs.chromium.org/p/chromium/issues/detail?id=1215049

theicfire commented 2 years ago

This is perhaps a naive question, but how does one test whether echo cancellation is working at all without having two computers? I mean even for the cases that browsers do support.

There's a great Firefox blog post that links to this fiddle. In Firefox, the echo cancellation modification is significant. It infrequently gets caught in a feedback cycle. In Chrome, feedback cycles happen constantly.

Note that the checkboxes don't work in Chrome b/c Chrome doesn't support the applyConstraints API. But by default echo cancellation is on, so I would think it could work as well as Firefox.

But maybe audio feedback cycles are different than echo cancellation?

Here's another example with echoCancellation on. It works far better on Firefox and seems to have no effect on Chrome.

peterzanetti commented 2 years ago

I tested it in my video conferencing app using 2 computers using Chrome in different locations. Aside from video and audio, there is transcription being done with web speech API.

The simple way to test whether this flag "works" is to make sure the audio input and output are separate devices (not a headset or a hardware device with its own echo cancellation), such as a webcam's mic plus computer speakers. With this configuration, if there is no native AEC happening, the transcript gets messed up because User A's microphone picks up User B's voice coming through the speakers and transcribes the voice that it hears. The effect is that User B's speech is transcribed twice: once from User B's mic and once from User A's mic. The only way to avoid this would be for AEC to actually cancel the audio that the mic picks up from the computer speakers, just as it does for WebRTC audio.

If there was no AEC for the WebRTC audio channel, then it would be worse of course because the users would hear echo. But of course there is AEC for the WebRTC audio.

theicfire commented 2 years ago

Yeah I was specifically curious how to test it without having two computers.

But indeed, with two computers I found a random app to use: https://p2p.mirotalk.com. The source is not webpack'd or anything, so it's easy to go into the source and change echoCancellation to false (using Chrome local overrides)

sungongwei commented 2 years ago

> This Chrome flag seems to address the issue of extending the scope of audio sources for AEC: chrome://flags/#chrome-wide-echo-cancellation

Works for me.

peterzanetti commented 2 years ago

> This Chrome flag seems to address the issue of extending the scope of audio sources for AEC: chrome://flags/#chrome-wide-echo-cancellation

Is there anything special you had to do to get it working? I don't understand why it doesn't work for me.

ianido commented 2 years ago

@peterzanetti did you find a way to cancel incoming audio? I am doing conversational transcription and am having the same issue. The flag chrome://flags/#chrome-wide-echo-cancellation works if my audio is running in another Chrome tab, but I need to use an iPad, and I have no idea how to disable this on an iPad.

peterzanetti commented 2 years ago

No, I have yet to see this flag have any impact on my described use case.

fmonterogit commented 1 year ago

> just add video.volume = 0 when access camera and also on start recording, thanks me later ! it works for me

Simple and great idea. If you are recording a video of yourself, the volume doesn't matter.

tiennguyen1293 commented 1 year ago

So interested!