WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/
Other
1.04k stars 166 forks source link

The behavior is not defined for "A MediaStream added to an AudioContext with the different sample rate" case #2322

Closed apatrc closed 3 years ago

apatrc commented 3 years ago

Describe the issue

The behavior of the user agent is not described for the case:

  1. AudioConext created with sample rate = X.
  2. A MediaStream captured/received with sample rate = Y.
  3. A MediaStreamAudioSourceNode created inside the AudioContext
  4. Media for the MediaStreamAudioSourceNode received from the MediaStream.

Diagram:

MediaStream[sampleRate=X] -> AudioContext[sampleRate=Y] ( MediaStreamAudioSourceNode )

Two options: User agent may not allow this, since sample rate of the AudioContext is different from sample rate of the MediaStream. OR User agent may resample MediaStream to the sample rate of the AudioContext.

Where Is It I would expect clarification under description of MediaStreamAudioSourceNode: https://www.w3.org/TR/webaudio/#mediastreamaudiosourcenode

Additional Information

Current behavior of browsers is different:

Chromium based: Resamples MediaStream to the sample rate of the AudioContext.

Ticket where Chromium developers discovered the issue (broken sound) and fixed it by introducing resampling: https://bugs.chromium.org/p/chromium/issues/detail?id=945305

FireFox: Throws: "DOMException: AudioContext.createMediaStreamSource: Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported."

Minimum Example:

<html lang="en">
<head>
<meta content="text/html;charset=utf-8" http-equiv="Content-Type">
</head>
<body>
<script>
    (async ()=>{

        //We assume that stream will be captured with 44.1Khz or 48Khz
        const stream = await navigator.mediaDevices.getUserMedia({
            audio: true,
            video: false
        });

        //We create audio context with different sample rate
        const audioCtx = new AudioContext({sampleRate: 8000});

        //Minumum pipeline
        //MediaStream[48Khz]->AudioContext[8Khz]{ MediaStreamSource -> MediaStreamDestination }
        const audioSourceNode = audioCtx.createMediaStreamSource(stream);
        const destinationNode = audioCtx.createMediaStreamDestination();
        audioSourceNode.connect(audioCtx.destination);
        audioSourceNode.connect(destinationNode);
    })().catch(e=>console.error(e));
</script>
</body>
</html>

In this example, we capture a MediaStream from the default microphone, which normally gives a 48 kHz stream. Then we add it to the AudioContext, where we connect it to the destination. The AudioContext is created with a sampling rate of 8 kHz. In Chromium you will hear a resampled 8 kHz stream. In FireFox you will get an error in the browser console.

Proposal User agent MUST resample in this scenario (same as Chromium does).

rtoy commented 3 years ago

Thanks for reporting this. I think resampling makes the most sense, just like how decodeAudioData will resample the decoded audio file to the context sample rate.

rtoy commented 3 years ago

May also want to consider doing this to v1 instead.

padenot commented 3 years ago

In FireFox you will get an error in the browser console.

FWIW, this is currently being implemented in Firefox, we are well aware that this is problematic. Thanks for the report!

rtoy commented 3 years ago

Teleconf: We should spec this in v1. Moving issue to v1.

svgeesus commented 3 years ago

So ... is the edit going to be done in time for the Proposed Recommendation next week?

Otherwise, it would become the first Candidate Addition on the new Recommendation :)

svgeesus commented 3 years ago

As we already do resampling, I think we just forgot to say so in this case. If it were not for the Firefox bug on this, which as @padenot says is being fixed, it would just be a trivial editorial. Even so, I would argue it is not substantive.

rtoy commented 3 years ago

I can have a PR up today for someone to review. We just need to add a sentence somewhere that resampling must be done.

rtoy commented 3 years ago

Does this also mean MediaStreamTrackAudioSourceNode and MediaElementSourceNode also need to specify resampling? I would think so, but I haven't tried it.