WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Error on createMediaElementSource() and captureStream() on cross-origin resources which play normally in browser #2547

Closed Quantizr closed 1 year ago

Quantizr commented 1 year ago

Edit: If future me ever comes back and asks me why I created this issue, I was hyperfixated on finding workarounds to Chrome's deprecation of the tabCapture API with manifest v3 browser extensions, and the best workaround seemed to be injecting a script and doing necessary processing through the WebAudio API. However, this only worked for websites which loaded content from the same origin, which meant that the workaround would work only 90% of the time, leading to this issue and me ignoring security risks. It's still odd to me how a well-funded attacker could just create a proxy to bypass the no CORS policy issue while a legitimate use case without funding for bandwidth is unable to do anything.

Describe the issue

createMediaElementSource() and captureStream() only work on a cross-origin resource when the resource has an 'Access-Control-Allow-Origin' header and the HTMLMediaElement uses a crossOrigin attribute.

This means that the two functions will NOT work on any HTMLMediaElement without a crossOrigin attribute, even when the CORS policy allows for cross-origin access or there is no CORS policy.

For example, if a cross-origin resource has a header of access-control-allow-origin: * which should allow all traffic, a crossOrigin attribute should not be necessary to access the resource. Indeed, when not using createMediaElementSource(), such an element will play audio properly. However when using createMediaElementSource() or captureStream(), the error "MediaElementAudioSource outputs zeroes due to CORS access restrictions" appears.

An even more problematic case is when the cross-origin resource has no access-control-allow-origin header at all. Normally, the element will load and play audio properly; however, when createMediaElementSource() or captureStream() is used, the error "MediaElementAudioSource outputs zeroes due to CORS access restrictions" appears. This is especially problematic because, with no access-control-allow-origin header, specifying a crossOrigin attribute results in the error "No 'Access-Control-Allow-Origin' header is present on the requested resource".
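To make the cases above easier to compare, here is a small sketch that models them as a decision function. It is only an illustration of the behaviour described in this issue, not a browser API: the function name classifyMediaAccess, its parameter names, and the example URLs are all hypothetical, and it only covers the "anonymous" CORS mode (credentialed requests have stricter rules that are not modelled here).

```javascript
// Hypothetical model of the outcomes described in this issue.
// Not a browser API; it only mirrors the cases listed above.
function classifyMediaAccess({
  pageOrigin,               // e.g. "https://app.example"
  resourceUrl,              // absolute URL of the media resource
  crossOriginAttr = null,   // null (attribute absent) or "anonymous"
  allowOriginHeader = null, // value of Access-Control-Allow-Origin, or null
}) {
  const resourceOrigin = new URL(resourceUrl).origin;

  // Same-origin media is always playable and readable by the page.
  if (resourceOrigin === pageOrigin) {
    return { plays: true, readable: true, reason: "same-origin" };
  }

  // Without a crossOrigin attribute the element plays for the user,
  // but createMediaElementSource() / captureStream() yield silence,
  // regardless of what the response headers say.
  if (crossOriginAttr === null) {
    return { plays: true, readable: false, reason: "no crossOrigin attribute" };
  }

  // With crossOrigin set, the response must opt in via the header.
  const allowed =
    allowOriginHeader === "*" || allowOriginHeader === pageOrigin;
  if (allowed) {
    return { plays: true, readable: true, reason: "CORS check passed" };
  }

  // crossOrigin set but no matching header: the load itself fails.
  return { plays: false, readable: false, reason: "CORS check failed" };
}
```

For instance, calling it with allowOriginHeader set to "*" but no crossOriginAttr gives plays: true and readable: false, which is the first case complained about above.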

If a cross-origin resource is able to play audio to the user normally, the WebAudio API should be able to create a MediaStreamAudioSourceNode or MediaStream from that audio.

Where Is It

The problem comes from here: https://www.w3.org/TR/mediacapture-fromelement/#security-considerations

Media elements can render media resources from origins that differ from the origin of the media element. In those cases, the contents of the resulting MediaStreamTrack MUST be protected from access by the document origin.

Specifically the example:

attempting to create a Web Audio MediaStreamAudioSourceNode [WEBAUDIO] succeeds, but produces no information to the document origin (that is, only silence is transmitted into the audio context)

This makes no sense. Why would only silence be transmitted into the audio context when the actual audio can be output to the user?

Additional Information

As an example of why this would be useful, let's say we wanted to make something like Mozilla's example Voice-change-O-matic website, except with the input as a link to video/audio instead of microphone input. For cross-origin links where we can normally play the video/audio in a normal HTMLMediaElement, such as this lecture, we would now only get silence as the output. An additional, slightly related grievance is that, for the above example, one would have to create two different video elements, one with a crossOrigin attribute and one without, as there is no way to automatically apply a crossOrigin attribute for a site with an access-control-allow-origin header and drop it when the header isn't there.

padenot commented 1 year ago

If a cross-origin resource is able to play audio to the user normally, the WebAudio API should be able to create a MediaStreamAudioSourceNode or MediaStream from that audio.

This is absolutely not possible nor desirable. You cannot inspect the content of a resource unless your origin is allowed to read it, either because it's on the same origin or because CORS headers or whatnot have been specified.

When playing back audio and/or video using a media element, the content of the resource isn't accessible to the page: it's audible / visible to the user, but e.g. the JavaScript code cannot access the waveform / pixels. If you tried to draw the video on a canvas in an attempt to read back the bytes, you'd see that you only get black pixels. The same goes for cross-origin images, and the same goes for audio.
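As a rough way to observe this from page code, one could pull a block of time-domain samples out of an AnalyserNode and check whether every sample is zero. The sketch below is only an illustration: isAllZero and audioEl are hypothetical names, the Web Audio part is left as a comment because it only runs in a browser, and an all-zero block can also simply mean the media is silent or paused at that moment.

```javascript
// Returns true when every sample in the block is exactly zero.
// An empty block is also reported as all-zero.
function isAllZero(samples) {
  for (let i = 0; i < samples.length; i++) {
    if (samples[i] !== 0) return false;
  }
  return true;
}

// Browser-only usage sketch (assumes `audioEl` is an <audio> element
// that is currently playing):
//
//   const ctx = new AudioContext();
//   const source = ctx.createMediaElementSource(audioEl);
//   const analyser = ctx.createAnalyser();
//   source.connect(analyser);
//   analyser.connect(ctx.destination);
//
//   const block = new Float32Array(analyser.fftSize);
//   analyser.getFloatTimeDomainData(block);
//   console.log("all zero:", isAllZero(block));
```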

https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy explains why this is the case.