WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

A way to a) detect if `MediaElementAudioSourceNode` is CORS-restricted & b) revert `createMediaElementSource` #2453

Open WofWca opened 2 years ago

WofWca commented 2 years ago

Describe the feature

Perhaps for a) a boolean property and an event emission (in case e.g. element.currentSrc changed from same-origin to cross-origin or vice versa) would do.

b) is needed in order to not "break" the element whose src is CORS-restricted so at least it is possible for the user to hear its sound. It also can come in handy if you decided to switch to a different AudioContext (e.g. you don't like sampleRate or latencyHint of the current one).

Or perhaps createMediaElementSource needs to be replaced with something that doesn't cause this trouble.

Is there a prototype? No.

Describe the feature in more detail

For context, I'm developing an extension which calls AudioContext.createMediaElementSource for every media element on every page. As we know, cross-origin media playback is allowed, but MediaElementAudioSourceNode outputs zeroes for such media (https://webaudio.github.io/web-audio-api/#MediaElementAudioSourceOptions-security). This means that calling createMediaElementSource on such media simply mutes it, with no conservative way (that is, without re-creating the element or something like that) to revert it, which (at least in my case) is worse than not calling the method at all. Some websites fetch media data from a different origin (even a different subdomain is a different origin; see, for example, Zoom recordings), but the Access-Control-Allow-Origin header is not always applied to such media responses, which makes createMediaElementSource (and therefore my extension) not only useless but even harmful.
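As a rough illustration of the origin comparison described above, here is a minimal sketch. The helper name `isCrossOrigin` is hypothetical and not part of any API. It only compares origins, so it is a necessary but not sufficient signal: cross-origin media served with suitable CORS headers to an element that has the crossorigin attribute is not restricted, and redirects can change the effective origin after the check.

```javascript
// Hypothetical helper (not part of the Web Audio API): reports whether a
// media URL has a different origin than the page that embeds it.
function isCrossOrigin(mediaUrl, pageUrl) {
  // Resolve relative URLs such as "/clip.webm" against the page URL.
  const media = new URL(mediaUrl, pageUrl);

  // Assumption: data: URLs are treated as same-origin content here, even
  // though URL.origin reports the opaque origin "null" for them.
  if (media.protocol === 'data:') {
    return false;
  }

  return media.origin !== new URL(pageUrl).origin;
}
```

For example, `isCrossOrigin('https://media.example.com/a.webm', 'https://example.com/page')` reports a different origin even though only the subdomain differs.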

In a perfect world it would also be nice if createMediaElementSource always just worked for browser extensions somehow. In that case the issue may not lie solely within the scope of the Web Audio API (if so, I would appreciate it if someone gave me directions on how to proceed).

And FYI I'm not the greatest web guru, I may just be missing something.

WofWca commented 2 years ago

Turns out the "b)" part is a duplicate of https://github.com/WebAudio/web-audio-api/issues/1202, sorry. I'll write some comments there. But a) holds anyway.

guest271314 commented 2 years ago

For context, I'm developing an extension which calls AudioContext.createMediaElementSource for every media element on every page.

In a perfect world it would also be nice if createMediaElementSource always just worked for browser extensions somehow.

What is the purpose of creating a MediaElementSourceNode for every element on the page?

What are you trying to achieve?

guest271314 commented 2 years ago

In a perfect world it would also be nice if createMediaElementSource always just worked for browser extensions somehow.

If you are using a browser extension you can fetch resources outside of CORS and CSP restrictions at a page listed in "web_accessible_resources", in ServiceWorker, or using a native application, send the data to the main page using messaging or transferable objects, then set the src of the HTMLMediaElement to a Blob URL in the document.

guest271314 commented 2 years ago

To check whether the HTMLAudioElement's source is a cross-origin resource served without an Access-Control-Allow-Origin header listing the requesting origin, set the crossorigin attribute on the <audio> element, then observe both the loadedmetadata event (fired when the header is set) and the error event (fired when the header is not set).

<!DOCTYPE html>

<html>
  <head> </head>

  <body>
    <audio
      src="http://ftp.nluug.nl/pub/graphics/blender/demo/movies/ToS/tears_of_steel_surround.webm#t=10,20"
      controls
      crossorigin=""
    ></audio>
    <audio
      src="https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm#t=10,20"
      controls
      crossorigin=""
    ></audio>
    <script>
      for (const mediaElement of document.querySelectorAll('audio')) {
        mediaElement.onerror = (e) => {
          console.error(e.type, e.target.src);
        };
        mediaElement.onloadedmetadata = (e) => {
          console.log(e);
          const ac = new AudioContext();
          const source = new MediaElementAudioSourceNode(ac, { mediaElement });
          console.log(source);
          source.connect(ac.destination);
        };
      }
    </script>
  </body>
</html>
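
The same idea can be sketched without overwriting the page's own onerror or onloadedmetadata handlers. This is a minimal sketch: `watchLoadOutcome` is a hypothetical name, and the result is labeled 'failed' rather than 'CORS-blocked' because an error event can fire for reasons other than CORS.

```javascript
// Hypothetical helper: reports once whether a media element reached
// "loadedmetadata" or fired "error". It uses addEventListener so that any
// handlers the page assigned via onerror / onloadedmetadata stay intact.
function watchLoadOutcome(mediaElement, onResult) {
  const settle = (result) => {
    // Report only the first outcome, then stop listening.
    mediaElement.removeEventListener('loadedmetadata', onLoaded);
    mediaElement.removeEventListener('error', onError);
    onResult(result);
  };
  const onLoaded = () => settle('loaded');
  const onError = () => settle('failed');

  mediaElement.addEventListener('loadedmetadata', onLoaded);
  mediaElement.addEventListener('error', onError);
}
```

A caller would create the MediaElementAudioSourceNode only after receiving 'loaded' for an element that has the crossorigin attribute set.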

WofWca commented 2 years ago

What is the purpose of creating a MediaElementSourceNode for every element on the page?

To analyze the volume of the media. Here's the extension itself: https://github.com/WofWca/jumpcutter.

you can fetch resources outside of CORS and CSP restrictions

The method you described is very hard (if at all possible) to implement so that it works on (almost) every website, and it's a bit invasive (it requires changing the src (or srcObject)). Also I believe it's gonna breach the CORS protection: a malicious website would become able to access the cross-origin media (e.g. via el.captureStream()).

then observe both loadedmetadata (fired when the header is set) and error (fired when the header is not set) events.

This to me also sounds like a headache to get working everywhere. It requires either creating a new element with the same source, or manipulating the original one. The former may not always work because the fetch may succeed for the original element but fail for the new one for reasons other than CORS restrictions. Currently this doesn't work, for example, on YouTube, where it uses MediaStream with URL.createObjectURL. The latter is pretty invasive. For example, the website may also attach an error event listener that may redirect the user to an error page.

Anyway, I appreciate your response.

guest271314 commented 2 years ago

The former may not always work because the fetch may succeed for the original element but fail for the new one for reasons other than CORS restrictions.

What reasons would make that approach fail?

The simplest approach would be to fetch() the resource with a HEAD request. If successful you can use MediaElementAudioSourceNode for capturing, else you cannot.
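
A minimal sketch of such a probe follows. The function name `probeWithHead` and the injectable `fetchImpl` parameter are assumptions added for illustration. A successful probe is only a hint: the server may handle HEAD differently from the media request, and the media element itself still needs the crossorigin attribute for its own fetch to be CORS-enabled.

```javascript
// Hypothetical helper: resolves to true when a CORS-mode HEAD request for the
// URL succeeds, and to false when it is rejected or returns a non-2xx status.
// fetchImpl is injectable so the function can be exercised without a network.
async function probeWithHead(url, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(url, { method: 'HEAD', mode: 'cors' });
    return response.ok;
  } catch (error) {
    // fetch() rejects (typically with a TypeError) on network failures and
    // when the CORS check fails; both cases are treated as "not usable".
    return false;
  }
}
```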

guest271314 commented 2 years ago

Currently this doesn't work, for example, on YouTube, where it uses MediaStream with URL.createObjectURL.

You can use getDisplayMedia() to capture media being played on YouTube. Alternatively, you can avoid MediaStreamAudioSourceNode, captureStream() and getDisplayMedia() altogether by capturing one or more specific --monitor-stream sources with parec, that is, capturing the Chromium or Firefox web media player stream playing in the tab using PulseAudio.

guest271314 commented 2 years ago

If I understand the extension correctly, if you are playing a video on YouTube you can capture it with getDisplayMedia({video: true, audio: true}), then call stream.removeTrack(stream.getVideoTracks()[0]) since the video is not being processed, and connect the stream to a MediaStreamAudioDestinationNode and an AnalyserNode. No CORS restrictions are involved. Since you are using an extension, you can capture the underlying stream at OS level ("What-U-Hear") to avoid CORS restrictions (https://github.com/guest271314/captureSystemAudio/tree/master/native_messaging/capture_system_audio), or capture a specific --monitor-stream (see pactl list sources and pactl list source-outputs, https://github.com/guest271314/setUserMediaAudioSource), process the raw PCM through AudioWorklet or MediaStreamTrackGenerator (on Chromium), then use the AnalyserNode; failing that, keep the approach you currently use. Since you are using an extension, you can also get all srcs and load the resources in the "web_accessible_resources" page to test whether CORS applies; then you know which elements will be able to use MediaStreamAudioSourceNode.
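
Whichever capture path is used, the volume analysis itself can be sketched independently of the audio source. The function below is a minimal sketch, with `rmsLevel` being a hypothetical name. It computes the root-mean-square level of a block of time-domain samples, such as a Float32Array filled by AnalyserNode.getFloatTimeDomainData() in a browser.

```javascript
// Hypothetical helper: root-mean-square level of a block of PCM samples.
// In a browser the samples could come from
//   const buffer = new Float32Array(analyser.fftSize);
//   analyser.getFloatTimeDomainData(buffer);
function rmsLevel(samples) {
  if (samples.length === 0) {
    return 0; // avoid dividing by zero for an empty block
  }
  let sumOfSquares = 0;
  for (const sample of samples) {
    sumOfSquares += sample * sample;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}
```

Note that a level of 0 does not by itself distinguish CORS-restricted output from genuinely silent media.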

Korilakkuma commented 2 years ago

How about using Object URL ?

fetch('https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm#t=10,20')
  .then((response) => response.blob())
  .then((blob) => {
    const objectURL = window.URL.createObjectURL(blob);
    const mediaElement = document.querySelector('audio');

    mediaElement.src = objectURL;

    const context = new AudioContext();
    const source = new MediaElementAudioSourceNode(context, { mediaElement });

    source.connect(context.destination);

    mediaElement.play();
  })
  .catch(console.error);

guest271314 commented 2 years ago

How about using Object URL ?

The Blob URL creation will not be reached when the Access-Control-Allow-Origin header is not served. The important part is using catch() to handle the error, which meets criterion a) (detect whether MediaElementAudioSourceNode is CORS-restricted); we then never reach b) (revert createMediaElementSource), because we do not call the method when a) is true.

padenot commented 2 years ago

Audio WG call:

This way this works consistently with HTMLMediaElement.captureStream() + AudioContext.createMediaStreamSource(), or any variation on that. The same-origin status really is a property of the source.

WofWca commented 2 years ago

@guest271314 I'll take time to think about the workarounds you suggested, but currently I'm inclined to think that having such API would be nice either way.

WofWca commented 2 years ago

Also regarding b):

#1285 initially had this commit, which would solve the issue (at least for me), but then it got changed to what we have now.

hoch commented 1 year ago

The idea on https://github.com/WebAudio/web-audio-api/issues/2453#issuecomment-948809702 still stands. The CORS property should be queried via MediaElement.

chrisguttandin commented 1 year ago

This Stack Overflow answer states that captureStream() throws if there is a CORS issue. If that's true, then it could be used to get the information asked for in this thread.

But it looks like that behavior is only specified for a HTMLCanvasElement.captureStream().

Content from a canvas that is not origin-clean MUST NOT be captured. This method throws a SecurityError exception if the canvas is not origin-clean.

It is not specified for HTMLMediaElement.captureStream(). The specification for that just says:

The contents of the track might become inaccessible to the current origin due to cross-origin protections. For instance, content that is rendered from an HTTP URL can be subject to a redirect on a request for partial content, or the enabled or selected tracks can change to include cross-origin content.

guest271314 commented 1 year ago

@chrisguttandin

If you can hear it you should be able to capture and manipulate it.

As I suggested above, and as is demonstrated in the SO answer, use of crossorigin="anonymous" attribute and value solves the issue for the SO question. No CORS error at audio element, no AudioContext error, media is routed through MediaElementAudioSourceNode, and can be recorded using MediaStreamAudioDestinationNode and MediaRecorder.

guest271314 commented 1 year ago

I will note that setting crossorigin using DOM does not have the same effect as setting crossorigin in HTML, which is interesting.

Even though the URL is served with Access-Control-Allow-Origin: *, we still get the "will only output silence" warning/error on Firefox 104 when the crossorigin attribute is not set in the HTML.

<!DOCTYPE html>

<html>
  <head>
  </head>

  <body>
    <script>
      function createAudioNode() {
        const audioElement = document.querySelector('audio');
        audioElement.onerror = (e) => {
          console.log(e);
        };
        // doesn't work if not in HTML
        // audioElement.crossorigin = 'anonymous';

        audioElement.src = 'https://upload.wikimedia.org/wikipedia/commons/4/40/Toreador_song_cleaned.ogg';

        const audioContext = new AudioContext();
        const audioNode = audioContext.createMediaElementSource(audioElement);
        const msd = new MediaStreamAudioDestinationNode(audioContext);
        audioNode.connect(msd);
        const recorder = new MediaRecorder(msd.stream);
        recorder.ondataavailable = (e) => console.log(URL.createObjectURL(e.data));
        recorder.start();
        audioElement.onpause = () => {
          recorder.stop();
        }
        audioNode.connect(audioContext.destination);
      }
    </script>
    <audio
      controls crossorigin>
      Your browser does not support the <code>audio</code> element.
    </audio>
    <br />
    <button onclick="createAudioNode()">Create audio node</button>
  </body>
</html>
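
One possible explanation for the difference, not confirmed in the thread: the DOM property that reflects the crossorigin content attribute is spelled `crossOrigin` (camelCase), so assigning to `audioElement.crossorigin` only creates an unrelated property on the object. The sketch below uses a hypothetical helper, `prepareCorsEnabledSource`, which sets the reflected property before assigning src. The ordering is an assumption chosen so that the fetch triggered by src starts in CORS mode.

```javascript
// Hypothetical helper: opts a media element into CORS-enabled fetching from
// script, then assigns the source URL.
function prepareCorsEnabledSource(mediaElement, url) {
  // The reflected IDL property is camelCase: crossOrigin, not crossorigin.
  mediaElement.crossOrigin = 'anonymous';
  // Assumption: assign src afterwards so the load it triggers uses CORS mode.
  mediaElement.src = url;
  return mediaElement;
}
```

In the page above this would be called as `prepareCorsEnabledSource(audioElement, 'https://upload.wikimedia.org/wikipedia/commons/4/40/Toreador_song_cleaned.ogg')` before creating the MediaElementAudioSourceNode.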