Open WofWca opened 2 years ago
Turns out the "b)" part is a duplicate of https://github.com/WebAudio/web-audio-api/issues/1202, sorry. I'll write some comments there. But a) holds anyway.
For context, I'm developing an extension which calls AudioContext.createMediaElementSource for every media element on every page. In a perfect world it would also be nice if createMediaElementSource always just worked for browser extensions somehow.
What is the purpose of creating a MediaElementSourceNode for every element on the page? What are you trying to achieve?
In a perfect world it would also be nice if createMediaElementSource always just worked for browser extensions somehow.
If you are using a browser extension you can fetch resources outside of CORS and CSP restrictions at a page listed in "web_accessible_resources", in a ServiceWorker, or using a native application, send the data to the main page using messaging or transferable objects, then set the src of the HTMLMediaElement to a Blob URL in the document.
To check whether the HTMLAudioElement's source is a cross-origin resource served without an Access-Control-Allow-Origin header listing the requesting origin, set the crossorigin attribute on the <audio> element, then observe both the loadedmetadata (fired when the header is set) and error (fired when the header is not set) events.
<!DOCTYPE html>
<html>
<head> </head>
<body>
<audio
src="http://ftp.nluug.nl/pub/graphics/blender/demo/movies/ToS/tears_of_steel_surround.webm#t=10,20"
controls
crossorigin=""
></audio>
<audio
src="https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm#t=10,20"
controls
crossorigin=""
></audio>
<script>
for (const mediaElement of document.querySelectorAll('audio')) {
mediaElement.onerror = (e) => {
console.error(e.type, e.target.src);
};
mediaElement.onloadedmetadata = (e) => {
console.log(e);
const ac = new AudioContext();
const source = new MediaElementAudioSourceNode(ac, { mediaElement });
console.log(source);
source.connect(ac.destination);
};
}
</script>
</body>
</html>
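The event-based check above can also be wrapped in a Promise so that calling code gets a single outcome per element. This is only a sketch: the function name `waitForCorsOutcome` and the outcome strings are made up for illustration, and it assumes the element already carries the crossorigin attribute. An error event can fire for reasons other than a missing header, so "failed" should be read as "not usable", not as proof of a CORS problem.

```javascript
// Resolves to "readable" if loadedmetadata fires first, "failed" if error fires first.
// Listeners are added with addEventListener so that handlers the page has
// assigned to onerror / onloadedmetadata are not overwritten.
function waitForCorsOutcome(mediaElement) {
  return new Promise((resolve) => {
    const cleanup = () => {
      mediaElement.removeEventListener('loadedmetadata', onLoaded);
      mediaElement.removeEventListener('error', onError);
    };
    const onLoaded = () => {
      cleanup();
      resolve('readable');
    };
    const onError = () => {
      cleanup();
      resolve('failed');
    };
    mediaElement.addEventListener('loadedmetadata', onLoaded);
    mediaElement.addEventListener('error', onError);
  });
}
```

In the page above it could be used as `waitForCorsOutcome(mediaElement).then(console.log)` inside the loop, in place of the two `on*` assignments.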
What is the purpose of creating a MediaElementSourceNode for every element on the page?
To analyze the volume of the media. Here's the extension itself: https://github.com/WofWca/jumpcutter.
you can fetch resources outside of CORS and CSP restrictions
The method you described is very hard (if at all possible) to implement so that it works on (almost) every website, and it's a bit invasive (it requires changing the src (or srcObject)). Also, I believe it would breach the CORS protection: a malicious website would become able to access the cross-origin media (e.g. via el.captureStream()).
then observe both loadedmetadata (fired when the header is set) and error (fired when the header is not set) events.
This to me also sounds like a headache to get working everywhere. It requires either creating a new element with the same source, or manipulating the original one.
The former may not always work because the fetch may succeed for the original element but fail for the new one for reasons other than CORS restrictions. Currently this doesn't work, for example, on YouTube, where it uses MediaStream with URL.createObjectURL.
The latter is pretty invasive. For example, the website may also attach an error event listener that may redirect the user to an error page.
Anyway, I appreciate your response.
The former may not always work because the fetch may succeed for the original element but fail for the new one for reasons other than CORS restrictions.
What reasons would make that approach fail?
The simplest approach would be to fetch() the resource with a HEAD request. If successful you can use MediaElementAudioSourceNode for capturing, else you cannot.
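A minimal sketch of that HEAD probe follows. The function name `probeWithHead` and the injectable `fetchImpl` parameter are assumptions added for illustration. In "cors" mode, fetch() rejects with a TypeError when the CORS check fails, so a rejection is treated as "not readable". Two caveats apply: some servers answer HEAD differently from GET, and a successful probe says nothing about whether the media element itself was loaded in CORS mode (that still depends on its crossorigin attribute).

```javascript
// Returns true when a CORS-mode HEAD request succeeds with a 2xx status.
// `fetchImpl` defaults to the global fetch and can be swapped out for testing.
async function probeWithHead(url, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(url, { method: 'HEAD', mode: 'cors' });
    return response.ok;
  } catch (error) {
    // Network failure or a failed CORS check both end up here.
    return false;
  }
}
```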
Currently this doesn't work, for example, on YouTube, where it uses MediaStream with URL.createObjectURL.
You can use getDisplayMedia() to capture media being played on YouTube. Alternatively, you can avoid MediaStreamAudioSourceNode, captureStream(), and getDisplayMedia() altogether by capturing one or more specific streams via --monitor-stream with parec, that is, capturing the Chromium or Firefox web media player stream playing in the tab using PulseAudio.
If I understand the extension correctly, if you are playing a video on YouTube you can capture with getDisplayMedia({video: true, audio: true}), then call stream.removeTrack(stream.getVideoTracks()[0]) since video is not being processed, and connect to a MediaStreamAudioDestinationNode and an AnalyserNode. No CORS restrictions are involved.
Since you are using an extension you can capture the underlying stream at OS level ("What-U-Hear") to avoid CORS restrictions (https://github.com/guest271314/captureSystemAudio/tree/master/native_messaging/capture_system_audio), or a specific --monitor-stream (see pactl list sources and pactl list source-outputs; https://github.com/guest271314/setUserMediaAudioSource), process the raw PCM through AudioWorklet or MediaStreamTrackGenerator (on Chromium), then use the AnalyserNode, or the approach you currently use if not.
Since you are using an extension you can get all srcs, load the resources in the "web_accessible_resources" page to test if CORS applies, then you know which elements will be able to use MediaStreamAudioSourceNode.
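The getDisplayMedia() route could look roughly like the sketch below. The names `rootMeanSquare` and `watchDisplayAudio`, the polling interval, and the choice of RMS as the volume measure are all assumptions made for illustration, not something taken from the extension. The browser-only part is wrapped in a function so the volume calculation can be tried on its own.

```javascript
// Root mean square of a block of time-domain samples (0 for an empty block).
function rootMeanSquare(samples) {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  for (const sample of samples) sumOfSquares += sample * sample;
  return Math.sqrt(sumOfSquares / samples.length);
}

// Browser-only: prompts the user to share a tab or screen, then reports volume.
async function watchDisplayAudio(onVolume, intervalMs = 100) {
  const stream = await navigator.mediaDevices.getDisplayMedia({ video: true, audio: true });
  // Video is not processed, so drop it from the stream as suggested above.
  for (const track of stream.getVideoTracks()) stream.removeTrack(track);
  if (stream.getAudioTracks().length === 0) {
    throw new Error('The user did not share any audio');
  }
  const context = new AudioContext();
  const analyser = context.createAnalyser();
  context.createMediaStreamSource(stream).connect(analyser);
  const buffer = new Float32Array(analyser.fftSize);
  // Returns the timer id so the caller can clearInterval() it later.
  return setInterval(() => {
    analyser.getFloatTimeDomainData(buffer);
    onVolume(rootMeanSquare(buffer));
  }, intervalMs);
}
```

Unlike createMediaElementSource, this path needs a user prompt for every capture, which may or may not be acceptable for an extension that is meant to work silently on every page.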
How about using Object URL ?
fetch('https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm#t=10,20')
.then((response) => response.blob())
.then((blob) => {
const objectURL = window.URL.createObjectURL(blob);
const mediaElement = document.querySelector('audio');
mediaElement.src = objectURL;
const context = new AudioContext();
const source = new MediaElementAudioSourceNode(context, { mediaElement });
source.connect(context.destination);
mediaElement.play();
})
.catch(console.error);
How about using Object URL ?
Creation of a Blob URL will not be reached when the Access-Control-Allow-Origin header is not served. The important part is using catch() to handle the error, which meets criterion a), detecting whether MediaElementAudioSourceNode is CORS-restricted. We therefore never reach b), reverting createMediaElementSource, because we do not call the method if a) is true.
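Put as code, the "probe first, connect only on success" flow might look like this. `connectIfReadable` and the `isReadable` callback are hypothetical names; any probe that resolves to a boolean could be passed in. Note the assumption baked into this sketch: a successful probe does not by itself make the element's audio readable, because an element loaded without the crossorigin attribute is still treated as cross-origin.

```javascript
// Calls createMediaElementSource only when the probe reports the URL as readable.
// Returns the source node, or null when the element was left untouched.
async function connectIfReadable(mediaElement, context, isReadable) {
  const url = mediaElement.currentSrc || mediaElement.src;
  if (!url || !(await isReadable(url))) {
    // Not calling createMediaElementSource keeps the element audible as before.
    return null;
  }
  const source = context.createMediaElementSource(mediaElement);
  source.connect(context.destination);
  return source;
}
```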
Audio WG call:
This way this works consistently with HTMLMediaElement.captureStream() + AudioContext.createMediaStreamSource(), or any variation on that. The same-origin status really is a property of the source.
@guest271314 I'll take time to think about the workarounds you suggested, but currently I'm inclined to think that having such API would be nice either way.
Also regarding b):
The idea on https://github.com/WebAudio/web-audio-api/issues/2453#issuecomment-948809702 still stands. The CORS property should be queried via MediaElement.
This Stack Overflow answer states that captureStream() throws if there is a CORS issue. If that's true, then it could be used to get the information asked for in this thread.
But it looks like that behavior is only specified for HTMLCanvasElement.captureStream().
Content from a canvas that is not origin-clean MUST NOT be captured. This method throws a SecurityError exception if the canvas is not origin-clean.
It is not specified for HTMLMediaElement.captureStream(). The specification for that just says:
The contents of the track might become inaccessible to the current origin due to cross-origin protections. For instance, content that is rendered from an HTTP URL can be subject to a redirect on a request for partial content, or the enabled or selected tracks can change to include cross-origin content.
@chrisguttandin
If you can hear it you should be able to capture and manipulate it.
As I suggested above, and as is demonstrated in the SO answer, use of the crossorigin="anonymous" attribute and value solves the issue for the SO question. No CORS error at the audio element, no AudioContext error, media is routed through MediaElementAudioSourceNode, and can be recorded using MediaStreamAudioDestinationNode and MediaRecorder.
I will note that setting crossorigin using the DOM does not have the same effect as setting crossorigin in HTML, which is interesting.
Even though the URL is served with Access-Control-Allow-Origin: *, we still get the "will only output silence" warning/error on Firefox 104 when the crossorigin attribute is not set in the HTML.
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<script>
function createAudioNode() {
const audioElement = document.querySelector('audio');
audioElement.onerror = (e) => {
console.log(e);
};
// doesn't work if not in HTML
// audioElement.crossorigin = 'anonymous';
audioElement.src = 'https://upload.wikimedia.org/wikipedia/commons/4/40/Toreador_song_cleaned.ogg';
const audioContext = new AudioContext();
const audioNode = audioContext.createMediaElementSource(audioElement);
const msd = new MediaStreamAudioDestinationNode(audioContext);
audioNode.connect(msd);
const recorder = new MediaRecorder(msd.stream);
recorder.ondataavailable = (e) => console.log(URL.createObjectURL(e.data));
recorder.start();
audioElement.onpause = () => {
recorder.stop();
}
audioNode.connect(audioContext.destination);
}
</script>
<audio
controls crossorigin>
Your browser does not support the <code>audio</code> element.
</audio>
<br />
<button onclick="createAudioNode()">Create audio node</button>
</body>
</html>
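One possible explanation for the DOM observation, offered as an assumption and not verified on Firefox 104: the DOM property that reflects the crossorigin content attribute is spelled crossOrigin, with a capital "O". Assigning to audioElement.crossorigin only creates an unrelated property on the object and leaves the CORS mode unchanged. The sketch below (the helper name `prepareAnonymousCors` is made up) sets the reflected property and only then assigns the source, which is the conservative ordering.

```javascript
// Puts the element into anonymous CORS mode via the reflected DOM property,
// then assigns the URL so the load starts with that mode already in place.
function prepareAnonymousCors(mediaElement, url) {
  mediaElement.crossOrigin = 'anonymous'; // note the capital "O"
  mediaElement.src = url;
  return mediaElement;
}
```

`mediaElement.setAttribute('crossorigin', 'anonymous')` would be an equivalent way to set the content attribute directly.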
Describe the feature
Perhaps for a) a boolean property and an event emission (in case e.g. element.currentSrc changed from same-origin to cross-origin or vice versa) would do.
b) is needed in order to not "break" the element whose src is CORS-restricted so at least it is possible for the user to hear its sound. It also can come in handy if you decided to switch to a different AudioContext (e.g. you don't like sampleRate or latencyHint of the current one).
Or perhaps createMediaElementSource needs to be replaced with something that doesn't cause this trouble.
Is there a prototype?
No.
Describe the feature in more detail
For context, I'm developing an extension which calls AudioContext.createMediaElementSource for every media element on every page. As we know, cross-origin media playback is allowed, but MediaElementAudioSourceNode outputs zeroes for such media (https://webaudio.github.io/web-audio-api/#MediaElementAudioSourceOptions-security), which means that when you call createMediaElementSource for such media it simply gets muted with no conservative (meaning without re-creating the element or something like this) way to revert it, which (at least in my case) is worse than not calling the method at all. Some websites fetch media data from a different origin (and even a different subdomain is a different origin - see, for example, Zoom recordings), but the Access-Control-Allow-Origin header is not always applied to such media responses, which makes createMediaElementSource (therefore my extension) not only useless but even harmful.
In a perfect world it would also be nice if createMediaElementSource always just worked for browser extensions somehow. Web Audio API may not be the only scope this issue lies in, in this case (then I would appreciate if someone gave me directions on how to proceed).
And FYI I'm not the greatest web guru, I may just be missing something.
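Until something like a) exists, a rough pre-check can be built from information the page already exposes. The sketch below is a heuristic under stated assumptions, not a replacement for the requested API: the function name `classifyMediaSource` and its return labels are invented here, it only looks at the URL and the crossorigin attribute, and it cannot see redirects, so a same-origin URL that redirects to another origin would be misclassified.

```javascript
// Classifies a media URL relative to the page it is played on.
//   src                  - the element's currentSrc (may be relative)
//   crossOriginAttribute - result of el.getAttribute('crossorigin'), null if absent
//   pageUrl              - the document URL, e.g. location.href
function classifyMediaSource(src, crossOriginAttribute, pageUrl) {
  const origin = new URL(src, pageUrl).origin;
  if (origin === 'null') return 'unknown'; // e.g. data: URLs have an opaque origin
  if (origin === new URL(pageUrl).origin) return 'same-origin';
  // Cross-origin without the attribute is fetched without CORS, so a
  // MediaElementAudioSourceNode would be expected to output silence.
  return crossOriginAttribute === null ? 'likely-silenced' : 'cors-requested';
}
```

An extension could skip createMediaElementSource for elements classified as 'likely-silenced'. For 'cors-requested' elements the load itself still fails if the server omits the header, so that label means "worth trying", not "guaranteed to work".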