Closed: guest271314 closed this issue 1 year ago.
I just noticed https://github.com/WebAudio/web-audio-api-v2/issues/16, which for an unknown reason I am blocked from commenting on.
Re https://github.com/WebAudio/web-audio-api-v2/issues/16#issuecomment-713011241
- We're asking the WebRTC Working Group to weigh in regarding transferable MediaStreams to extend this first cut
Note, it is currently possible to transfer a ReadableStream/WritableStream representation of a MediaStreamTrack (on Chromium/Chrome) using MediaStreamTrack API for Insertable Streams of Media https://github.com/w3c/mediacapture-transform and WebRTC Encoded Transform https://github.com/w3c/webrtc-encoded-transform, because Chromium/Chrome supports Transferable Streams https://github.com/whatwg/streams/blob/main/transferable-streams-explainer.md. See also https://github.com/w3c/mediacapture-extensions/pull/26.
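A minimal sketch of that transfer on Chromium/Chrome (the worker file name is an assumption; the first part runs inside an async function):

// Main thread: obtain a ReadableStream of AudioData frames from a
// MediaStreamTrack (Chromium-only MediaStreamTrackProcessor), then
// transfer the stream itself to a worker.
const [track] = (await navigator.mediaDevices.getUserMedia({ audio: true })).getAudioTracks();
const { readable } = new MediaStreamTrackProcessor({ track });
const worker = new Worker('worker.js'); // hypothetical worker script
worker.postMessage({ readable }, [readable]); // works because streams are transferable

// worker.js: consume the transferred stream of AudioData frames.
onmessage = async ({ data: { readable } }) => {
  const reader = readable.getReader();
  for (;;) {
    const { value: audioData, done } = await reader.read();
    if (done) break;
    // Process the AudioData frame, then release its memory.
    audioData.close();
  }
};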
AudioWG calls:
The worker case is WebAudio/web-audio-api-v2#16, and this is agreed upon and will happen; and yes, transferable MediaStream (not just ReadableStream/WritableStream) will happen.
For the service worker case, it's really unclear. If an app starts playing audio, and then all tabs for this app are closed, how do you pause the audio? There is no way to display a UI element.
Given the statement "We're not going to allow a service worker to run indefinitely.", it is not immediately clear in the specification that the ServiceWorker is intended to become "inactive" at 5 minutes, because "5 minutes" does not appear in the specification https://w3c.github.io/ServiceWorker/. Nonetheless, per (Chromium) source code, the ServiceWorker can remain "active" as long as one or more of several conditions is met, https://source.chromium.org/chromium/chromium/src/+/master:content/browser/service_worker/service_worker_version.cc;l=609 through 613:
DCHECK(event_type == ServiceWorkerMetrics::EventType::INSTALL ||
event_type == ServiceWorkerMetrics::EventType::ACTIVATE ||
event_type == ServiceWorkerMetrics::EventType::MESSAGE ||
event_type == ServiceWorkerMetrics::EventType::EXTERNAL_REQUEST ||
status() == ACTIVATED)
(On Firefox, a BroadcastChannel in a ServiceWorker can remain active even after the ServiceWorker is unregistered: "BroadcastChannel created in ServiceWorker outlives unregistration and page reload" https://bugzilla.mozilla.org/show_bug.cgi?id=1676043.)
That effectively means the ServiceWorker can remain "active" without tabs, and we still retain the ability to communicate with the ServiceWorker when a new tab is created.
One Chrome extension developer created a workaround for the unspecified 5 minute "inactive" case here https://bugs.chromium.org/p/chromium/issues/detail?id=1152255#c25.
I created a workaround here https://bugs.chromium.org/p/chromium/issues/detail?id=1152255#c31 and https://bugs.chromium.org/p/chromium/issues/detail?id=1152255#c32 using an extension and window.open() or <iframe> and postMessage(). Therefore we can communicate with the ServiceWorker at any time before or after tab closure and reopening, something like
console
onmessage = (e) => console.log(e);
// Load the extension's keep-alive page in a hidden iframe.
var f = document.createElement('iframe');
f.style = 'display:none';
document.body.appendChild(f);
f.src = 'chrome-extension://jmnojflkjiloekecianpibbbclcgmhag/keepActive.html';
manifest.json
...
"background": {
"service_worker": "background.js"
},
"permissions": ["nativeMessaging", ...],
"host_permissions": ["<all_urls>"],
"web_accessible_resources": [ {
"resources": ["keepActive.html", "keepActive.js"],
"matches": [ "https://bugs.chromium.org/*", ...],
"extensions": [...]
}],
...
See WebAudio/web-audio-api#2381. Add URLs to 'matches' in 'web_accessible_resources'; the attached image was a test at the console on GitHub.
background.js
let now = performance.now();
// Reply to each client message with this ServiceWorker's lifetime in minutes.
self.addEventListener('message', async (e) => {
  e.source.postMessage(((performance.now() - now) / 1000) / 60);
});
keepActive.html
<!DOCTYPE html><html><body><script src="keepActive.js"></script></body></html>
keepActive.js
onload = async () => {
  parent.postMessage('ServiceWorker opener', '*');
  // Relay ServiceWorker messages to the embedding page.
  navigator.serviceWorker.addEventListener('message', (e) => {
    parent.postMessage(e.data, '*');
  });
  navigator.serviceWorker.ready.then((registration) => {
    registration.active.postMessage('');
    // Ping the ServiceWorker every 15 seconds to reset its idle timer.
    setInterval(() => registration.active.postMessage(''), 1000 * 15);
  });
};
Additionally, per my own standard of creating workarounds and proof-of-concepts for the requests I make to specification bodies and implementers, I created a rudimentary proof-of-concept for the feature request I made here, using Native Messaging to launch a headless Chromium instance that plays audio. I included an action handler in the extension so that users can start and pkill the headless Chromium instance https://bugs.chromium.org/p/chromium/issues/detail?id=1131236#c43
manifest.json
{
"name": "service_worker_native_messaging_headless_audio",
"version": "1.0",
"manifest_version": 3,
"background": {
"service_worker": "background.js"
},
"permissions": ["nativeMessaging"],
"externally_connectable": {
"matches": [
"https://bugs.chromium.org/p/chromium/issues/*"
],
"ids": [
"*"
]
},
"action": {}
}
service_worker_native_messaging_headless_audio.json
{
"name": "service_worker_native_messaging_headless_audio",
"description": "Chromium ServiceWorker Native Messaging audio",
"path": "/path/to/service_worker_native_messaging_headless_audio.sh",
"type": "stdio",
"allowed_origins": [
"chrome-extension://<id>/"
]
}
background.js
chrome.action.onClicked.addListener(() =>
  chrome.runtime.sendNativeMessage(
    'service_worker_native_messaging_headless_audio',
    {},
    (nativeMessage) => console.log({ nativeMessage })
  )
);
service_worker_native_messaging_headless_audio.sh
#!/bin/bash
sendMessage() {
# https://stackoverflow.com/a/24777120
# The message to send is passed as the first argument.
message="$1"
# Calculate the byte size of the string.
# NOTE: This assumes that byte length is identical to the string length!
# Do not use multibyte (unicode) characters, escape them instead, e.g.
# message='"Some unicode character:\u1234"'
messagelen=${#message}
# Convert to an integer in native byte order.
# If you see an error message in Chrome's stdout with
# "Native Messaging host tried sending a message that is ... bytes long.",
# then just swap the order, i.e. messagelen1 <-> messagelen4 and
# messagelen2 <-> messagelen3
messagelen1=$(( ($messagelen ) & 0xFF ))
messagelen2=$(( ($messagelen >> 8) & 0xFF ))
messagelen3=$(( ($messagelen >> 16) & 0xFF ))
messagelen4=$(( ($messagelen >> 24) & 0xFF ))
# Print the message byte length followed by the actual message.
printf "$(printf '\\x%x\\x%x\\x%x\\x%x' \
$messagelen1 $messagelpen2 $messagelen3 $messagelen4)%s" "$message"
}
headless_audio() {
if pgrep -f 'chrome --headless' > /dev/null; then
pkill -f 'chrome --headless' & sendMessage '"Chromium headless audio off."'
else
$HOME/chrome-linux/chrome --headless --autoplay-policy=no-user-gesture-required --password-store=basic --disable-gpu --remote-debugging-port=9222 audio.html & sendMessage '"Chromium headless audio on."'
fi
}
headless_audio
audio.html
<!DOCTYPE html>
<html>
<head> </head>
<body>
<audio autoplay></audio>
<script>
const audio = document.querySelector('audio');
(async () => {
const request = await fetch('https://ia801306.us.archive.org/8/items/deltanine2015-08-22.mk241_16bit/deltanine2015-08-22.mk241.cmmt30.vms32ub.dr100mkii.16bit-t06.ogg');
const blob = await request.blob();
const blobURL = URL.createObjectURL(blob);
// Play each source in sequence, waiting for the previous one to end.
for (let url of [blobURL, 'ImperialMarch60.webm', 'house--64kbs-0-wav.wav']) {
await new Promise((resolve) => {
audio.src = url;
audio.onended = () => {
audio.onended = null;
resolve();
};
});
}
})();
</script>
</body>
</html>
Basically, it is possible to keep the ServiceWorker "active" indefinitely, and thus communicate with the ServiceWorker, by opening an <iframe>, Window, or tab within the ServiceWorker scope, then sending the command to start or stop outputting audio. Per the unspecified "5 minute" implementation on Chrome and Chromium, the ServiceWorker will become "inactive" by default if the user does nothing.
I have been considering the implications and consequences of exposing AudioContext in ServiceWorker. Precedent was set by Web Speech API speechSynthesis.speak() implementations, which do not output audio in the tab itself; rather, audio is output by Speech Dispatcher at the OS level, as evidenced by the Chromium implementation of getDisplayMedia({audio: true, video: true}) not capturing audio output of speechSynthesis.speak() (Issue 1185527: getDisplayMedia does not capture speechSynthesis.speak() audio output https://bugs.chromium.org/p/chromium/issues/detail?id=1185527), and in some cases speech can survive page reload and still be outputting audio unless speechSynthesis.cancel() is called (Issue 1066812: Security: Text_To_Speech keeps playing after closing the tab https://bugs.chromium.org/p/chromium/issues/detail?id=1066812; Issue 1107210: Speech Synthesis isn't wired up to "Audio is playing" tab icons https://bugs.chromium.org/p/chromium/issues/detail?id=1107210).
So, to handle the condition described, global ServiceWorkerAudioContext.close(), ServiceWorkerAudioContext.pause(), ServiceWorkerAudioContext.resume(), and ServiceWorkerAudioContext.disconnect() methods can be defined which have the effect of pkill on all running ServiceWorkerAudioContext instances. We can also tailor the global functions to select specific ServiceWorkerAudioContext instances that we want to stop playing back or otherwise processing audio in the given ServiceWorker. This can be worked into the UI, if necessary, something like Media Capture and Streams and/or File System Access device or directory permissions, respectively; I emphasize here the programmatic means to do so.
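A sketch of how those proposed globals might look; ServiceWorkerAudioContext and its static methods are hypothetical, drawn from the paragraph above, not any shipped or specified API:

// Hypothetical: act on every running ServiceWorkerAudioContext at once,
// the programmatic equivalent of pkill for audio in service workers.
ServiceWorkerAudioContext.close();      // stop and release all instances
ServiceWorkerAudioContext.pause();      // suspend rendering on all instances
ServiceWorkerAudioContext.resume();     // resume rendering on all instances
ServiceWorkerAudioContext.disconnect(); // disconnect all audio graphs

// Hypothetical: tailor the call to specific instances, e.g. by registration
// scope, instead of every instance.
ServiceWorkerAudioContext.pause({ scope: '/radio/' });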
I actually updated the code to keep the ServiceWorker active, substituting the Streams API for setInterval():
onload = async () => {
const handleMessage = (e) => {
parent.postMessage(e.data, '*');
};
onmessage = (e) => {
if (e.data === 'abort') {
abortable.abort();
}
};
navigator.serviceWorker.addEventListener('message', handleMessage);
const registration = await navigator.serviceWorker.ready;
const abortable = new AbortController();
const { signal } = abortable;
try {
await new ReadableStream({
start() {
parent.postMessage('Pipe started.', '*');
fetch('keepalive.txt');
},
async pull(controller) {
  // Every 15 seconds enqueue a value; the write() below posts it to the
  // ServiceWorker, resetting the worker's idle timer.
  await new Promise((resolve) => setTimeout(resolve, 1000 * 15));
  controller.enqueue(null);
},
}).pipeTo(
new WritableStream({
write(value) {
registration.active.postMessage(value);
},
}),
{ signal }
);
} catch ({message}) {
parent.postMessage(message, '*');
navigator.serviceWorker.removeEventListener('message', handleMessage);
close();
}
};
Video of the workaround using headless Chromium to play audio, utilizing a ServiceWorker to communicate with Native Messaging, which is an example of UI that can be used for the case contemplated: essentially a basic icon that, when clicked, becomes a list of ServiceWorker(s) playing or processing audio with an "X" beside each item, to affirmatively stop playing or processing audio in the service worker, or, if applicable, unregister the service worker altogether.
If the WG simply writes what needs to be written re Web IDL and exposure, then relevant extension contributors can sort out how to maintain AudioContext in the ServiceWorker extensions system, and the WG will not be charged with negligence for exposing the interface without writing out what happens when x, y, z occurs.
On the other hand, the WG can take the lead here, which requires taking the time to test scenarios and write it out.
Either way, I can effectively achieve the expected result in at least one way, right now, without any specification, because it is not specified, while the use cases do not appear to be declining as to extended usage of AudioContext in contexts for which the WG has not drafted any basic algorithms or path. Eventually I find a way to achieve the requirement I ask about. Some users in the field are actually awaiting specification authors and implementers to do stuff. I just posted this here to determine where you folks are at re this subject matter. You decide.
Perhaps this feature request is within the scope of Issue 897326: Low Level Audio API https://bugs.chromium.org/p/chromium/issues/detail?id=897326, where chrome.audio is referenced though now deprecated:

This new model ensures a dedicated scope and an RT thread when allowed for the optimum WASM-powered audio processing
Comparable chrome.* APIs: chrome.audio
...
I logged the Chromium headless output to get a glimpse of what is actually occurring.
I had to start a local server to use fetch() for an ArrayBuffer to set at an AudioBufferSourceNode, to test AudioContext in headless along with HTMLMediaElement, due to Issue 810400: Fetch API does not respect --allow-file-access-from-files (even though XHR does) https://bugs.chromium.org/p/chromium/issues/detail?id=810400; I was otherwise just using the file: protocol for HTMLMediaElement.
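The AudioContext side of that test looks roughly like this (a sketch; the URL matches the AudioContext log below, and the code runs inside an async function):

const ac = new AudioContext();
// fetch() over the local server, since fetch() of file: URLs fails (Issue 810400).
const response = await fetch('http://localhost:8008/ImperialMarch60.wav');
const audioBuffer = await ac.decodeAudioData(await response.arrayBuffer());
// Play the decoded buffer through an AudioBufferSourceNode.
const source = new AudioBufferSourceNode(ac, { buffer: audioBuffer });
source.connect(ac.destination);
source.start();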
<audio>
...
[0626/192132.694631:VERBOSE1:media_stream_manager.cc(705)] MSM::InitializeMaybeAsync([this=0x1be0002cdb00])
[0626/192132.694705:VERBOSE1:media_stream_manager.cc(705)] MDM::MediaDevicesManager()
[0626/192132.694741:VERBOSE1:media_stream_manager.cc(705)] MSM::MediaStreamManager([this=0x1be0002cdb00]))
...
[0626/192132.735603:VERBOSE1:file_url_loader_factory.cc(455)] FileURLLoader::Start: file:///home/ubuntu-studio/localscripts/sw-nm-headless-audio/audio.html
[0626/192132.749611:VERBOSE1:sandbox_linux.cc(69)] Activated seccomp-bpf sandbox for process type: gpu-process.
[0626/192132.752670:VERBOSE1:device_data_manager_x11.cc(216)] X Input extension not available
[0626/192132.768283:VERBOSE1:configured_proxy_resolution_service.cc(852)] PAC support disabled because there is no system implementation
[0626/192132.768710:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::Core() [process_id=4, frame_id=1]
[0626/192132.769386:VERBOSE1:configured_proxy_resolution_service.cc(852)] PAC support disabled because there is no system implementation
[0626/192132.769986:VERBOSE1:document.cc(3704)] Document::DispatchUnloadEvents() URL = <null>
[0626/192132.770205:VERBOSE1:document.cc(3784)] Actually dispatching an UnloadEvent: URL = <null>
[0626/192132.774414:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::Core() [process_id=4, frame_id=1]
[0626/192132.806575:VERBOSE1:gles2_cmd_decoder.cc(3835)] GL_OES_packed_depth_stencil supported.
[0626/192132.809908:VERBOSE1:file_url_loader_factory.cc(455)] FileURLLoader::Start: file:///home/ubuntu-studio/localscripts/sw-nm-headless-audio/ImperialMarch60.wav
[0626/192132.818949:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::RequestDeviceAuthorization({device_id=}) [process_id=4, frame_id=1]
[0626/192132.876218:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted({status=OK}, {params=format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: }, {device_id=default}) [process_id=4, frame_id=1]
[0626/192132.876274:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted => (authorization time=57 ms) [process_id=4, frame_id=1]
[0626/192132.880112:VERBOSE1:media_stream_manager.cc(705)] AMB::MakeAudioOutputStream({device_id=}, {params=[format: PCM_LOW_LATENCY, channel_layout: 2, channels: 1, sample_rate: 44100, frames_per_buffer: 1024, effects: 128, mic_positions: ]})
[0626/192132.880181:VERBOSE1:media_stream_manager.cc(705)] PAOS::PulseAudioOutputStream({device_id=default}, {params=[format: PCM_LOW_LATENCY, channel_layout: 2, channels: 1, sample_rate: 44100, frames_per_buffer: 1024, effects: 128, mic_positions: ]}) [this=0x3df80028bb40]
[0626/192132.880229:VERBOSE1:media_stream_manager.cc(705)] AMB::MakeAudioOutputStream => (number of streams=1)
[0626/192132.880262:VERBOSE1:media_stream_manager.cc(705)] PAOS::Open() [this=0x3df80028bb40]
[0626/192132.883929:VERBOSE1:media_stream_manager.cc(705)] audio::OS::Ctor({audio_manager_name=PulseAudio}, {device_id=default}, {params=[format: PCM_LOW_LATENCY, channel_layout: 2, channels: 1, sample_rate: 44100, frames_per_buffer: 1024, effects: 384, mic_positions: ]}) [controller=0x3DF8003D5160]
[0626/192132.884046:VERBOSE1:media_stream_manager.cc(705)] AOC::CreateStream([state=empty]) [this=0x3DF8003D5160]
[0626/192132.884078:VERBOSE1:media_stream_manager.cc(705)] AOC::RecreateStream({reason=INITIAL_STREAM}, {params=[format: PCM_LOW_LATENCY, channel_layout: 2, channels: 1, sample_rate: 44100, frames_per_buffer: 1024, effects: 384, mic_positions: ]} [state=empty]) [this=0x3DF8003D5160]
[0626/192132.884119:VERBOSE1:media_stream_manager.cc(705)] AOC::CreateStream => (state=created) [this=0x3DF8003D5160]
[0626/192132.884159:VERBOSE1:media_stream_manager.cc(705)] audio::OS::CreateAudioPipe() [controller=0x3DF8003D5160]
[0626/192132.885039:VERBOSE1:media_stream_manager.cc(705)] PAOS::Start() [this=0x3df80028bb40]
[0626/192132.885554:VERBOSE1:media_stream_manager.cc(705)] audio::OS::Play() [controller=0x3DF8003D5160]
[0626/192132.885598:VERBOSE1:media_stream_manager.cc(705)] AOC::Play([state=created]) [this=0x3DF8003D5160]
[0626/192132.885643:VERBOSE1:media_stream_manager.cc(705)] AOC::StartStream => (state=playing) [this=0x3DF8003D5160]
[0626/192133.057425:WARNING:exported_object.cc(263)] Unknown method: message_type: MESSAGE_METHOD_CALL
destination: org.mpris.MediaPlayer2.chromium.instance47698
path: /org/mpris/MediaPlayer2
interface: org.mpris.MediaPlayer2.Playlists
member: GetPlaylists
sender: :1.27
signature: uusb
serial: 6486
uint32_t 0
uint32_t 5
string "Played"
bool true
[0626/192137.885241:VERBOSE1:media_stream_manager.cc(705)] AOC::WedgeCheck => (stream is alive) [this=0x3DF8003D5160]
AudioContext
...
[0626/223935.909046:VERBOSE1:media_stream_manager.cc(705)] MSM::InitializeMaybeAsync([this=0x15e6002cdc80])
[0626/223935.909096:VERBOSE1:media_stream_manager.cc(705)] MDM::MediaDevicesManager()
[0626/223935.909122:VERBOSE1:media_stream_manager.cc(705)] MSM::MediaStreamManager([this=0x15e6002cdc80]))
...
[0626/223935.951525:VERBOSE1:sandbox_linux.cc(69)] Activated seccomp-bpf sandbox for process type: gpu-process.
[0626/223935.957686:VERBOSE1:device_data_manager_x11.cc(216)] X Input extension not available
[0626/223935.990681:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::Core() [process_id=4, frame_id=1]
[0626/223935.993532:VERBOSE1:configured_proxy_resolution_service.cc(852)] PAC support disabled because there is no system implementation
[0626/223935.994321:VERBOSE1:configured_proxy_resolution_service.cc(852)] PAC support disabled because there is no system implementation
[0626/223935.996156:VERBOSE1:network_delegate.cc(34)] NetworkDelegate::NotifyBeforeURLRequest: http://localhost:8008/audio.html
[0626/223936.003610:VERBOSE1:document.cc(3704)] Document::DispatchUnloadEvents() URL = <null>
[0626/223936.003879:VERBOSE1:document.cc(3784)] Actually dispatching an UnloadEvent: URL = <null>
[0626/223936.007576:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::Core() [process_id=4, frame_id=1]
[0626/223936.010035:VERBOSE1:network_delegate.cc(34)] NetworkDelegate::NotifyBeforeURLRequest: http://localhost:8008/service_worker_native_messaging_headless_audio.js
[0626/223936.022485:VERBOSE1:network_delegate.cc(34)] NetworkDelegate::NotifyBeforeURLRequest: http://localhost:8008/ImperialMarch60.wav
[0626/223936.049571:VERBOSE1:webrtc_logging.cc(32)] [WA]AC::AudioContext({latency_hint=exact}, {seconds=0.000}) [state=suspended]
[0626/223936.049783:VERBOSE1:webrtc_logging.cc(32)] [WA]AH::AudioHandler({sample_rate=0}) [type=AudioDestinationNode, this=0x187600315B00]
[0626/223936.049978:VERBOSE1:webrtc_logging.cc(32)] [WA]AD::AudioDestination({output_channels=2}) [state=stopped]
[0626/223936.050049:VERBOSE1:webrtc_logging.cc(32)] [WA]AD::AudioDestination => (FIFO size=12288 bytes) [state=stopped]
[0626/223936.050148:VERBOSE1:webrtc_logging.cc(32)] [WA]RWADI::RendererWebAudioDeviceImpl
[0626/223936.050499:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::RequestDeviceAuthorization({device_id=}) [process_id=4, frame_id=1]
[0626/223936.113349:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted({status=OK}, {params=format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: }, {device_id=default}) [process_id=4, frame_id=1]
[0626/223936.113412:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted => (authorization time=62 ms) [process_id=4, frame_id=1]
[0626/223936.113736:VERBOSE1:webrtc_logging.cc(32)] [WA]RWADI::RendererWebAudioDeviceImpl => (hardware_params=[format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: ])
[0626/223936.113884:VERBOSE1:webrtc_logging.cc(32)] [WA]RWADI::RendererWebAudioDeviceImpl => (sink_params=[format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: ])
[0626/223936.113966:VERBOSE1:webrtc_logging.cc(32)] [WA]AD::AudioDestination => (device callback buffer size=512 frames) [state=stopped]
[0626/223936.114045:VERBOSE1:webrtc_logging.cc(32)] [WA]AD::AudioDestination => (device sample rate=44100 Hz) [state=stopped]
[0626/223936.114182:VERBOSE1:webrtc_logging.cc(32)] [WA]AD::AudioDestination => (no resampling: context sample rate set to 44100 Hz) [state=stopped]
[0626/223936.114432:VERBOSE1:webrtc_logging.cc(32)] [WA]AC::AudioContext => (base latency=0.012 seconds)) [state=suspended]
[0626/223936.114504:VERBOSE1:webrtc_logging.cc(32)] [WA]AC::StartRendering [state=suspended]
[0626/223936.114578:VERBOSE1:webrtc_logging.cc(32)] [WA]AD::Start [state=stopped]
[0626/223936.114644:VERBOSE1:webrtc_logging.cc(32)] [WA]RWADI::Start
[0626/223936.115069:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::RequestDeviceAuthorization({device_id=}) [process_id=4, frame_id=1]
[0626/223936.116746:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted({status=OK}, {params=format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: }, {device_id=default}) [process_id=4, frame_id=1]
[0626/223936.116826:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted => (authorization time=1 ms) [process_id=4, frame_id=1]
[0626/223936.118530:VERBOSE1:media_stream_manager.cc(705)] AMB::MakeAudioOutputStream({device_id=}, {params=[format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: ]})
[0626/223936.118591:VERBOSE1:media_stream_manager.cc(705)] PAOS::PulseAudioOutputStream({device_id=default}, {params=[format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: ]}) [this=0x28780028b3c0]
[0626/223936.118627:VERBOSE1:media_stream_manager.cc(705)] AMB::MakeAudioOutputStream => (number of streams=1)
[0626/223936.118661:VERBOSE1:media_stream_manager.cc(705)] PAOS::Open() [this=0x28780028b3c0]
[0626/223936.122206:VERBOSE1:media_stream_manager.cc(705)] audio::OS::Ctor({audio_manager_name=PulseAudio}, {device_id=default}, {params=[format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: ]}) [controller=0x2878003D5160]
[0626/223936.122309:VERBOSE1:media_stream_manager.cc(705)] AOC::CreateStream([state=empty]) [this=0x2878003D5160]
[0626/223936.122339:VERBOSE1:media_stream_manager.cc(705)] AOC::RecreateStream({reason=INITIAL_STREAM}, {params=[format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: ]} [state=empty]) [this=0x2878003D5160]
[0626/223936.122368:VERBOSE1:media_stream_manager.cc(705)] AOC::CreateStream => (state=created) [this=0x2878003D5160]
[0626/223936.122397:VERBOSE1:media_stream_manager.cc(705)] audio::OS::CreateAudioPipe() [controller=0x2878003D5160]
[0626/223936.123263:VERBOSE1:webrtc_logging.cc(32)] [WA]AD::RequestRender => (rendering is now alive) [state=running]
[0626/223936.123471:VERBOSE1:media_stream_manager.cc(705)] PAOS::Start() [this=0x28780028b3c0]
[0626/223936.123468:VERBOSE1:webrtc_logging.cc(32)] [WA]RWADI::Render => (rendering is alive [frames=512])
[0626/223936.124227:VERBOSE1:media_stream_manager.cc(705)] audio::OS::Play() [controller=0x2878003D5160]
[0626/223936.124281:VERBOSE1:media_stream_manager.cc(705)] AOC::Play([state=created]) [this=0x2878003D5160]
[0626/223936.124310:VERBOSE1:media_stream_manager.cc(705)] AOC::StartStream => (state=playing) [this=0x2878003D5160]
[0626/223936.304475:VERBOSE1:webrtc_logging.cc(32)] [WA]AH::AudioHandler({sample_rate=44100}) [type=AudioBufferSourceNode, this=0x1876002D0980]
[0626/223936.304892:VERBOSE1:webrtc_logging.cc(32)] [WA]AN::connect({output=[index:0, type:AudioBufferSourceNode, handler:0x1876002D0980]} --> {input=[index:0, type:AudioDestinationNode, handler:0x187600315B00]})
[0626/223936.310042:VERBOSE1:webrtc_logging.cc(32)] [WA]AH::ProcessIfNecessary => (processing is alive [frames=128]) [type=AudioBufferSourceNode, this=0x1876002D0980]
[0626/223941.123684:VERBOSE1:media_stream_manager.cc(705)] AOC::WedgeCheck => (stream is alive) [this=0x2878003D5160]
[0626/223951.136494:VERBOSE1:media_stream_manager.cc(705)] AOC::OnMoreData => (average audio level=-18.02 dBFS) [this=0x2878003D5160]
[0626/224006.136503:VERBOSE1:media_stream_manager.cc(705)] AOC::OnMoreData => (average audio level=-20.08 dBFS) [this=0x2878003D5160]
[0626/224021.141085:VERBOSE1:media_stream_manager.cc(705)] AOC::OnMoreData => (average audio level=-18.74 dBFS) [this=0x2878003D5160]
[0626/224036.141692:VERBOSE1:media_stream_manager.cc(705)] AOC::OnMoreData => (average audio level=-37.86 dBFS) [this=0x2878003D5160]
[0626/224051.147107:VERBOSE1:media_stream_manager.cc(705)] AOC::OnMoreData => (average audio level=-inf dBFS) [this=0x2878003D5160]
Some commonality:

<audio>:
RFAOSF::RequestDeviceAuthorization({device_id=}) [process_id=4, frame_id=1]
[0626/192132.876218:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted({status=OK}, {params=format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: }, {device_id=default}) [process_id=4, frame_id=1]
[0626/192132.876274:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted => (authorization time=57 ms) [process_id=4, frame_id=1]

AudioContext:
RFAOSF::RequestDeviceAuthorization({device_id=}) [process_id=4, frame_id=1]
[0626/223936.116746:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted({status=OK}, {params=format: PCM_LOW_LATENCY, channel_layout: 3, channels: 2, sample_rate: 44100, frames_per_buffer: 512, effects: 0, mic_positions: }, {device_id=default}) [process_id=4, frame_id=1]
[0626/223936.116826:VERBOSE1:media_stream_manager.cc(705)] RFAOSF::AuthorizationCompleted => (authorization time=1 ms) [process_id=4, frame_id=1]
Therefore, we should be able to stop the "media_stream_manager" implementation on Chromium (and its equivalent in other implementations) by "process_id" and "frame_id".
Teleconf: As we already agreed on supporting a Worker, we'll work on that first, and ServiceWorker at a lower priority while we gather info on use cases and implications of supporting that.
Note: As long as there is a 5 minute restriction for ServiceWorker that makes the context become "inactive", this feature will not be particularly useful given alternative approaches.
Due to the Chromium/Chrome MV3 extension ServiceWorker implementation, capturing system audio output, live media streams, and processing data that exceeds 5 minutes are not possible without workarounds, which, when I last tested, did not achieve the expected result.
I instead created a Window that is not displayed to capture system audio and specific devices without using a ServiceWorker at all: https://bugs.chromium.org/p/chromium/issues/detail?id=1189678#c50.
I'd like to add in another use case here: performance in complex production apps.
If AudioContext is available in worker threads, it means that a worker thread can be dedicated specifically to schedule events on the AudioContext (param automation, creating/connecting nodes), without any interference caused by potential main thread load, most commonly caused by UI/DOM (which can be optimized by itself of course).
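A sketch of what that could look like if #2423 lands (AudioContext in a worker is hypothetical today; the worker file name and message shape are assumptions):

// main.js: delegate audio scheduling to a dedicated worker, away from UI/DOM load.
const worker = new Worker('audio-worker.js'); // hypothetical worker script
worker.postMessage({ type: 'note', frequency: 440, delay: 0.5 });

// audio-worker.js: hypothetical, assuming AudioContext is exposed in workers.
const ac = new AudioContext();
onmessage = ({ data }) => {
  if (data.type === 'note') {
    const osc = new OscillatorNode(ac, { frequency: data.frequency });
    osc.connect(ac.destination);
    // Scheduling happens here, unaffected by main-thread jank.
    osc.start(ac.currentTime + data.delay);
    osc.stop(ac.currentTime + data.delay + 1);
  }
};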
how is this different from #2423 ?
Primarily this issue was filed for the Chromium Manifest Version 3 ServiceWorker (additionally, fetch() and WebTransport are not defined in AudioWorkletGlobalScope), to illuminate the fact that ServiceWorkers become inactive after 5 minutes without some workaround, while ServiceWorker has onfetch defined, where we should be able to stream in/from the ServiceWorker.
As per my understanding, WorkerGlobalScope is an abstraction over both Web Worker and Service Worker. #2423 requests BaseAudioContext to be exposed to WorkerGlobalScope.
ServiceWorkerGlobalScope is not WorkerGlobalScope. I suggest reading at least https://bugs.chromium.org/p/chromium/issues/detail?id=1131236 and https://bugs.chromium.org/p/chromium/issues/detail?id=1152255 for details as to why this issue is different, if at all, from #2423.
The last time I checked, Firefox did not support Transferable Streams or Media Capture Transform https://github.com/w3c/mediacapture-transform, where we can stream at least raw data from a worker thread to a non-worker thread using the Streams API (e.g., Test infinite Opus stream https://plnkr.co/edit/bK1BfoSgjFUDwkIV?preview). That does not deal with the 5 minute restriction baked in to ServiceWorker, were ServiceWorkers capable of rendering sound, which is technically possible using headless.
This issue, again, seeks to illuminate the fact that even if/when audio rendering is specified for ServiceWorker specifically, unless the 5 minute restriction is removed from specification spirit and implementations, and workarounds are not used, the ServiceWorker will become inactive in 5 minutes. Web Audio API and Service Worker specification authors need to work that out. The only sound conclusion from my perspective here is getting rid of the 5 minute restriction, though more work is involved than just that.
@GeorgeTailor

- how is this different from #2423 ?

One difference between Worker and ServiceWorker is the capability to serve a media response to clients in an onfetch handler.
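A minimal sketch of that capability (the path and upstream URL are hypothetical):

// ServiceWorker: serve a streamed media Response to clients from onfetch.
self.addEventListener('fetch', (event) => {
  if (new URL(event.request.url).pathname.endsWith('/live.wav')) {
    event.respondWith(
      (async () => {
        // Proxy an upstream source; the body streams to the client as it arrives.
        const upstream = await fetch('https://example.com/source.wav');
        return new Response(upstream.body, {
          headers: { 'Content-Type': 'audio/wav' },
        });
      })()
    );
  }
});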
We can consistently keep a ServiceWorker persistent using several workarounds, including FetchEvent, messaging, et al.
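For example, the FetchEvent variant can be as small as a client periodically requesting any resource in scope (a sketch; the file name is an assumption):

// In any client (page, iframe) controlled by the ServiceWorker: each request
// dispatches a FetchEvent in the worker, resetting the worker's idle timer.
setInterval(() => fetch('./keepalive.txt', { cache: 'no-store' }), 1000 * 15);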
2022 TPAC: The Worker support for BaseAudioContext is already planned. We'll keep this issue for the future discussion for the ServiceWorker support.
Current options/workarounds for streaming audio from a ServiceWorker:

- a window or Tab, using MediaSession with an <audio> element
- mpv in a Native Messaging host to stream media at the OS level

I will leave it to the readers to discuss the pros/cons of each workaround. Ideally we will fetch media in the ServiceWorker, perhaps using BackgroundFetch, which is currently not exposed in extension ServiceWorker, and use MediaSession from the ServiceWorker to control the media being played at the global media controls.
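For reference, a sketch of the MediaSession side in the document that owns the <audio> element (metadata values are placeholders):

// Expose metadata and handlers so the global media controls can drive playback.
const audio = document.querySelector('audio');
navigator.mediaSession.metadata = new MediaMetadata({
  title: 'Stream title', // placeholder
  artist: 'Artist',      // placeholder
  album: 'Album',        // placeholder
});
navigator.mediaSession.setActionHandler('play', () => audio.play());
navigator.mediaSession.setActionHandler('pause', () => audio.pause());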
I think the design should be something like: the ServiceWorker creates an AudioContext that is streamed to an audio-only Picture-in-Picture window.
Currently a PiP window needs at least 1 frame of a video track to be written to avoid an error. So we would need cooperation from the PiP folks to either create an AudioWindow roughly equivalent to an HTMLAudioElement with controls, or get rid of the video track requirement for PiP, and to be able to create such a window from the ServiceWorker.
I noticed during experimentation that when a MediaElementAudioSourceNode is connected to a MediaStreamAudioDestinationNode, then streamed to a PiP window with 1 video frame written thereto just to launch the window, I can actually pause playback of the live stream. I need to test more to verify.
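The audio graph under test looks roughly like this (a sketch of the experiment described above; the PiP plumbing is omitted):

const ac = new AudioContext();
const audio = document.querySelector('audio');
// Route the element's output through Web Audio into a live MediaStream.
const source = new MediaElementAudioSourceNode(ac, { mediaElement: audio });
const destination = new MediaStreamAudioDestinationNode(ac);
source.connect(destination);
// destination.stream can then be attached to a <video> element with one
// video frame written, which is what the PiP window currently requires.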
Another option is to make BaseAudioContext transferable.
2023 TPAC Audio WG Discussion: The WG will not pursue this. ServiceWorker has nothing to do with audio playback and processing. The Web Worker support is planned (https://github.com/WebAudio/web-audio-api/issues/2423).
To add: to make this idea work, the lifetime of ServiceWorker and how it interacts with BaseAudioContext needs to be clearly defined. The WG believes that this line of work is outside the scope of the current charter.
If you are planning on specifying this in a Worker, you might as well specify this for a ServiceWorker.
We can keep the ServiceWorker active indefinitely for the use case of an Internet radio station, or let the ServiceWorker life cycle end in ~5 minutes.
It's a shame this is being closed.
Now the implementation in a ServiceWorker will be outside of your reach, completely controlled by users hacking up ServiceWorker and Web Audio API.
- 2023 TPAC Audio WG Discussion: The WG will not pursue this. ServiceWorker has nothing to do with audio playback and processing. The Web Worker support is planned (https://github.com/WebAudio/web-audio-api/issues/2423).
@padenot @hoch It was left open before to track this request; any suggestion of where else a request for some kind of audio support in the background can be filed? Through ServiceWorker, SharedWorker, or something else.
@voxpelli Technically we can use a browser extension to create an offscreen document where we play audio "in the background".
Something like
chrome.action.onClicked.addListener(async (tab) => {
  // Recreate the offscreen document on each toolbar click.
  if (await chrome.offscreen.hasDocument()) {
    await chrome.offscreen.closeDocument();
  }
  // 'AUDIO_PLAYBACK' is the more fitting reason outside of testing.
  await chrome.offscreen.createDocument({
    url: 'index.html',
    reasons: ['TESTING'],
    justification: '',
  });
});
index.html
<script type="module" src="./index.js"></script>
index.js
(async _ => {
  // AudioWorkletStream: the author's library for streaming audio through an AudioWorklet.
  let workletStream = new AudioWorkletStream({
urls: [
'house--64kbs-0-wav',
'house--64kbs-1-wav',
'house--64kbs-2-wav',
'house--64kbs-3-wav',
],
latencyHint: 0,
workletOptions: {
numberOfInputs: 1,
numberOfOutputs: 2,
channelCount: 2,
processorOptions: {
codec: 'audio/wav',
offset: 0
},
},
});
})();
Basically, this is AudioWorkletStream in an offscreen document, for now.
In the ServiceWorker we can use onfetch to intercept and manipulate any data requested with import in the AudioWorkletGlobalScope, by checking the destination of the request to make sure it is 'audioworklet', or when an intermediary Worker is used to make fetch requests supplying data for the AudioWorklet. For this version the only thing left is creating a UI akin to the Media Session API, which includes artist, album, artwork, etc.; see sw-extension-audio, which can be controlled from any Web page as the stream continues, potentially indefinitely, in the background.
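The destination check mentioned above can look like this (a sketch; the transform applied to the response is left as a comment):

// ServiceWorker: intercept module/data requests made with import in
// AudioWorkletGlobalScope; such requests have destination 'audioworklet'.
onfetch = (e) => {
  if (e.request.destination === 'audioworklet') {
    e.respondWith(
      (async () => {
        const response = await fetch(e.request);
        // Manipulate the fetched data here before handing it to the worklet.
        return response;
      })()
    );
  }
};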
I will also begin working on delivering data directly from the ServiceWorker to the underlying speakers or headphones, without necessarily using AudioContext to do that.
All options are available to hack and exploit ServiceWorker and Web Audio API to do whatever we want for this use case, without any specification restrictions.
Rudiments of a UI in the form of an extension popup controllable from any tab, utilizing BroadcastChannel between a popup, ServiceWorker, and background HTML document.
controller.html
<!doctype html>
<html>
<head>
<script src="./controller.js"></script>
</head>
<body>
<button>Start</button><button>Suspend</button><button>Resume</button>
</body>
</html>
controller.js
onload = async () => {
const bc = new BroadcastChannel("offscreen");
const [start, suspend, resume] = document.querySelectorAll("button");
start.onclick = async () =>
// (await navigator.serviceWorker.ready).active
bc.postMessage("start");
suspend.onclick = async () =>
// (await navigator.serviceWorker.ready).active
bc.postMessage("suspend");
resume.onclick = async () =>
// (await navigator.serviceWorker.ready).active
bc.postMessage("resume");
};
background.js (ServiceWorker)
const bc = new BroadcastChannel("offscreen");
bc.onmessage = async (e) => {
if (e.data === "start") {
if (await chrome.offscreen.hasDocument()) {
await chrome.offscreen.closeDocument();
}
return await chrome.offscreen.createDocument({
url: "index.html",
reasons: ["TESTING"],
justification: "",
});
}
bc.postMessage(e.data);
};
oninstall = async (event) => {
console.log(event);
event.waitUntil(self.skipWaiting());
};
onactivate = async (event) => {
console.log(event);
event.waitUntil(self.clients.claim());
};
onfetch = async (e) => {
console.log(e.request.url, e.request.destination);
};
// ...
index.js (offscreen document)
globalThis.bc = new BroadcastChannel("offscreen");
bc.onmessage = async (e) => {
console.log(e.data);
if (e.data === "resume") await workletStream.ac.resume();
if (e.data === "suspend") await workletStream.ac.suspend();
};
globalThis.workletStream = new AudioWorkletStream({...});
Next we will experiment with streaming audio from the ServiceWorker to a local audio application or sound server, e.g., mpv or PulseAudio.
@voxpelli The roadmap for this work is unclear to me because of the complexity of the interaction between ServiceWorker and AudioContext. I think we need at least two things to reopen and reprioritize this issue in the Audio WG:
Also, closing this issue only means that the WG has other priorities. When the priority changes in the future, the group will definitely reopen this and invite more opinions.
@padenot Please feel free to add any other rationales you have in mind.
I generally agree with what @hoch said.
Audio playback on the web is fundamentally tied to an active document for a host of reasons, and this would be a fundamental change to how a User-Agent is expected to behave by its users.
To take a parallel, on mobile or desktop, in native, there's always something (an app, a command-line program, a widget, or sometimes the system itself in the case of a notification) that is the cause of the sound, and the user can understand what's going on and can easily interact with the thing that is making sound, for example to pause the audio playback.
I'm not saying it's impossible, but rather that there are significant challenges to overcome before this can be worked on.
It's also unclear to me what use-cases this would solve.
- It's also unclear to me what use-cases this would solve.

Playing audio in the ServiceWorker; the ServiceWorker using local files, where applicable. We have a detached media stream, not necessarily tied to any document, with MediaStreamTrackGenerator and MediaStreamTrackProcessor.
This would allow us to pipe to the headphones or speakers in the ServiceWorker, leaving the DOM to the DOM.
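A sketch of the generator half (a Chromium-only API; the producer of AudioData frames is an assumption):

// Create a detached audio MediaStreamTrack not tied to any capture source.
const generator = new MediaStreamTrackGenerator({ kind: 'audio' });
const writer = generator.writable.getWriter();
// Write AudioData frames from any producer, e.g. an AudioDecoder's output:
// await writer.write(audioData);
const stream = new MediaStream([generator]); // a live stream built from raw frames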
In a browser extension this means navigating to various Web sites without needing to keep a document open somewhere just to play audio. With the Media Session API the user can control the playback, change channels, etc. in a UI on the browser toolbar, without having to keep a dedicated Tab, offscreen document, iframe, or window open somewhere just to play or manipulate audio signals. Keeping those documents open costs, perhaps minimally, and only for historical reasons. The technology exists to implement this now.
In a non-extension ServiceWorker the control of the audio is delegated to the ServiceWorker context, so users who want to listen to a media stream in the background can, potentially indefinitely, with the ServiceWorker fetching and queuing up streams for as long as the user is navigating the site.
I don't think anything needs to change besides writing whatever Web IDL is needed and exposing AudioContext in the ServiceWorkerGlobalScope. Developers can do what they do from there.
If you think people are going to ask questions, you can do something like navigator.permissions.request({name: 'service_worker_audio_context'}): "Grant AudioContext permission for ServiceWorker from origin 'protocol://address'".
We can have dual opt-in, from the document and the ServiceWorker. That would require some ServiceWorker involvement.
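A sketch of that dual opt-in idea; 'service_worker_audio_context' is the hypothetical permission name suggested above, and permissions.request() is itself a proposal, not a shipped API:

// In a document: the first opt-in, a user-visible permission prompt.
// (Inside an async function.)
const status = await navigator.permissions.request({
  name: 'service_worker_audio_context', // hypothetical permission name
});

// In the ServiceWorker: the second opt-in, the worker explicitly enabling audio.
self.addEventListener('message', (e) => {
  if (e.data === 'enable-audio') {
    // Only now would the worker construct audio objects:
    // const ac = new AudioContext(); // once exposed in ServiceWorkerGlobalScope
  }
});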
I don't think there are many challenges, other than the will to experiment and break out of boxes.
Extensions have documents and ways to interact with the code that runs via widgets. MediaStreamTrackGenerator and MediaStreamTrackProcessor are always tied to a document. They can be instantiated in a Worker, but a Worker's lifetime is tied to the lifetime of its document.
The Media Session API requires a document, and its lifetime is tied to this document. All of this is a lifetime problem.
If you want to be able to get a tab out of the way, ask the browser vendor to provide a way to do so. This has nothing to do with audio playback from a service worker.
Well, a document is required to create a non-extension ServiceWorker, so that is not novel.
WebCodecs' AudioData, AudioEncoder, and AudioDecoder are defined in DedicatedWorkerGlobalScope.
I find it amazing that folks in the audio realm are so closed-minded about experimentation. Nothing can go wrong here by providing a means to connect to speakers from the ServiceWorker.
For now I suppose I have to create a proof-of-concept by directly streaming audio from the ServiceWorker to PulseAudio or PipeWire, to demonstrate the Internet is not going to break just because AudioContext is exposed in a ServiceWorker and controlled from any open tab.
Not being able to use AudioContext in a ServiceWorker gives me the impression that the standards team doesn't want web apps to have the same capabilities as traditional desktop apps.
For example, in a web-based music streaming app, I can use BroadcastChannel, SharedWorker, or ServiceWorker to make multiple tabs communicate and coordinate with each other, like displaying the same now-playing information and controlling the playback from any tab.
But the playback itself must happen in one of the tabs; if the user closes that tab, the playback stops. I can't resume it in another tab because starting an AudioContext requires user activation (even if I could, this makes the code more complex, and there might be a short pause when switching). This creates an imperfect experience for users: "Why can I close all these tabs without affecting the playback, but I can't close that one? On the desktop version of [insert music app here], I can close all its windows and the music will continue playing."
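The cross-tab coordination piece is straightforward today (a sketch; updateNowPlayingUi is a placeholder):

// Every tab joins the same channel to mirror now-playing state and relay controls.
const channel = new BroadcastChannel('player');
channel.onmessage = ({ data }) => {
  if (data.type === 'now-playing') updateNowPlayingUi(data.track); // placeholder UI hook
};
// The tab that owns playback broadcasts its state; any tab can send controls.
function broadcastTrack(track) {
  channel.postMessage({ type: 'now-playing', track });
}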
I'm not talking about "retain the service worker and play audio from it even if it has 0 clients/documents"; only allowing tabs to be freely closed/reloaded (because the service worker can live through reloading the last tab) would be a huge improvement.
Similar story for the Media Session API. I have used https://github.com/sinono3/souvlaki to integrate OS media controls in an Electron app; on all of Windows, Linux, and macOS, they don't need a window to have a media session displayed and controlled.
Describe the feature
Expose AudioContext in Worker and ServiceWorker contexts.

Is there a prototype?
No. Currently users need to use Native Messaging to run local media players, for example mpv https://github.com/mpv-player/mpv; see https://github.com/mpv-player/mpv/blob/bc9d556f3a890cf5f99e9dced0117e2d8a91ff09/DOCS/man/javascript.rst, https://github.com/Kagami/mpv.js.

Describe the feature in more detail
The ability to use AudioContext in a Worker, particularly in a ServiceWorker context. See https://bugs.chromium.org/p/chromium/issues/detail?id=1131236.

Use cases:
Playing audio without needing to keep an <iframe> or a dedicated Document open just to play audio.