As you mentioned, the WG's intention is to limit what is exposed on `AudioWorkletGlobalScope` to what is needed for worklet-based audio processing. I am not sure whether there is a way to notate an exception for specific globals using WebIDL.
It also seems like there's an inconsistency between the spec and the actual implementations: Chrome's IDL, Firefox's IDL.
We'll triage this issue in the next teleconference and think about a solution.
See also https://github.com/WebAudio/web-audio-api-v2/issues/79 re `WebAssembly.compileStreaming()` and `WebAssembly.instantiateStreaming()`. Both are defined/exposed on the `WebAssembly` object in `AudioWorkletGlobalScope`, but neither works as intended because `Response` is not exposed in `AudioWorkletGlobalScope`.
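A common workaround, sketched here as my own illustration rather than anything the thread prescribes: fetch the module bytes where `fetch()` and `Response` exist, transfer them through the processor's port, and use the non-streaming `WebAssembly.instantiate()`, which accepts an `ArrayBuffer` and so never needs `Response`. The names `node` and `'processor.wasm'` are hypothetical.

```js
// Main thread: fetch the bytes where fetch()/Response are available,
// then transfer the ArrayBuffer to the worklet through the node's port.
const bytes = await (await fetch('processor.wasm')).arrayBuffer();
node.port.postMessage(bytes, [bytes]);

// In the AudioWorkletProcessor: instantiate from the ArrayBuffer
// instead of the unusable compileStreaming()/instantiateStreaming().
this.port.onmessage = async ({ data }) => {
  const { instance } = await WebAssembly.instantiate(data);
  this.wasm = instance.exports;
};
```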
Although the OP states

> To be clear, I'm not saying they should be added to `AudioWorkletGlobalScope`; I'm saying that it needs to be specified somewhere that they aren't there.

for completeness, an encoding polyfill can be used. I modified https://github.com/inexorabletash/text-encoding/blob/master/lib/encoding.js in order to import `TextEncoder` and `TextDecoder` into a QuickJS application by substituting `globalThis` for `this` and including this at the end of the script:

```js
const { TextEncoder, TextDecoder } = globalThis;
export { TextEncoder, TextDecoder };
```
which we can import into `AudioWorkletGlobalScope`:

```js
import { TextEncoder, TextDecoder } from './encoding.js';

console.log(TextEncoder, TextDecoder);
const encoder = new TextEncoder();
const decoder = new TextDecoder();
let data = encoder.encode('abc');
console.log(data);
console.log(decoder.decode(data));
```

(encoding.js.zip) There are polyfills for `URL`, e.g. https://github.com/lifaon74/url-polyfill, which can be adjusted and imported similarly. Polyfilling `Response` is more involved.
Technically, we can encode and decode strings (at least for code points that fit in a single byte) with the following alone:

```js
// Decode: map byte values to characters.
let encoded = new Uint8Array([97, 98, 99]);
let str = String.fromCharCode(...encoded); // 'abc'
// Encode: map characters back to byte values.
let encode = new Uint8Array([...str].map((s) => s.charCodeAt(0)));
console.log({ encoded, str, encode });
```
@Liamolucko It looks like we need to open issues against the TextEncoder and URL specs. Could you do that?
FYI, TextEncoder/TextDecoder were made `[Exposed=*]` here, https://github.com/whatwg/encoding/commit/f09648a96ef4cffbf68734a5f81b7ececf2e311e, with discussion at https://github.com/tc39/proposal-shadowrealm/issues/331. TL;DR: there is a set of APIs specified as Web APIs that could just as easily have been specified as parts of ECMAScript (streams, encoding, URL, etc.), and it was felt that these should be present in every context.

So... it might require discussion with TC39 folks etc. if you want to revisit that.

More specifically: if you have a reason to exclude these APIs, it would follow that there are parts of ECMAScript that perhaps you'd also want to exclude (e.g. parts of ECMA-402 Internationalization?), now or in the future.
That's a good point. Thanks @inexorabletash!
2023 TPAC Audio WG Discussion: The WG decided to remove some globally exposed interfaces, but https://github.com/WebAudio/web-audio-api/issues/2499#issuecomment-1716624415 suggested that a wider discussion might be needed to exclude them from the AudioWorkletGlobalScope.
Pinging implementors to get more thoughts: @padenot @jyavenard
I think several Web APIs are not exposed/defined in `AudioWorkletGlobalScope` just because, not based on any actual evidence that said non-exposed methods or Web APIs will negatively impact `AudioWorkletProcessor`.

There should be some empirical evidence to base decisions on, rather than mere speculation, or arbitrarily excluding Web APIs for no reason that can be substantiated by facts.

E.g.,

- `fetch()`
- `Response`
- `WebSocket`
- `WebTransport`

to name a few.

Somebody else brought up `atob()`.

I don't see any reason why those global methods should not be defined.
> suggested that a wider discussion might be needed to exclude them from the AudioWorkletGlobalScope.

The problem with excluding methods from an API is that users will nonetheless, inevitably, want to use those methods in the API.
An example of the exclusion or gating of APIs can be observed in Chrome's decision to gate Direct Sockets (`TCPSocket`, `UDPSocket`, `TCPServerSocket`) behind Isolated Web Apps, with all sorts of CSP/COOP/COEP headers and Signed Web Bundles that are supposed to isolate the web application. The idea being that Direct Sockets is a "powerful feature" that has "security" concerns. Forget about providing users a means to set which origins Direct Sockets are exposed on, because, in general, I don't think specification authors and maintainers trust users to make those decisions.

Well, I promptly broke out of the Isolated Web App sandbox using other Web APIs (https://github.com/guest271314/telnet-client/tree/user-defined-tcpsocket-controller-web-api) and proceeded to open a `window` on any arbitrary Web page I decided upon in order to use `TCPSocket` for local communications. Of course, using browser extension code to break out of alleged sandboxes is more direct. I used Web APIs to demonstrate there really is no way to isolate a `window` on the Web.
Just provide a means for users, developers in the field, to decide which global methods or APIs they want to expose in `AudioWorkletGlobalScope`, instead of a top-down hierarchical approach to maintaining the specification without any options for users to decide how they want to use the API.
> The problem with excluding methods from an API is that users will nonetheless, inevitably, want to use those methods in the API.

This is not the case. It is a mistake to use those APIs in a real-time context such as the `AudioWorkletGlobalScope`, because those APIs aren't real-time safe.
> This is not the case. It is a mistake to use those APIs in a real-time context such as the `AudioWorkletGlobalScope`, because those APIs aren't real-time safe.

Based on what tests and empirical evidence that I can vet myself?
> This is not the case. It is a mistake to use those APIs in a real-time context such as the `AudioWorkletGlobalScope`, because those APIs aren't real-time safe.

I think you are misunderstanding how asynchronous code can be useful in a real-time context.

Right now I have to transfer a `ReadableStream`, where multiple `fetch()` requests are piped to the single stream, from a `Worker` to the `AudioWorkletGlobalScope`.

When the stream is read, the data is written to a `Uint8Array`, then audio is output to the destination through the `AudioWorklet`.

I am trying to get rid of the `Worker` creation and the transferring of the `ReadableStream` from the `Worker` or other context to the `AudioWorklet`, and just make the requests and process the data in the `AudioWorkletGlobalScope`.

I don't see why this can't be tested. The worst that can happen is we derive proof, rather than conjecture, about whether asynchronous code running outside of `process() {}` will negatively impact `process() {}` performance and `AudioWorkletProcessor` real-time processing. I think it is fair to test the hypothesis and conjecture, and get some facts we can go over.

To be clear, I'm talking about exposing certain asynchronous globals in `AudioWorkletGlobalScope`, not `AudioWorkletProcessor` scope. The reverse of `port`, which is defined on `AudioWorkletProcessor`, not on `AudioWorkletGlobalScope`.
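A minimal sketch illustrating that distinction (my own illustration, not from the thread; the processor name `'port-demo'` is hypothetical):

```js
// Module evaluated in AudioWorkletGlobalScope: there is no `port` at the
// top level; each node/processor pair gets its own MessagePort instead.
registerProcessor('port-demo', class extends AudioWorkletProcessor {
  constructor() {
    super();
    // `port` exists here, on the processor instance.
    this.port.onmessage = ({ data }) => console.log(data);
  }
  process(inputs, outputs) {
    return true; // keep the processor alive
  }
});
```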
I'm not talking about trying to use `fetch()`, `WebTransport`, `Response`, or `import()` in `process()`.
> Based on what tests and empirical evidence that I can vet myself?

Real-time safety can be assessed theoretically in the cases we care about here, by simply thinking about the constructs in use in the `AudioWorkletGlobalScope`. https://en.wikipedia.org/wiki/Real-time_computing explains what real-time safety means. Any code in `AudioWorkletGlobalScope` should be real-time-safe. An exception is when the `AudioContext` is suspended, in which case some (often preparatory and otherwise non-real-time) code can run.

It follows that anything but the smallest amount of necessary computation is to be done on a different thread than the thread the `AudioWorkletGlobalScope` uses. Generally, that is reading and writing memory and performing computations, often digital signal processing. Any allocation, lock usage, or system call (including any form of network or file IO) is proscribed. Anything that can be done in a different scope is better done there. Text manipulation or anything of that sort has no place in this scope. On the web platform, we also need to add anything that can create JavaScript objects, because that causes garbage collection to happen, and garbage collection in JavaScript engines is very optimized, but not real-time safe.

https://paul.cx/public/karplus-stress-tester/ is an empirical test you can run; the source is available. https://blog.paul.cx/post/a-wait-free-spsc-ringbuffer-for-the-web/ is a discussion of this, and shows how to precisely assess the performance you're getting. You're welcome to write your own tests; you'll find the same result.

As said in the article above, communication to and from this scope is better done using wait-free or lock-free data structures, which are real-time-safe, as widely documented in the literature; this classic article is a good introduction to the entire field.
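For illustration, a minimal sketch of such a wait-free SPSC design (my own sketch, assuming exactly one producer thread and one consumer thread; see the ring-buffer article linked above for a production-quality version):

```js
// Single-producer/single-consumer ring buffer over a SharedArrayBuffer.
// One thread only calls push(), the other only calls pop(); the read and
// write indices are exchanged with Atomics, so no locks are needed and
// neither side ever blocks.
class SpscRing {
  constructor(sab, capacity) {
    this.indices = new Uint32Array(sab, 0, 2); // [0]=read, [1]=write
    this.data = new Float32Array(sab, 8, capacity);
    this.capacity = capacity;
  }
  push(samples) { // producer side (e.g. a Worker doing the fetching)
    const read = Atomics.load(this.indices, 0);
    let write = Atomics.load(this.indices, 1);
    for (const s of samples) {
      const next = (write + 1) % this.capacity;
      if (next === read) break; // full: drop rather than block
      this.data[write] = s;
      write = next;
    }
    Atomics.store(this.indices, 1, write);
  }
  pop(out) { // consumer side (process() in the worklet)
    let read = Atomics.load(this.indices, 0);
    const write = Atomics.load(this.indices, 1);
    let i = 0;
    while (i < out.length && read !== write) {
      out[i++] = this.data[read];
      read = (read + 1) % this.capacity;
    }
    Atomics.store(this.indices, 0, read);
    return i; // samples actually delivered
  }
}

// const sab = new SharedArrayBuffer(8 + 4 * 4096);
// const ring = new SpscRing(sab, 4096); // one instance per thread
```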
And we've tested asynchronous code extensively while implementing: it's a disaster for performance and causes glitches in the simplest cases. It's always possible to write bad code on any platform, but if we can, it's better to not allow it.
> Real-time safety can be assessed theoretically in the cases we care about here, by simply thinking about the constructs in use in the AudioWorkletGlobalScope.

No, you can't theoretically imagine anything. You have to test to get empirical results.

Per my theory, nothing disastrous is going to happen by exposing `fetch()` in `AudioWorkletGlobalScope`.

You have no evidence to refute my theory.

None of the links you provided expose and test `fetch()` in `AudioWorkletGlobalScope`, so I'm not sure what weight you are assigning to them to support your theory.

> …it's better to not allow it.

That's just an opinion, devoid of accompanying facts and tests I can run locally to verify your theory.

That's how science works.

Nothing disastrous can happen by testing your theory, which you clearly have not done.

I find it remarkable that people involved in audio are so conservative that they won't even test their own or other people's theories to actually get facts, and instead rely on their own opinion, and no more.
Since streams are transferable, I already transfer a `ReadableStream` into the `AudioWorkletGlobalScope` and read the data asynchronously in the `AudioWorkletGlobalScope`, which I access in `process() {}`. So your theory doesn't match the reality of what I do right now. I'm just trying to skip the transferring part.

Per your theory something should be going wrong here: gaps, glitches, something amiss. That ain't happening. See all that asynchronous code below? We should be able to `fetch()` right in the `AudioWorkletGlobalScope`.
```js
import { CODECS } from './codecs.js';

class AudioDataWorkletStream extends AudioWorkletProcessor {
  constructor(options) {
    super(options);
    // Resizable buffer large enough for the whole stream.
    this.buffer = new ArrayBuffer(70982024, { maxByteLength: 70982024 });
    // `codec` and the initial `offset` are expected via processorOptions.
    Object.assign(this, options.processorOptions, {
      uint8: new Uint8Array(this.buffer),
    });
    this.started = false;
    this.port.onmessage = this.appendBuffers.bind(this);
  }
  async appendBuffers({ data: { readable } }) {
    globalThis.console.log(currentTime, currentFrame);
    let index = 0;
    const reader = readable.getReader();
    const stream = async ({ value, done }) => {
      if (done) {
        await reader.closed;
        return 'read/write done';
      }
      // Skip the 44-byte WAV header on the first chunk only.
      for (let i = !index ? 44 : 0; i < value.length; i++) {
        this.uint8[index++] = value[i];
        // accumulate 1 second of data
        // to avoid glitches at beginning of playback
        if (index === 44100) {
          this.port.postMessage({ start: true });
        }
      }
      return stream(await reader.read());
    };
    const processStream = await stream(await reader.read());
    console.log(processStream, currentTime, currentFrame);
    Object.assign(this, { readable });
  }
  endOfStream() {
    this.port.postMessage({
      ended: true,
      currentTime,
      currentFrame,
    });
    this.buffer.resize(0);
    console.log(this.buffer);
  }
  process(inputs, outputs) {
    if (this.offset >= this.uint8.length) {
      this.endOfStream();
      return false;
    }
    const channels = outputs.flat();
    const uint8 = new Uint8Array(512);
    for (let i = 0; i < 512; i++, this.offset++) {
      if (this.offset >= this.uint8.length) {
        break;
      }
      uint8[i] = this.uint8[this.offset];
    }
    const uint16 = new Uint16Array(uint8.buffer);
    CODECS.get(this.codec)(uint16, channels);
    return true;
  }
}

registerProcessor('audio-data-worklet-stream', AudioDataWorkletStream);
```
> 2023 TPAC Audio WG Discussion: The WG decided to remove some globally exposed interfaces, but #2499 (comment) suggested that a wider discussion might be needed to exclude them from the AudioWorkletGlobalScope.
>
> Pinging implementors to get more thoughts: @padenot @jyavenard

Was there any further discussion or conclusion on which interfaces exactly should not be exposed?

Coming from Wasm here, it would be good to know if `TextDecoder` and `TextEncoder` will be supported or not.
We do not have a comprehensive list of blocked APIs yet, but these are good questions to ask:

1. Does it do any significant memory allocation within its operation?
2. Does it have any unbounded thread-blocking operation?

If the answer to either is yes, then it is likely to be excluded from the AudioWorkletGlobalScope.
@padenot has pointed out in Mozilla's Bugzilla that any allocation is bad, and considering there is already a non-allocating API for `TextEncoder` (`TextEncoder.encodeInto()`), that one could be exposed instead.

So the proposal is to expose `TextDecoder` fully and `TextEncoder` partly (excluding `TextEncoder.encode()`).
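For reference, a sketch of the difference (my own illustration; the buffer size is arbitrary): `encodeInto()` writes into a caller-supplied buffer, so the allocation can happen once, outside the real-time path, whereas `encode()` allocates a fresh `Uint8Array` on every call.

```js
const encoder = new TextEncoder();

// Allocate once, e.g. during setup while the context is suspended.
const scratch = new Uint8Array(1024);

// Non-allocating path: writes into the preallocated buffer and
// reports how many characters were read and bytes written.
const { read, written } = encoder.encodeInto('abc', scratch);
console.log(read, written, scratch.subarray(0, written));

// Allocating path (the one proposed for exclusion): new array per call.
const fresh = encoder.encode('abc');
```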
It was suggested to ask for clarification in WHATWG Encoding, which I would like to do after getting some official statement here. Does that sound like an appropriate plan?
Describe the issue

There are several globals, such as `TextEncoder`, `TextDecoder` and `URL`, which seem to have been intentionally left out of `AudioWorkletGlobalScope`. However, these are specified with `[Exposed=*]` in WebIDL, which means they should be available on every global object. The definition of `AudioWorkletGlobalScope` uses `[Global]`, so it should technically contain these.

To be clear, I'm not saying they should be added to `AudioWorkletGlobalScope`; I'm saying that it needs to be specified somewhere that they aren't there.

It's possible I may have just missed a note about this somewhere in the spec, but I couldn't find anything.
Where Is It

I'm not entirely sure whether this should be fixed in the Web Audio spec itself or the specs which define these globals with `[Exposed=*]`. Some possibilities, though:

- `AudioWorkletGlobalScope`
- `WorkletGlobalScope`
- `TextEncoder`
- `TextDecoder`
- `URL`