WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Globals specified with `[Exposed=*]` not present in `AudioWorkletGlobalScope` #2499

Open Liamolucko opened 2 years ago

Liamolucko commented 2 years ago

Describe the issue

There are several globals, such as TextEncoder, TextDecoder and URL, which seem to have been intentionally left out of AudioWorkletGlobalScope. However, these are specified with [Exposed=*] in WebIDL, which means they are exposed on every global object, i.e. on every interface annotated with [Global].

The definition of AudioWorkletGlobalScope uses [Global], so it should technically contain them:

[Global=(Worklet, AudioWorklet), Exposed=AudioWorklet]
interface AudioWorkletGlobalScope : WorkletGlobalScope {
  undefined registerProcessor (DOMString name,
                               AudioWorkletProcessorConstructor processorCtor);
  readonly attribute unsigned long long currentFrame;
  readonly attribute double currentTime;
  readonly attribute float sampleRate;
};
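
For anyone wanting to check what a given engine actually does (results currently differ between browsers, as noted further down this thread), here is a minimal probe sketch; ProbeProcessor and its registration name are hypothetical:

class ProbeProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    // Report what this AudioWorkletGlobalScope actually exposes.
    this.port.postMessage({
      TextEncoder: typeof globalThis.TextEncoder,
      TextDecoder: typeof globalThis.TextDecoder,
      URL: typeof globalThis.URL,
    });
  }
  process() {
    return false; // no audio needed; run once and stop
  }
}
registerProcessor('probe-processor', ProbeProcessor);

Register the module with audioWorklet.addModule(), construct an AudioWorkletNode with the name 'probe-processor', and listen on node.port for the result.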

To be clear, I'm not saying they should be added to AudioWorkletGlobalScope; I'm saying that it needs to be specified somewhere that they aren't there.

It's possible I may have just missed a note about this somewhere in the spec, but I couldn't find anything.

Where Is It

I'm not entirely sure whether this should be fixed in the Web Audio spec itself or in the specs which define these globals with [Exposed=*].

Some possibilities, though:

hoch commented 2 years ago

As you mentioned, the WG's intention is to limit what AudioWorkletGlobalScope exposes to what is needed for worklet-based audio processing. I am not sure if there's a way to express the exclusion of specific globals in WebIDL.

It also seems like there's an inconsistency between the spec and the actual implementations: Chrome's IDL, Firefox's IDL.

We'll triage this issue in the next teleconference and think about a solution.

guest271314 commented 2 years ago

See also https://github.com/WebAudio/web-audio-api-v2/issues/79 re WebAssembly.compileStreaming() and WebAssembly.instantiateStreaming(). Both are defined/exposed on the WebAssembly object in AudioWorkletGlobalScope, but neither works as intended because Response is not exposed in AudioWorkletGlobalScope.
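
A workaround sketch for the WebAssembly case, assuming the module bytes are fetched where Response exists and posted in over the node's MessagePort (the names and module URL are hypothetical): the non-streaming WebAssembly.instantiate(BufferSource) does not need Response.

class WasmProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.wasm = null;
    this.port.onmessage = async ({ data: { wasmBytes } }) => {
      // instantiate(BufferSource) works without Response, unlike
      // compileStreaming()/instantiateStreaming()
      const { instance } = await WebAssembly.instantiate(wasmBytes);
      this.wasm = instance.exports;
    };
  }
  process(inputs, outputs) {
    // call this.wasm exports here once instantiation has completed
    return true;
  }
}
registerProcessor('wasm-processor', WasmProcessor);

// Main thread (or a Worker), where fetch() and Response are available:
// const wasmBytes = await (await fetch('./dsp.wasm')).arrayBuffer();
// node.port.postMessage({ wasmBytes }, [wasmBytes]);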

Although the OP states

To be clear, I'm not saying they should be added to AudioWorkletGlobalScope; I'm saying that it needs to be specified somewhere that they aren't there.

for completeness, an encoding polyfill can be used. I modified https://github.com/inexorabletash/text-encoding/blob/master/lib/encoding.js (originally in order to import TextEncoder and TextDecoder into a QuickJS application) by substituting globalThis for this and including this at the end of the script:

const {TextEncoder, TextDecoder} = globalThis;
export {TextEncoder, TextDecoder};

which we can import into AudioWorkletGlobalScope

import {TextEncoder, TextDecoder} from './encoding.js';
console.log(TextEncoder, TextDecoder);
const encoder = new TextEncoder();
const decoder = new TextDecoder();
let data = encoder.encode('abc');
console.log(data);
console.log(decoder.decode(data));

(encoding.js.zip attached.) There are polyfills for URL, e.g. https://github.com/lifaon74/url-polyfill, which can be adjusted and imported similarly. Polyfilling Response is more involved.

guest271314 commented 2 years ago

Technically, we can encode and decode strings with the following alone:

let encoded = new Uint8Array([97, 98, 99]);
let str = String.fromCharCode(...encoded);
let encode = new Uint8Array([...str].map((s) => s.charCodeAt()));
console.log({encoded, str, encode});

hoch commented 2 years ago

@Liamolucko It looks like we need to open issues against TextEncoder and URL specs. Could you do that?

inexorabletash commented 1 year ago

FYI, TextEncoder/TextDecoder were made [Exposed=*] here https://github.com/whatwg/encoding/commit/f09648a96ef4cffbf68734a5f81b7ececf2e311e with discussion at https://github.com/tc39/proposal-shadowrealm/issues/331 - TL;DR there are a set of APIs specified as Web APIs that could just as easily have been specified as parts of ECMAScript (streams, encoding, URL, etc) and it was felt that these should be present in every context.

So... might require discussion w/ TC39 folks etc. if you want to revisit that.

inexorabletash commented 1 year ago

More specifically: if you have a reason to exclude these APIs, it would follow there are parts of ECMAScript that perhaps you'd also want to exclude (e.g. parts of ECMA-402 Internationalization?), now or in the future.

hoch commented 1 year ago

That's a good point. Thanks @inexorabletash!

hoch commented 1 year ago

2023 TPAC Audio WG Discussion: The WG decided to remove some globally exposed interfaces, but https://github.com/WebAudio/web-audio-api/issues/2499#issuecomment-1716624415 suggested that a wider discussion might be needed to exclude them from the AudioWorkletGlobalScope.

Pinging implementors to get more thoughts: @padenot @jyavenard

guest271314 commented 1 year ago

I think several Web APIs are not exposed/defined in AudioWorkletGlobalScope just because, not based on any actual evidence that said non-exposed methods or Web APIs will negatively impact AudioWorkletProcessor.

There should be some empirical evidence to base decisions on, rather than mere speculation, or arbitrarily excluding Web APIs for reasons that cannot be substantiated by facts.

E.g.,

to name a few.

Somebody else brought up atob().

I don't see any reason why those global methods should not be defined.

guest271314 commented 1 year ago

suggested that a wider discussion might be needed to exclude them from the AudioWorkletGlobalScope.

The problem with excluding methods from an API is that users will nonetheless, inevitably, want to use those methods in the API.

An example of the exclusion or gating of APIs can be observed in Chrome's decision to gate Direct Sockets (TCPSocket, UDPSocket, TCPServerSocket) behind Isolated Web Apps, with all sorts of CSP/COOP/COEP headers and Signed Web Bundles that are supposed to isolate the web application. The idea being that Direct Sockets is a "powerful feature" that has "security" concerns. Forget about providing users a means to set which origins Direct Sockets are exposed on; in general, I don't think specification authors and maintainers trust users to make those decisions.

Well, I promptly broke out of the Isolated Web App sandbox using other Web APIs (https://github.com/guest271314/telnet-client/tree/user-defined-tcpsocket-controller-web-api) and proceeded to open a window on any arbitrary Web page I decided upon in order to use TCPSocket for local communications. Of course, using browser extension code to break out of alleged sandboxes is more direct. I used Web APIs to demonstrate there really is no way to isolate a window on the Web.

Just provide a means for users, developers in the field, to decide which global methods or APIs they want to expose in AudioWorkletGlobalScope, instead of a top-down, hierarchical approach to maintaining the specification without any options for users to decide how they want to use the API.

padenot commented 1 year ago

The problem with excluding methods from an API is that users will nonetheless, inevitably, want to use those methods in the API.

This is not the case. It is a mistake to use those APIs in a real-time context such as the AudioWorkletGlobalScope, because those APIs aren't real-time safe.

guest271314 commented 1 year ago

This is not the case. It is a mistake to use those APIs in a real-time context such as the AudioWorkletGlobalScope, because those APIs aren't real-time safe.

Based on what tests and empirical evidence that I can vet myself?

guest271314 commented 1 year ago

This is not the case. It is a mistake to use those APIs in a real-time context such as the AudioWorkletGlobalScope, because those APIs aren't real-time safe.

I think you are misunderstanding how asynchronous code can be useful in a real-time context.

Right now I have to transfer a ReadableStream, where multiple fetch() requests are piped to the single stream, from a Worker to the AudioWorkletGlobalScope.

When the stream is read, the data is written to a Uint8Array, then audio is output to the destination through the AudioWorklet.

I am trying to get rid of the Worker creation and the transferring of the ReadableStream from the Worker or other context to the AudioWorklet, and instead just make the requests and process the data in the AudioWorkletGlobalScope.

I don't see why this can't be tested. The worst that can happen is we derive proof, rather than conjecture, about whether asynchronous code running outside of process() {} negatively impacts process() {} performance. I think it is fair to test the hypothesis and the conjecture, and get some facts we can go over.
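
For reference, the transfer step described above might look like the following sketch (ReadableStream is transferable per the Streams spec; the URLs, the context variable, and the processor name are assumptions, the latter matching the processor registered later in this thread):

// Main thread or Worker: pipe multiple fetch() responses into one
// stream, then transfer its readable side into AudioWorkletGlobalScope.
const { readable, writable } = new TransformStream();
(async () => {
  for (const url of ['./part1.raw', './part2.raw']) { // hypothetical URLs
    const response = await fetch(url);
    // preventClose keeps the writable open between parts
    await response.body.pipeTo(writable, { preventClose: true });
  }
  await writable.close();
})();
const node = new AudioWorkletNode(context, 'audio-data-worklet-stream');
node.port.postMessage({ readable }, [readable]);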

guest271314 commented 1 year ago

To be clear, I'm talking about exposing certain asynchronous globals in AudioWorkletGlobalScope, not in AudioWorkletProcessor scope.

The reverse of port, which is defined on AudioWorkletProcessor but not on AudioWorkletGlobalScope.

I'm not talking about trying to use fetch(), WebTransport, Response, or import() in process().
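
In other words, the ask is for asynchronous setup at the top level of the worklet module, outside the real-time callback. A sketch of what that could look like if fetch() were exposed (it currently is not; the URL and processor name are hypothetical):

// Top level of the worklet module script: evaluated once, in
// AudioWorkletGlobalScope, when audioWorklet.addModule() runs.
const configReady = fetch('./config.json') // hypothetical; not exposed today
  .then((response) => response.json());

class ConfiguredProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.config = null;
    // asynchronous setup resolves outside process()
    configReady.then((config) => { this.config = config; });
  }
  process(inputs, outputs) {
    if (!this.config) return true; // output silence until configured
    // ... real-time processing using this.config ...
    return true;
  }
}
registerProcessor('configured-processor', ConfiguredProcessor);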

padenot commented 1 year ago

Based on what tests and empirical evidence that I can vet myself?

Real-time safety can be assessed theoretically in the cases we care about here, by simply thinking about the constructs in use in the AudioWorkletGlobalScope. https://en.wikipedia.org/wiki/Real-time_computing explains what real-time safety means. Any code in AudioWorkletGlobalScope should be real-time safe. An exception is when the AudioContext is suspended, in which case some preparatory and otherwise non-real-time code can run.

It follows that anything but the simplest amount of necessary computation is to be done on a different thread than the one the AudioWorkletGlobalScope uses. Generally, that work is reading and writing memory and performing computations, often digital signal processing. Any allocation, lock usage, or system call (including any form of network or file I/O) is proscribed. Anything that can be done in a different scope is better done there. Text manipulation, or anything of that sort, has no place in this scope. On the web platform, we also need to add anything that can create JavaScript objects, because that causes garbage collection to happen, and garbage collectors in JavaScript engines are very optimized, but not real-time safe.

https://paul.cx/public/karplus-stress-tester/ is an empirical test you can run; the source is available. https://blog.paul.cx/post/a-wait-free-spsc-ringbuffer-for-the-web/ is a discussion of this. You're welcome to write your own tests; you'll find the same result. The article shows how to precisely assess the performance you're getting.

As said in the article above, communication to and from this scope is better done using wait-free or lock-free data structures, which are real-time safe and widely documented in the literature; this classic article is a good introduction to the entire field.
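
To illustrate the approach, here is a simplified single-producer/single-consumer ring buffer sketch in the spirit of the article above (not a drop-in implementation; the capacity and layout are arbitrary choices):

// One producer (e.g. a Worker) and one consumer (the audio thread)
// share a SharedArrayBuffer; indices are published with Atomics, and
// neither side ever blocks or allocates after setup.
const CAPACITY = 4096;
const sab = new SharedArrayBuffer(8 + CAPACITY * 4);
const indices = new Uint32Array(sab, 0, 2); // [readIndex, writeIndex]
const samples = new Float32Array(sab, 8, CAPACITY);

// Producer: copy in as much as fits, then publish the new write index.
function push(data) {
  const read = Atomics.load(indices, 0);
  const write = Atomics.load(indices, 1);
  const free = (read - write - 1 + CAPACITY) % CAPACITY;
  const n = Math.min(free, data.length);
  for (let i = 0; i < n; i++) {
    samples[(write + i) % CAPACITY] = data[i];
  }
  Atomics.store(indices, 1, (write + n) % CAPACITY);
  return n; // samples accepted
}

// Consumer (inside process()): read what is available, never wait.
function pop(out) {
  const read = Atomics.load(indices, 0);
  const write = Atomics.load(indices, 1);
  const available = (write - read + CAPACITY) % CAPACITY;
  const n = Math.min(available, out.length);
  for (let i = 0; i < n; i++) {
    out[i] = samples[(read + i) % CAPACITY];
  }
  Atomics.store(indices, 0, (read + n) % CAPACITY);
  return n; // samples delivered
}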

And we've tested asynchronous code extensively while implementing this; it's a disaster for performance and causes glitches even in the simplest cases. It's always possible to write bad code on any platform, but if we can, it's better to not allow it.

guest271314 commented 1 year ago

Real-time safety can be assessed theoretically in the cases we care about here, by simply thinking about the constructs in use in the AudioWorkletGlobalScope.

No, you can't theoretically imagine anything. You have to test to get empirical results.

Per my theory, nothing disastrous is going to happen by exposing fetch() in AudioWorkletGlobalScope.

You have no evidence to refute my theory.

None of the links you provided expose and test fetch() in AudioWorkletGlobalScope, so I'm not sure what weight you are assigning to them to support your theory.

it's better to not allow it.

That's just an opinion, devoid of accompanying facts and tests I can run locally to verify your theory.

That's how science works.

Nothing disastrous can happen by testing your theory, which you clearly have not done.

I find it remarkable that people involved in audio are so conservative that they won't even test their own or other people's theories to actually get facts, and instead rely on their own opinion, and no more.

Since streams are transferable, I already transfer a ReadableStream into the AudioWorkletGlobalScope and read the data asynchronously in the AudioWorkletGlobalScope, then access it in process() {}. So your theory doesn't match the reality of what I do right now. I'm just trying to skip the transferring part.

Per your theory, something should be going wrong here: gaps, glitches, something amiss.

That ain't happening. See all that asynchronous code below?

We should be able to fetch() right in the AudioWorkletGlobalScope.

import { CODECS } from './codecs.js';
class AudioDataWorkletStream extends AudioWorkletProcessor {
  constructor(options) {
    super(options);  
    this.buffer = new ArrayBuffer(70982024, {maxByteLength: 70982024});
    Object.assign(this, options.processorOptions, {
      uint8: new Uint8Array(this.buffer),
    });
    // read offset into this.uint8; may come in via processorOptions,
    // defaulted here so process() starts at the beginning of the buffer
    this.offset ??= 0;
    this.started = false;
    this.port.onmessage = this.appendBuffers.bind(this);
  }
  async appendBuffers({ data: { readable } }) {
    globalThis.console.log(currentTime, currentFrame);
    let index = 0;
    const reader = readable.getReader();
    const stream = async ({ value, done }) => {
      if (done) {
        await reader.closed;
        return 'read/write done';
      }
      for (let i = !index ? 44 : 0; i < value.length; i++) {
        this.uint8[index++] = value[i];
        // accumulate 1 second of data
        // to avoid glitches at beginning of playback
        if (index === 44100) {
          this.port.postMessage({ start: true });
        }
      }
      return stream(await reader.read());
    };
    const processStream = await stream(await reader.read());
    console.log(processStream, currentTime, currentFrame);
    Object.assign(this, { readable });
  }
  endOfStream() {
    this.port.postMessage({
      ended: true,
      currentTime,
      currentFrame,
    });
    this.buffer.resize(0);
    console.log(this.buffer);
  }
  process(inputs, outputs) {
    if (this.offset >= this.uint8.length) {
      this.endOfStream();
      return false;
    }
    const channels = outputs.flat();
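    // 512 bytes per render quantum: presumably 128 frames × 2 channels
    // × 2 bytes per 16-bit sample, matching the 16-bit codec path below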
    const uint8 = new Uint8Array(512);
    for (let i = 0; i < 512; i++, this.offset++) {
      if (this.offset >= this.uint8.length) {
        break;
      }
      uint8[i] = this.uint8[this.offset];
    }
    const uint16 = new Uint16Array(uint8.buffer);
    CODECS.get(this.codec)(uint16, channels);
    return true;
  }
}
registerProcessor('audio-data-worklet-stream', AudioDataWorkletStream);
daxpedda commented 7 months ago

2023 TPAC Audio WG Discussion: The WG decided to remove some globally exposed interfaces, but #2499 (comment) suggested that a wider discussion might be needed to exclude them from the AudioWorkletGlobalScope.

Pinging implementors to get more thoughts: @padenot @jyavenard

Was there any further discussion or conclusion on which interfaces exactly should not be exposed?

Coming from Wasm here, it would be good to know if TextDecoder and TextEncoder will be supported or not.

hoch commented 7 months ago

We do not have a comprehensive list of blocked APIs yet, but these are good questions to ask:

1. Does it do any significant memory allocation within its operation?
2. Does it have any unbounded thread-blocking operation?

If the answer to either is yes, then it is likely to be excluded from the AudioWorkletGlobalScope.

daxpedda commented 6 months ago

@padenot has pointed out in Mozilla's Bugzilla that any allocation is bad, and that, considering there is already a non-allocating API for TextEncoder (TextEncoder.encodeInto()), that one could be exposed instead.

So the proposal is to expose TextDecoder fully and TextEncoder partially (excluding TextEncoder.encode()).
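
To illustrate the distinction (a sketch; the buffer size is an arbitrary choice):

const encoder = new TextEncoder(); // allocated once, at setup time
const scratch = new Uint8Array(1024); // preallocated destination buffer

// encode() allocates a fresh Uint8Array on every call:
const fresh = encoder.encode('hello');

// encodeInto() writes into the caller's buffer instead:
const { read, written } = encoder.encodeInto('hello', scratch);
console.log(read, written); // 5, 5 for ASCII input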

It was suggested to ask for clarification in WHATWG Encoding, which I would like to do after getting some official statement here. Does that sound like an appropriate plan?

padenot commented 5 months ago

Related: https://github.com/mozilla/standards-positions/issues/997 https://github.com/tc39/proposal-shadowrealm/issues/401