WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

API proposal to preemptively determine audio thread overload (Render Capacity) #2444

Closed. JohnWeisz closed this issue 2 years ago.

JohnWeisz commented 6 years ago

The Web Audio API lists audio production apps -- wave editors, digital audio workstations, and the like -- as a supported use case. And it's indeed quite adequate at the task!

One common trait of these applications is the capability of accepting virtually unlimited user content, which essentially guarantees hitting the limit of audio processing capabilities on a given machine at some point -- i.e. there is so much audio content that the audio thread simply cannot keep up with processing everything and becomes overloaded, no matter how well it is optimized.

I believe this is widely and well understood in the audio production industry. The usual way to help the user avoid overloading is to display a warning indicator of some kind, letting the user know the audio processing thread is about to be overloaded and audio glitching will occur (so the user knows to go easy on adding more content).

Note: in native apps (mostly C++ based), this is most commonly implemented as a CPU load meter (for the audio thread's core), which you can keep an eye on to know roughly how far you are from the limit.

Currently, the Web Audio API does not expose a comparable API to facilitate monitoring audio processing load, or overload.

It's possible to figure this out today, mainly in special cases with above-web-standard privileges (such as an Electron app). However, it is quite difficult to get right (even from the native C++ side) without implementations taking a spec-defined standard into consideration.

I'd like to propose a small set of lightweight, straightforward API additions with low privacy implications to enable this:

audioContext.isOverloaded(); // 'true' or 'false'
audioContext.addEventListener("overloadchange", function (e) {
    e.isOverloaded; // 'true' or 'false'
});

For obvious reasons, this is an extension to the AudioContext interface, not BaseAudioContext, as overload detection is not applicable to OfflineAudioContext processing. Having a dedicated event for the same purpose avoids the need for a polling check.

It is up to implementations to decide how exactly it is determined whether the audio thread is considered overloaded or not, optionally taking the AudioContext latencyHint setting into consideration.

This would enable Web Audio API-based apps to let the user know about high audio thread load, and to display options or hints at steps the user can take to avoid audio glitching.

Privacy implications

Exposing this information should have little to no privacy implications, as (1) it is rarely clear why exactly the audio thread is overloaded (it could be due to low device capabilities, or high CPU use by other processes), and (2) it does not provide a more accurate way to determine device capabilities than what is already possible with a simple scripted benchmark.

rtoy commented 6 years ago

I think this is really hard to specify in any kind of meaningful way, let alone in some kind of meaningful, interoperable way.

It's really hard to know if something is overloaded until after the fact, which is too late for your use case. This is further complicated by the fact that the audio graph can be quite dynamic. So you might have a nice graph working at 1% CPU. Then you suddenly connect a very complex subgraph that wants to use 200% CPU. Is it ok to signal overloaded after the fact?

hoch commented 6 years ago

Here's something we can do: let A = the duration of the audio device callback interval, and B = the time spent rendering a render quantum. The UA can then give users the ratio between B and A when queried. But the information should (or MUST?) be encoded into a coarse form to minimize the room for fingerprint abuse. (Perhaps categories like very low, low, moderate, high, very high, potentially with some random jitter on the polling timing.) Of course, the heuristic and the transitions between these categories need discussion as well.
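For illustration, a minimal sketch of this coarse encoding; the threshold values and category names are placeholders, not a proposal:

function bucketRenderLoad(renderTime, callbackInterval) {
  // Ratio of time spent rendering a quantum (B) to the callback interval (A).
  const ratio = renderTime / callbackInterval;
  if (ratio < 0.25) return 'verylow';
  if (ratio < 0.5) return 'low';
  if (ratio < 0.75) return 'moderate';
  if (ratio < 0.9) return 'high';
  return 'veryhigh';
}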

I believe the metaphor of the DAW's CPU bar is somewhat inappropriate, because a CPU gauge in a GUI is not an exposed programmable API. What's being asked for here is a JS API that can be abused programmatically for many different reasons, so we should tread carefully.

In general, I support this idea if we can pass the security/privacy review.

JohnWeisz commented 6 years ago

@rtoy

It's really hard to know if something is overloaded until after the fact, which is too late for your use case. This is further complicated by the fact that the audio graph can be quite dynamic.

Thinking about it some more, I think the following idea could be a (relatively) accurate approximation, although this is an implementation detail.

Since the audio rendering thread is a single thread (for the most part), and there are platform-specific APIs to determine which CPU core a given thread is running on, we can compute the average load of that CPU core (e.g. over the last second or so) and determine whether it is above a certain threshold.

Say this is done with a one-second poll check by default. When an AudioNode is joined into the processing graph, the polling is done immediately, or perhaps with a slight debounce. This can detect both steady and immediate changes in audio thread load, as in the sketch below.
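A hypothetical sketch of that polling scheme, assuming some platform-specific coreLoad() primitive (e.g. available through Electron) that returns the load of the audio thread's core as a 0..1 fraction:

const THRESHOLD = 0.9;  // assumed overload threshold, for illustration only
let overloaded = false;

function checkLoad() {
  overloaded = coreLoad() > THRESHOLD;  // coreLoad() is a placeholder, not a real API
}
setInterval(checkLoad, 1000);  // steady one-second poll

let debounceTimer = null;
function onNodeJoined() {  // call whenever an AudioNode joins the graph
  clearTimeout(debounceTimer);
  debounceTimer = setTimeout(checkLoad, 50);  // slight debounce
}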


@hoch

But the information should (or MUST?) be encoded into a coarse form to minimize the room for the fingerprint abuse. (Perhaps categories like very low, low, moderate, high, very high. Potentially with some random jitter on the polling timing.) Of course the heuristic and the transition between these categories need discussion as well. [...] In general, I support this idea if we can pass the security/privacy review.

Interesting insights -- this is primarily the reason I proposed a simple true/false overload detection (instead of a load percentage), which should provide a far smaller surface for fingerprinting, especially if user agents may add random jitter.

JohnWeisz commented 6 years ago

@rtoy

Additionally, this API -- by definition -- cannot always hard-protect against audio thread overloading and glitching; there are way too many factors for that (such as a heavy background process starting on the same machine, a bad app response to audio thread overloading, or even the immediate addition of a large subgraph, as you mentioned).

This API would merely help avoid glitching in many (perhaps most) common cases, and help apps provide measures to help the user actively avoid glitching.

For this reason, I agree the name isOverloaded could be slightly misleading, as we are talking about an audio thread that's almost overloaded (simply detecting whether the audio thread is already overloaded wouldn't help much in this case).

Perhaps it would be better to call it isOverloading (instead of, or in addition to isOverloaded):

audioContext.isOverloading(); // 'true' or 'false'
audioContext.addEventListener("overloadchange", function (e) {
    e.isOverloading; // 'true' or 'false'
});
rtoy commented 6 years ago

I think a binary isOverloaded will almost never work, because the overload point won't match the app developer's expectations. The coarse CPU percentage that @hoch proposes at least allows the developer to decide what overload means.

Which is not to say I support this idea of overload. :-) I'd much rather not do this and let each browser integrate WebAudio into the browser's developer tools. Then we can provide much more information, including things like overload, but also how much CPU each node in the graph is using, or marking out the heaviest subgraph, or whatever else might be useful for analyzing your application. And no spec change is required, which means these can be iterated on much faster.

JohnWeisz commented 6 years ago

I'd much rather not do this and let each browser integrate WebAudio into the browser's developer tools. Then we can provide much more information including things like overload, but also how much cpu each node in the graph is using or mark out the heaviest subgraph or whatever else might be useful for analyzing your application.

How would this solve the problem of audio thread overload detection on the end-user's machine?

The threshold of hitting max load depends greatly on device capabilities and "ambient" CPU load (e.g. background tasks), and is practically impossible to know or even approximate in advance (i.e. a stronger machine will be able to handle more content than what is determined on a dev machine, while a weaker one will handle less).

This is not a question of optimizing the audio graph at development time (that's pretty feasible to do already), but a question of how much audio processing the user can do before hitting max load, on the user's machine.

rtoy commented 6 years ago

Yeah, it doesn't really help. It helps you as a developer to know what's using up CPU so you can make intelligent trade-offs, but doesn't help you to know if a particular machine is having problems.

I just don't know what that metric would be, or how to make it meaningful across all the different scenarios that cause audio glitches.

andrewmacp commented 6 years ago

I agree that it would be really valuable to have some kind of API to expose this information, whether it's @hoch's coarse CPU use or @JohnWeisz's callback. After-the-fact could even be fine, I think, for our use cases. Better integration with dev tools would also be great, but an API would additionally allow dynamically reconfiguring/modifying the graph as we approach the resource limits on a given client.

JohnWeisz commented 6 years ago

To clarify, our use case would be simply displaying a warning notification about the user approaching maximum audio thread capacity -- as mentioned before, it's a DAW with no artificial limitation on the amount of content, and it would be great if the user were given an indication of audio overload other than already-glitching audio.

This, in essence, is similar to the CPU-use meter, often present in similar desktop software, but for the audio thread only.

rtoy commented 6 years ago

Here is an alternative that you can implement now without waiting for the spec to get updated.

Create an offline context with an appropriately heavy graph. Time how long it takes to render. There's your CPU usage meter. You could probably do this for each DAW component to get an idea of the expense of adding the component. You won't get the feedback a built-in CPU meter would give, but it's perhaps good enough.
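A minimal sketch of this idea; the graph here (a pile of oscillators) and the node count are arbitrary placeholders:

async function estimateRenderSpeed() {
  const seconds = 1;
  const ctx = new OfflineAudioContext(2, 44100 * seconds, 44100);
  // Build an "appropriately heavy" graph.
  for (let i = 0; i < 100; i++) {
    const osc = new OscillatorNode(ctx);
    osc.connect(ctx.destination);
    osc.start();
  }
  const t0 = performance.now();
  await ctx.startRendering();
  // Fraction of real time needed to render; values near or above 1
  // suggest a real-time context could not keep up with this graph.
  return ((performance.now() - t0) / 1000) / seconds;
}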

karlt commented 6 years ago

One thing to bear in mind with such an API is that it is generally the render quantum that takes the most CPU that is most likely to cause a break in the output. I'd imagine an API similar to the one proposed by hoch in https://github.com/WebAudio/web-audio-api/issues/1595#issuecomment-386123070, but reporting a maximum ratio over some time period.

JohnWeisz commented 6 years ago

@rtoy

Here is an alternative [...] Create an offline context with an appropriately heavy graph. Time how long it takes to render. There's your CPU usage meter.

The problem with this approach is that it lags behind significantly, and rapidly allocating OfflineAudioContexts has significant performance overhead. Worse, it runs on a different thread than the real-time audio thread (i.e. the AudioContext's), and will merely give a generic hint at what the CPU usage might be on some core (which may or may not be the same as the one used by the AudioContext).

Currently, the easiest but still relatively accurate approach is monitoring the load of each CPU core (this requires above-web-standard privileges, such as Electron) and treating the highest per-core load as the audio CPU load. This often results in false positives, but if the audio thread is indeed approaching overload, it will detect it.

mr21 commented 6 years ago

@JohnWeisz, what is the priority for your application, live playback or rendering?

If it's rendering, we could ask for an AudioContext method to modify BaseAudioContext.sampleRate on the fly; having this possibility could be great, as the user could choose when to downgrade the audio to avoid glitches.

No measurement would be needed. And maybe this can be done.

rtoy commented 6 years ago

Do you mean changing the sample rate of the context after it's been created? Switching the sample rate can cause glitches from the switch, and you still have to resample the audio if the hardware doesn't support the new rate. The resampling introduces additional delay and CPU usage. That might not be what the user wants, and it's kind of hard to know what the additional delay and CPU would be.

mr21 commented 6 years ago

@rtoy, ah... and is there no way to reduce the context quality after it's created? I was thinking that reducing the sampleRate would be something like reducing the resolution of a video.

rtoy commented 6 years ago

Currently no, there's no way. And yes, reducing the sample rate is kind of like reducing the video resolution. But we still have to resample to the HW rate, and it's not clear what the right tradeoff should be. High quality output (with high CPU usage and possibly significant additional latency)? Low quality output (with lower CPU usage and possibly less additional latency)? Something in between?

It's hard to know what the user wants when just switching sample rates. This is one reason Chrome has not yet implemented the AudioContext sampleRate option; we don't know what the right tradeoff should be, but Chrome generally prefers high quality when resampling is needed.

JohnWeisz commented 6 years ago

@Mr21 It's both live and rendering (you can play your project in real time, or render to an audio file), but AFAIK rendering can never glitch, as it will simply render slower if available CPU power is insufficient. This is only applicable to live playback, where there is only a limited time frame to process a render quantum.

mr21 commented 6 years ago

@JohnWeisz, yes, that's what I thought. @rtoy, understood. In general, can the hardware rate be modified (through the drivers or something), or is that clearly impossible?

JohnWeisz commented 6 years ago

In case the binary isOverloading reading is deemed insufficient, and an exact CPU-usage reading is deemed to have too severe privacy implications, Sherlock here has an alternative idea:

audioContext.getOverloadState(); // "none" | "moderate" | "critical"
audioContext.addEventListener("overloadchange", (e) => {
    e.overloadState; // "none" | "moderate" | "critical"
});

Where moderate means a considerably loaded audio thread, and critical means glitching is very likely. How exactly this is determined is still up to the implementation, but in general, polling the CPU load of the core on which the audio thread runs should be rather adequate (>70% + jitter for moderate, >90% + jitter for critical).

While the exact numbers can vary from device to device, this is something we found considerably accurate to detect overload in advance (in our Electron-based distribution, which can query CPU load accurately).
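A hypothetical mapping using the thresholds and jitter mentioned above (the exact numbers are illustrative only):

function overloadState(coreLoad) {
  const jitter = (Math.random() - 0.5) * 0.05;  // small random jitter to blur the reading
  if (coreLoad + jitter > 0.9) return 'critical';
  if (coreLoad + jitter > 0.7) return 'moderate';
  return 'none';
}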

rtoy commented 6 years ago

I like @hoch's idea: just return a CPU percentage, rounded to, say, the nearest 10% or 20%, or something coarse, but not too coarse. Then we don't have to make arbitrary assumptions about the meaning of "none", "moderate", or "critical".

hoch commented 6 years ago

How exactly this is determined is still up for implementation

So now developers need to handle unpredictable differences between devices/browsers. The sniffing part will be really ugly. I did not use the term "CPU percentage" for this specific reason. As I suggested above, the ratio between the render timing budget and the time spent on rendering is quite clear, and there's no room for creative interpretation.

IMO the sensible form for this attribute is:

enum RenderPerformanceCategory {
  "none",
  "moderate",
  "critical"
}

partial interface AudioContext : BaseAudioContext {
  // renderPerformance = timeSpentOnRenderQuantum / callbackInterval
  readonly attribute RenderPerformanceCategory renderPerformance;
}

Concerns

This technically gives you an estimate of the performance of a real-time priority thread provided by the OS. That is why I believe the coarser, the better. Even with 3 categories, you can infer 3 classes of CPU power across all your visitors. This is why the web platform does not have any API like this. Consider:

const context = new AudioContext();
const mutedGain = new GainNode(context, { gain: 0.0 });
mutedGain.connect(context.destination);
let oscCounter = 0;

function harassAndCheck() {
  const osc = new OscillatorNode(context);
  osc.connect(mutedGain);
  osc.start();
  oscCounter++;
  if (context.renderPerformance !== 'critical') {
    // Keep piling on oscillators until the context reports 'critical'.
    setTimeout(harassAndCheck, 10);
  } else {
    console.log('I can run ' + oscCounter + ' oscillators!');
  }
}

harassAndCheck();

If you want a more fine-grained estimation, gain nodes can be exploited instead of oscillators. If you want to detect it fast, go with convolvers. So the coarseness of the categories can be undone by exploiting various features of the Web Audio API.

Here I am not trying to discourage the effort or anything; I simply want to check all the boxes before we reach a conclusion. Personally, I want to have this feature in the API.

JohnWeisz commented 6 years ago

@hoch

This technically gives you an estimate of the performance of real-time priority thread provided by OS.

I believe the main question here is whether the snippet you demonstrated can expose more information than what is already available.

Unless I'm mistaken, AudioWorklet can be used to achieve a rather accurate measure, as the processing script is executed on the exact same real-time priority audio thread. And AudioWorklet is already implemented and shipped.

So again, unless I'm mistaken, running a benchmark inside the audio worklet processor should give significantly more accuracy than what you could ever get with the getOverloadState() or renderPerformance API, especially if we consider that native nodes (e.g. OscillatorNode) have varying implementations (and as such, potentially varying performance), while a processing script running inside AudioWorklet should in theory have very similar results across implementations.
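For example, a hypothetical in-worklet benchmark could look something like this (loaded via audioWorklet.addModule(); note that performance.now() is not available in AudioWorkletGlobalScope, so this falls back to the coarse Date.now()):

class BenchmarkProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    const t0 = Date.now();
    let acc = 0;
    for (let i = 0; i < 1e5; i++) acc += Math.sin(i);  // fixed busy work
    // Report how long the fixed workload took on the rendering thread.
    this.port.postMessage({ busyMs: Date.now() - t0, acc });
    return true;
  }
}
registerProcessor('benchmark-processor', BenchmarkProcessor);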

hoch commented 6 years ago

I believe AudioWorklet can be used to achieve a rather accurate measure

Can you elaborate? I am curious about what can be done with the AudioWorklet. If that's possible, we can actually build a prototype with the AW first. On the other hand, the AW might have some privacy concerns we have not explored.

AFAIK, the processing script is executed on the exact same real-time priority audio thread.

This is not true for Chrome; the AudioWorklet requires the JS engine on the rendering thread, so it actually degrades the thread priority when it is activated by the .addModule() call. I am quite certain it will be similar in all browsers.

hoch commented 6 years ago

My teammates suggested involving a privacy expert at this early stage of the discussion to review cases like this.

@lknik Could you take a look at this? Would love to have your opinion here.

mr21 commented 6 years ago

Ahhh... indeed, this could increase our fingerprint on the web by measuring the audio card's performance :/ But then again, how could a PWA not be able (technically) to fingerprint a random user via benchmarks?

JohnWeisz commented 6 years ago

@Mr21 The primary privacy concern here is that this proposal would allow fingerprinting with considerably greater accuracy, as you are indirectly benchmarking with the help of a real-time priority thread. This is much less likely to be affected by other programs running on the user's system, so your fingerprint will potentially be more accurate than with a simple scripted benchmark, which has who knows how many interrupts in between instructions.

rtoy commented 6 years ago

AudioWorklets should not run on a real-time priority thread. Chrome's worklet definitely doesn't.

JohnWeisz commented 6 years ago

I am curious about what can be done with the AudioWorklet. If that's possible, we can actually build a prototype with the AW first.

I don't quite expect AudioWorklet to be that useful for determining CPU load in practice (perhaps the best we can do is measure how much looping is needed before glitching out), but I was under the impression it was useful to benchmark the hardware itself (in this case, the renderPerformance API wouldn't introduce a new fingerprint surface).

This is not true for Chrome; the AudioWorklet requires the JS engine on the rendering thread, so it actually degrades the thread priority when it is activated by .addModule() call. I am quite certain that it will be similar to all the browsers.

If I understand right, this seems to be relevant (469639 referenced within is non-public, so I can only guess).

Then I was indeed mistaken (but ouch, this also means an audio worklet processor has a permanent potential stability impact just by being loaded?). In that case, discard my previous assumptions: AudioWorklet is clearly not useful for benchmarking a real-time priority thread (according to the linked issue, only a "display-priority" one), and might not be considerably better than a simple scripted benchmark on the main thread. I should've studied the implementation more in depth.

hoch commented 6 years ago

in this case, the renderPerformance API wouldn't introduce a new fingerprint surface

Except that it does expose something new! If the battery status API can be used as a vector for side-channel attack, the renderPerformance API can certainly be worse if not careful. Perhaps throttling the update frequency would be good enough, but I would like to have opinions from privacy/security experts here.

469639 referenced within is non-public, so I can only guess

crbug.com/469639 is a launch tracking bug for the AudioWorklet. I am not sure why it's restricted. I'll talk to the team to unlock it.

JohnWeisz commented 6 years ago

Except that it does expose something new! If the battery status API can be used as a vector for side-channel attack, the renderPerformance API can certainly be worse if not careful.

I believe you misunderstood my reply there (that, or I'm missing something). What I'm saying is that if the AudioWorklet could be used to accurately benchmark a real-time thread (it cannot be, according to your reply), then renderPerformance wouldn't expose anything new in practice (especially if you can do something like deliberately glitching out an audio worklet processor and detecting the glitch, IIRC this is not possible with the ScriptProcessorNode).

As things stand, renderPerformance does expose a new surface for fingerprinting, at least without severe adjustments to its return value.

Perhaps throttling the update frequency would be good enough, but I would like to have opinions from privacy/security experts here.

I'm no expert on privacy-related questions of this importance, but wouldn't this just make building a fingerprint take longer? Assuming the throttle rate is known (which it is in practice, if it's constant), just wait a bit between adding new nodes to the graph.

hoch commented 6 years ago

IIUC, fingerprinting or a side-channel attack does not require the real-time thread. Certainly you can extract more info if the thread has the highest priority in the system, though. Fingerprinting is actually less of an issue here, since the Web Audio API has been an abundant source of fingerprinting since its beginning.

Wouldn't this just make building a fingerprint take longer?

Sorry, I was not clear. Throttling is a mitigation for the high precision timer + side-channel attack.

I guess we shared a fair amount of speculation so far, so I'll wait for the verdict from the security folks.

hoch commented 6 years ago

Another reference: Chrome extension API for process inspection.

JohnWeisz commented 6 years ago

Was there any progress on this one recently?

hoch commented 6 years ago

We have not gotten any feedback from the privacy folks. Not sure we can make meaningful progress without it.

padenot commented 4 years ago

Virtual F2F:

hoch commented 3 years ago

Two items in one package: 1) exposing performance.now() in AWGS and 2) implementing AudioContext.renderCapacity.

Reference: https://www.w3.org/TR/hr-time-2/#extensions-to-windoworworkerglobalscope-mixin

Note: exposing an HR timestamp on an RT thread might have some fingerprinting/security implications.

rtoy commented 3 years ago

Teleconf: Some additional thoughts.

First, it's useful to have both the "instantaneous" capacity and a smoothed capacity for longer-term trends; we should provide both. We probably want to provide a smoothing constant so the developer can choose how quickly or slowly the smoothed capacity responds to changes, and probably also a worst-case value over the last N seconds or some other interval.

Second, instead of forcing the developer to poll constantly for the capacity, we could provide an event when the capacity exceeds some user-specified threshold. To be effective, we'd also need a lower capacity threshold such that an event is also delivered when the capacity goes below it. This provides some hysteresis so we're not constantly sending events, as in the sketch below.
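A hypothetical sketch of that threshold-plus-hysteresis scheme (all names here are placeholders):

function makeCapacityWatcher(upper, lower, onOverload, onRecovered) {
  let overloaded = false;
  return function update(capacity) {  // call with each new capacity reading
    if (!overloaded && capacity >= upper) {
      overloaded = true;  // fires once when crossing the upper threshold...
      onOverload(capacity);
    } else if (overloaded && capacity <= lower) {
      overloaded = false;  // ...and again only after dropping below the lower one
      onRecovered(capacity);
    }
  };
}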

hoch commented 3 years ago
  1. Smoothing can be done by user code (see the sketch below).
  2. Once we implement 1, the renderer needs to calculate this value every render quantum and update a read-only JS number every render quantum -- not really different from AudioContext.currentTime. I am not sure the Event + Hysteresis approach is really beneficial.
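For item 1, a hypothetical user-side smoothing sketch, assuming a raw capacity reading is available on each poll (alpha is an arbitrary responsiveness constant):

let smoothed = 0;
function onReading(raw, alpha = 0.1) {
  // Exponential moving average: higher alpha reacts faster, lower alpha is smoother.
  smoothed = alpha * raw + (1 - alpha) * smoothed;
  return smoothed;
}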

As a reference, I found this: https://developer.chrome.com/docs/extensions/reference/system_cpu/

hoch commented 3 years ago

Discussion from 1/7/2020:

hoch commented 3 years ago

The important question at this point: a bucketed category or a quantized number (percentage)?

In both cases, smoothing doesn't really make sense because the data won't have the precision anyway. Also, the bucketed category approach works nicely with the event-driven model:

context.renderPerformance.addEventListener('capacitychange', (event) => {
  switch (context.renderPerformance.capacity) {
    case 'high':
      break;
    case 'veryhigh':
      break;
    default:
  }
});
rtoy commented 3 years ago

I prefer numbers. I don't understand what you mean by smoothing not making sense. You smooth/filter the raw usage numbers from each render, then convert to a quantized value to expose to the user. I think numbers work equally well with an event too, although that still needs to be designed.

hoch commented 3 years ago

You smooth/filter the raw usage numbers from each render. Then convert to a quantized value to expose to the user.

Sure, but that's happening internally, so developers don't care about the smoothing. I don't believe the spec needs to discuss how it is smoothed. I'm not sure I want to come up with a detailed model of smoothing and expose all its parameters in the API.

Also, do we care about the case when CPU usage is 10~20%? It seems very unlikely, and that's one reason why I prefer the bucketed approach.

This is hard to tell without seeing potential use cases.

rtoy commented 3 years ago

It's hard to know now how much smoothing to apply. Someone might want something smoothed over tens of seconds. Someone might want something more immediate, like 0.5 sec. If smoothing is required, we have to specify it so all browsers do the same thing.

Numbers are easier to understand. How close to the limit is "veryhigh"? As part of the spec, this needs to be described anyway, so just returning a number is nice.

hoch commented 3 years ago

Based on my experience with DAWs, the update frequency and smoothing scheme rarely matter.

A faster update will not protect you from glitching or overloading, because you always get a notification or a status after the fact. The goal of this API is not to predict the glitch; it is to provide a way to recover from the overloaded state. If your app starts glitching, it doesn't really matter whether you can react in 1 second or 10 ms. That's why I think a reasonable default makes sense for this API.

Defining the range of each category can be done, and the enum is nicer because it abstracts the details away. So it is a bit safer in terms of fingerprinting.

Exposing numbers is useful when you want to visualize them, like a DAW's CPU meter. But that only makes sense if you expose them in a non-quantized form, which also becomes more vulnerable to fingerprinting.

Lastly, every browser has a different rendering and threading model. Even graph traversal and cross-thread synchronization can differ. It is not guaranteed that we get the same perf indicator (number or category) from the same graph, no matter how it is specced.

rtoy commented 3 years ago

If you have 6 enums, it's just as fingerprintable as 0, 20, 40, 60, 80, 100. It doesn't abstract anything away except giving the values names. And what happens if it's decided that's too coarse? You have to create new enums with potentially weird names: "veryhigh" and "extremelyhigh" and "stupendouslyhigh".

No one is saying that you get the same value for the same graph on different browsers. We want to say the values are smoothed in the same way.

And we still want the average and peak values. You have to define what the average is. You may also want the peak value to be smoothed over time, and you need to decay the peak value in some way too. This needs to be specified.

I'm not familiar with any DAW CPU meters, but I assume the display is already quantized, just like the old EQ meters with 5-10 bars for the level.

hoch commented 3 years ago

You have to create new enums with potentially weird names. "veryhigh" and "extremelyhigh" and "stupendouslyhigh".

I am confident that this WG will not settle on those weird names. But it feels awkward for an enum to be numbers: "An enumeration is a definition (matching Enum) used to declare a type whose valid values are a set of predefined strings." (link)

Take sample rate conversion as an example. You'd want to control the conversion quality with categories like "low", "balanced", "high". I don't find them weird or inadequate, even if those adjectives are vague and abstract.

We want to say the values are smoothed in the same way.

I am not against smoothing. It is essential, but I am skeptical about the value of detailing its algorithm in spec language.

I'm not familiar with any DAW cpu meters, but I assume the display is already quantized, just like the old EQ meters with a 5-10 bars for the level.

Since native apps have full access to this information, their meters are usually precise, detailed, and frequently updated. 6 enums will never be enough for this. That is why I am cautious about the numerical approach: quantized or discrete numbers won't allow developers to build a DAW-style CPU gauge. (A quick image search will give you a ton of examples.)

brutzman commented 3 years ago

Remark: this is a worthy challenge to consider. A similar computer science challenge: Sally Floyd correctly identified in the 90s that "congestion control" wasn't the right problem; rather, "congestion avoidance" was the appropriate technical framing for dealing with problematic processing (routing) overloads. https://en.wikipedia.org/wiki/Network_congestion

rtoy commented 3 years ago

Oh, I wasn't proposing that the enums be numbers. I was proposing just returning numbers. Could be 0-100 or 0-1 in some quantized form, possibly not even linear. (But non-linear makes it hard to decide where the values should be.)

I think smoothing needs to be specified for interoperability. If browsers smooth things differently, it makes it difficult for developers to know what action to take. The values don't have to be consistent across browsers, but the behavior should be, as much as possible. Ideally, if browser A says "veryhigh" and adding a compressor causes glitching, it would be nice if browser B behaved the same, roughly. It's pretty hard to make this really true, but we can try to make it more interoperable.

Sample rate conversion quality has been proposed but not spec'ed out, so we don't know how that will end up. Sample rate conversion "quality" is quite a bit harder to quantify and specify. How do you specify the trade-off between speed and accuracy?

I see that DAW CPU graphs show pretty detailed curves. Obviously, we can't do that. At least not to that detail.

hoch commented 3 years ago

Could be 0-100 or 0-1 in some quantized form, possibly not even linear.

Then there's no difference from using an enum.

Ideally, if browser A says "veryhigh" and adding a compressor causes glitching, it would be nice if browser B behaves the same. Roughly. It's pretty hard to make this really true, but we can try to make it more interoperable.

Yes. That's exactly the point I have been making. Providing more precision and granularity won't be helpful, so that should not be the first goal of this work.

Sample rate conversion quality has been proposed but not spec'ed out, so we don't know how that will end up.

I only mentioned it as an example of using enums. We won't be using "stupendouslyhigh" for sample rate conversion quality, and that won't be happening here either.

amalamos commented 3 years ago

Sorry if my note looks a bit naive, but I saw in the MDN Web Audio API docs that AudioWorkletGlobalScope supports currentFrame, currentTime and sampleRate. Are they implemented? If they are, why not use them to decide whether the process is overloaded? From a couple of consecutive frames and times you can calculate the achieved sample rate, then compare it with the sampleRate from the API (which comes from BaseAudioContext) and decide about overloading.
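For illustration, a hypothetical sketch of this idea inside an AudioWorkletProcessor, comparing frames rendered against wall-clock time to estimate the achieved rate (all names are placeholders; Date.now() is coarse, and a real-time context typically keeps up until it glitches, so this mostly detects sustained underruns):

class RateProbeProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.lastFrame = currentFrame;  // AudioWorkletGlobalScope globals
    this.lastWall = Date.now();
  }
  process() {
    const wallDelta = (Date.now() - this.lastWall) / 1000;
    if (wallDelta >= 1) {  // report roughly once per second
      const achievedRate = (currentFrame - this.lastFrame) / wallDelta;
      this.port.postMessage({ achievedRate, nominalRate: sampleRate });
      this.lastFrame = currentFrame;
      this.lastWall = Date.now();
    }
    return true;
  }
}
registerProcessor('rate-probe', RateProbeProcessor);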