Closed JohnWeisz closed 2 years ago
The currentFrame and currentTime reflect the audio thread's internal count of frames. This is not directly correlated with real time. So, if something delayed processing for, say, 1 sec, currentFrame will still be incremented by just 128 for the next render quantum, even though 1 sec of real time has elapsed. If performance.now() were available, you could then keep track. It's not currently available, but under consideration in WebAudio/web-audio-api#2413.
Plus, we want to be able to report this without having developers use an AudioWorkletNode if they otherwise wouldn't need one.
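To illustrate the limitation, here is a rough main-thread workaround (a sketch only; `makeStallDetector` is a hypothetical helper, and this merely approximates what worklet-side timing would give you): compare the progression of `context.currentTime` against wall-clock time.

```javascript
// Sketch: estimate how far the rendering clock has fallen behind
// wall-clock time. Usable only on the main thread, where
// performance.now() is available. `makeStallDetector` is a
// hypothetical helper, not part of any API.
function makeStallDetector(context) {
  let lastAudioTime = context.currentTime;      // seconds
  let lastWallTime = performance.now() / 1000;  // seconds
  return function check() {
    const audioDelta = context.currentTime - lastAudioTime;
    const wallDelta = performance.now() / 1000 - lastWallTime;
    lastAudioTime = context.currentTime;
    lastWallTime = performance.now() / 1000;
    // Positive drift means real time advanced more than the audio
    // clock did, i.e. the rendering thread may have stalled.
    return wallDelta - audioDelta;
  };
}
```

Polling `check()` periodically gives only a coarse drift estimate and cannot attribute a delay to a specific render quantum, which is exactly why exposing a proper metric is being discussed here.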
Hi,
Just following this thread. I just discovered that render capacity is a fairly good metric for audio performance, and I was wondering if there is any way to log or record this statistic?
@jacksongoode
For Chrome: https://web.dev/profiling-web-audio-apps-in-chrome/
For Firefox: https://blog.paul.cx/post/profiling-firefox-real-time-media-workloads/
Hmm, I've checked those out, but it doesn't look like there's a way to get logs of the render capacity as a percentage of CPU use, is there?
I was also wondering, seeing as this follows this thread's theme, if this method of calculating CPU can be corrected for the web audio demo?
Sorry that I misunderstood your question. The idea of render capacity was originated from here: https://web.dev/profiling-web-audio-apps-in-chrome/#use-the-webaudio-tab-in-chrome-devtools (look for "Render Capacity" in the description)
Currently there's no way to use this value programmatically - and this thread is about exposing it on the platform.
I was also wondering, seeing as this follows this thread's theme, if this method of calculating CPU can be corrected for the web audio demo?
Are you asking if this API can be used as a CPU benchmark? There might be a correlation, but I believe the render capacity is much narrower than the generic CPU info APIs. It only cares about your audio rendering thread. (At least that's my intention so far.)
Also the example from the Resonance audio uses OfflineAudioContext. The render capacity is specifically for the real-time use case.
https://github.com/oyiptong/compute-pressure/blob/main/README.md https://oyiptong.github.io/compute-pressure/
is a related proposal. It is not on any standards track, and it does not replace what we're doing here, which is very much needed. I thought, however, that it would be useful to mention it, and maybe also use it in conjunction with the API being designed here. Machine load can certainly be interesting for real-time audio apps, even if quite frequently these apps use real-time threads, which are going to be scheduled ahead of others.
Glad that you brought up ComputePressure API - I vaguely remember I mentioned it in the teleconference in the past. I see a couple of major differences, but the biggest one is:
CP API is for measuring the "overall" machine load over a fixed (and relatively coarse, 1 sec) sampling interval. On the other hand, the Render Capacity specifically focuses on the workload in the audio rendering thread. One can be used as a rough proxy for another, but they do not exactly match. So yes, I agree that the Render Capacity serves its own purpose.
I can see developers take advantage of both APIs: ComputePressure for the main thread load, and Render Capacity for the audio rendering load.
Current WIP proposal for F2F discussion:
dictionary AudioRenderPerformanceOptions {
  double measuringInterval = 1;
  double smoothingCoefficient = 0.5;
};

[Exposed=Window]
interface AudioRenderPerformance {
  undefined start(optional AudioRenderPerformanceOptions options = {});
  undefined stop();
  readonly attribute double capacity;
};

partial interface AudioContext {
  [SecureContext] readonly attribute AudioRenderPerformance renderPerformance;
};
context.renderPerformance.start();

const pollCapacity = (timestamp) => {
  const cap = context.renderPerformance.capacity;
  doSomethingWithRenderCapacity(cap, timestamp);
  // Re-schedule so polling continues on subsequent frames.
  requestAnimationFrame(pollCapacity);
};

requestAnimationFrame(pollCapacity);
dictionary AudioRenderPerformanceOptions {
  double openThreshold = 0.95;
  double closeThreshold = 0.75;
};

[Exposed=Window]
interface AudioRenderPerformanceEvent : Event {
  constructor(DOMString type, boolean aboveThreshold);
  readonly attribute boolean aboveThreshold;
};

[Exposed=Window]
interface AudioRenderPerformance {
  undefined start(optional AudioRenderPerformanceOptions options = {});
  undefined stop();
  attribute EventHandler onthresholdcrossing;
};

partial interface AudioContext {
  [SecureContext] readonly attribute AudioRenderPerformance renderPerformance;
};
context.renderPerformance.addEventListener('thresholdcrossing', (event) => {
  if (event.aboveThreshold) {
    reduceWork();
  } else {
    addMoreWork();
  }
});
context.renderPerformance.start({openThreshold: 0.9, closeThreshold: 0.8});
F2F: The majority was in favor of the polling approach, but with more meaningful properties.
dictionary AudioRenderPerformanceOptions {
  double measuringInterval;  // in seconds
  double smoothingCoefficient;
};

[Exposed=Window]
interface AudioRenderPerformance {
  undefined start(optional AudioRenderPerformanceOptions options = {});
  undefined stop();
  readonly attribute double averageCapacity;
  readonly attribute double maxCapacity;
  readonly attribute double underruns;
};

partial interface AudioContext {
  [SecureContext] readonly attribute AudioRenderPerformance renderPerformance;
};
- averageCapacity indicates the average capacity over a sampling period.
- maxCapacity indicates the max capacity over a sampling period.
- underruns is the percentage of render quanta with buffer underruns over a sampling period.

Each value is a double, but it is quantized to 100 steps from 0 to 1. Also note that the renderPerformance object is protected by SecureContext, and without a user gesture (autoplay) these values will all be zeros.
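A sketch of how an app might consume the polled values above (hypothetical: `renderPerformance` is only a proposal, and `shouldShedLoad` plus its thresholds are illustrative, not part of it):

```javascript
// Decide whether to reduce the processing load based on one
// sampling period's stats. Thresholds are illustrative only.
function shouldShedLoad({ averageCapacity, maxCapacity, underruns }) {
  // Any underrun, or capacity at the ceiling, means we are at the limit.
  return underruns > 0 || maxCapacity >= 1.0 || averageCapacity > 0.9;
}

// Hypothetical polling loop against the proposed interface:
// context.renderPerformance.start({ measuringInterval: 1 });
// setInterval(() => {
//   if (shouldShedLoad(context.renderPerformance)) reduceWork();
// }, 1000);
```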
This Compute Pressure API proposal seems related (see TAG Review)
Updated the proposal above to reflect the discussion on 6/3.
Teleconf: @hoch mentioned that the API may change a bit due to input from Google's privacy team. Instead of polling, it may be event-based like rAF. Details still need to be worked out.
Here's the event-based API proposal:
dictionary AudioRenderPerformanceOptions {
  double updateInterval = 1;
};

[Exposed=Window]
interface AudioRenderPerformanceEvent : Event {
  constructor(DOMString type, double timestamp,
              double averageCapacity, double maxCapacity, double underrunRatio);
  readonly attribute double timestamp;
  readonly attribute double averageCapacity;
  readonly attribute double maxCapacity;
  readonly attribute double underrunRatio;
};

[Exposed=Window]
interface AudioRenderPerformance {
  undefined start(optional AudioRenderPerformanceOptions options = {});
  undefined stop();
  attribute EventHandler onupdate;
};

partial interface AudioContext {
  [SecureContext] readonly attribute AudioRenderPerformance renderPerformance;
};
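A usage sketch for this event-based shape (hypothetical: the interface is a proposal, and `classifyLoad` plus its thresholds are illustrative helpers, not part of it):

```javascript
// Turn one update event's fields into a coarse load classification.
// Thresholds are illustrative, not from the proposal.
function classifyLoad({ averageCapacity, underrunRatio }) {
  if (underrunRatio > 0) return 'glitching';
  if (averageCapacity > 0.8) return 'near-limit';
  return 'ok';
}

// Hypothetical wiring against the proposed interface:
// context.renderPerformance.onupdate = (e) => {
//   if (classifyLoad(e) !== 'ok') reduceWork();
// };
// context.renderPerformance.start({ updateInterval: 1 });
```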
AudioContext's playback is controlled by the autoplay policy and cannot start automatically (that is, it remains suspended) without a user interaction (e.g. an explicit click on the DOM).
@pmlt @jackschaedler @andrewmacp What do you think about the latest API proposal? Any feedback would be greatly appreciated!
This feature isn't necessary for game development, however it seems to me that if I was writing a web-based DAW this API would give me a sufficient approximation to have a CPU usage meter. Looks good to me!
I thought the game audio engine would want to monitor the current render capacity, so it can dynamically control the application load. Still, thanks for the feedback!
We definitely do, but since we use a WebWorker/SAB/AudioWorklet architecture, we immediately detect underruns at the start of the AudioWorklet callback simply by inspecting the number of audio frames produced by the WebWorker since the last callback. No need for an event-based approach in this case.
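The detection described above can be reduced to a frame-count check (a sketch; the function name is hypothetical and the SharedArrayBuffer ring-buffer plumbing around it is omitted):

```javascript
// Inside the AudioWorkletProcessor callback: the producer (a
// WebWorker writing into a SharedArrayBuffer ring buffer) keeps a
// running count of frames written; the consumer tracks frames read.
// If fewer frames are available than one render quantum, the worker
// fell behind and this callback will underrun.
function detectUnderrun(framesWritten, framesRead, quantumSize = 128) {
  const available = framesWritten - framesRead;
  return available < quantumSize;
}
```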
Ah, I see. That's actually a good point. I understand that you can implement the glitch detection within AudioWorkletProcessor - but would it be useful to have a built-in detection/monitoring feature on AWGS?
@hoch perhaps you can clarify a detail here.
Given AudioRenderPerformanceOptions:
- updateInterval: For example, 375 values will be collected per 1-second interval at a 48 kHz sample rate and a 128-frame render quantum.

And AudioRenderPerformanceEvent:
- Note that for averageCapacity, maxCapacity, and underruns, the range is between 0.0 and 1.0 and the precision is limited to 1/100th (21 bits of entropy total).
What is the rounding strategy?
Consider using the default parameters; as you outline, that'd be 375 render quanta per second. If there is just a single underrun, the ratio would be 1/375. What would the resulting reported ratio float be? I would assume 0, i.e. that single underrun would go undetected.
Assuming then that small ratios get rounded to zero, the counter to that would be to calculate the update interval given the sample rate (and, soon, the render quantum size).
As an alternative to this, why not just report the total number of frames with underruns as well as the total number of frames for the given interval?
EDIT: rewrote a sentence
What would the resulting reported ratio float be? I would assume 0, ie that single underrun would go undetected.
Good point! We need to develop a rounding strategy to avoid this situation. A clear distinction between zero and non-zero would solve this issue. For example, 0 would be 0, but 1/375 would be 0.01. Some details need to be fleshed out, so thanks for raising this question!
As an alternative to this, why not just report the total number of frames with underruns as well as the total number of frames for the given interval?
The bucketed approach helps us lower the privacy entropy, and reporting the exact number of underruns is something that we want to avoid. My goal is to keep the privacy entropy as low as possible while keeping the value useful to developers.
as well as the total number of frames for the given interval?
Unfortunately this is sort of already exposed via AudioContext.baseLatency. Since this number is very platform-specific, it adds more bits of fingerprinting information. I don't think we want to duplicate that info through this new feature.
@hoch
This looks pretty good to me!
Like @ulph, I was also wondering about the way that the underruns value is reported... It would be great to avoid a situation where underruns is reporting 0.0 but maxCapacity is reporting 1.0. That seems like the only case where using this API would be a bit confusing.
This makes me wonder if there's simply a better term than underruns for this concept. Maybe if that field was named something like underrunRatio or capacityExceededRatio or something like that, the 0-1 value range and precision choices would feel more natural.
A few nitpicks if this exact language ends up being used:
- unrderruns is spelled incorrectly in the Event constructor argument.
- "event wil still" -> "event will still".

Fixed nits. Thanks!
underrunRatio is (where N is the number of render quanta per interval period, u is the number of underruns per interval period):
- 0.0 if u is 0.
- 0.01 if u / N is greater than 0.0 and less than or equal to 0.01.
- Otherwise, u / N rounded up to the nearest 100th.
I will think about whether we need the same treatment for the average and max capacity; feel free to chime in if you have any suggestions.
@hoch This looks great!
Just out of curiosity, is the lower bound on the updateInterval 1s, or can it be set even lower?
@andrewmacp 1s is a placeholder value, but I think it's a reasonable default. How much lower do we want to go, and why?
A nearby example would be the ComputePressure API, which also has a rate-limited event:
"The callback is intentionally rate-limited, with the current implementation maxing out at one invocation per second for a page in the foreground, and once per ten seconds for background pages."
@hoch I was mostly just curious here. I think the underrunRatio should give us enough of what we need, but I was wondering if we can simply set the updateInterval to a value that would result in <=100 render quanta per callback, in order to get a specific number of underruns. But then that would defeat the purpose of using a ratio to begin with, so I was wondering if the plan was to limit the updateInterval to a 1s minimum like in the ComputePressure API.
@hoch noted, your suggested rounding scheme would work.
- 0.01 if the u / N value is greater than 0.0 and less than or equal to 0.01.
- Otherwise, it's u / N rounded up to the nearest 100th.
It doesn't hurt to be extra clear, I suppose - but the "ceil to hundredth" would cover the 0 < x < 0.01 case as well?
I will think about if we need the same treatment for the average and max capacity, and feel free to chime in if you have any suggestions.
I don't think the rounding is as problematic for the capacity values - the issue with the underrun ratio was that (repeated) single underruns can be quite detrimental, especially considering the case of occasional single glitches occurring every few seconds.
Some ideas for capacity, though: does it make sense to throw some more statistical numbers in there, like min and stddev? That would give some more indication of the distribution of the capacity measurements during the measurement window. (Personally I would have pondered a histogram, but I suspect that's not everyone's cup of tea.)
Re: @andrewmacp
But then that would defeat the purpose of using a ratio to begin with so was wondering if the plan was to limit the updateInterval to a 1s minimum like in the ComputePressure API.
I believe 1s is a sensible default. I don't have a strong opinion on its lower bound as long as it's reasonably coarse (e.g. 0.5 seconds or higher). This is up for discussion.
Re: @ulph
"ceil to hundredth" would cover the 0 < x < 0.01 case as well?
Thanks! That's better.
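The agreed rounding ("ceil to the nearest hundredth", with zero reported as zero) can be written out as follows (a sketch; `quantizeUnderrunRatio` is a hypothetical name, not from the proposal):

```javascript
// Quantize u underruns out of N render quanta into 1/100 steps,
// rounding up so a single underrun never disappears into 0.
function quantizeUnderrunRatio(u, N) {
  if (u === 0) return 0;
  return Math.ceil((u / N) * 100) / 100;
}
```

With the default 1-second interval at 48 kHz and a 128-frame render quantum (N = 375), a single underrun reports as 0.01 instead of rounding down to zero.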
Some ideas for capacity though; does it make sense to throw some more statistical numbers in there? Like min and stdev? That would give some more indication of distribution of the capacity measurements during the measurement window. (Personally I would have pondered a histogram but I suspect that's not everyones cup of tea)
IMO those (min, stddev, histogram) are in the scope of "good-to-have". I suggest we deliver the first cut with the most essential data, and extend it later when it's absolutely needed.
The actual problem shouldn't be CPU overload; it should be processing not completing within the cycle time - no matter what the reason is, such overruns are a problem. In the WebRTC context, we typically solved this with counters - a counter of how many samples have been processed and one of how many samples did not finish processing within their cycle should be sufficient; if the latter rises above zero, you know you're in trouble.
it should be processing not completing within the cycle time
Yes. This is basically how the "averageCapacity" is calculated. This capacity might be related to CPU usage to a certain degree, but they clearly have different meanings.
a counter of how many samples have been processed and of how many samples did not finish processing in its cycle should be sufficient
This is something we can already do right now, using audio worklets. The problem is that you only get to know when you're already overloading - there's no way to see how much headroom you still have before overloading would occur. For us this is critical; we want to get a solid understanding of how much headroom for additional processing we have on the client's machine.
I suppose the better way is to measure the raw execution time for the rendering of one audio block. Since the sample rate and block size are known, this should give a direct "load percentage".
I suppose the better way is to measure the raw execution time for the rendering of one audio block. Since the sample rate and block size is known, this should give a direct "load percentage".
Just a note here that, additionally to @hoch proposal here, https://github.com/WebAudio/web-audio-api/issues/2413 can be of interest to you if you want to record time yourself. The current status is that it's going to be added (see my last message there).
The feature in this issue is still nice to have as a more global load metric that includes native nodes, which can be significant in terms of load (HRTF panning, Convolver, but also an accumulation of cheaper nodes in a complex processing graph, etc.).
During development (meaning that it doesn't eliminate the need for the two things just mentioned), both Firefox and Chrome have profiling capabilities that allow drilling down and getting the execution time of each process call, and, depending on the browser, other metrics such as a histogram of callback times, etc.
https://web.dev/profiling-web-audio-apps-in-chrome/ and https://blog.paul.cx/post/profiling-firefox-real-time-media-workloads/ for Chrome and Firefox, respectively.
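The "load percentage" computation described above (measured execution time divided by the block's real-time budget) can be sketched as (the function name is hypothetical):

```javascript
// One render quantum of `blockSize` frames at `sampleRate` Hz must
// complete within blockSize / sampleRate seconds of wall-clock time;
// the ratio of the measured render time to that budget is the load.
function loadPercentage(renderTimeMs, blockSize, sampleRate) {
  const budgetMs = (blockSize / sampleRate) * 1000;
  return (renderTimeMs / budgetMs) * 100;
}
// e.g. a 128-frame block at 48 kHz has a budget of about 2.67 ms;
// rendering it in half that time means roughly 50% load.
```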
The Web Audio API defines audio production apps, such as wave editors and digital audio workstations, as a supported use case. And it's indeed quite adequate for the task!
One common trait of these applications is that they accept virtually unlimited user content, which essentially always results in hitting the limit of audio processing capabilities on a given machine at some point -- i.e. there is so much audio content that the audio thread simply cannot keep up with processing everything, and it becomes overloaded, no matter how well it is optimized.
I believe this is widely and well understood in the audio production industry, and usually the solution to help the user avoid overloading is displaying a warning indicator of some kind, letting the user know the audio processing thread is about to be overloaded and audio glitching will occur (so the user knows they should go easy on adding more content).
Note: in native apps (mostly C++ based), this is most commonly implemented as a CPU load meter (for the audio thread's core), which you can keep your eye on to know roughly how far you are from the limit.
Currently, the Web Audio API does not expose a comparable API to facilitate monitoring audio processing load, or overload.
It's possible to figure this out, mainly in special cases with above-web-standard privileges (such as an Electron app). However, it is quite difficult to get right (even from the native C++ side) without implementations taking a spec-defined standard into consideration.
I'd like to propose a small set of light, straightforward, low-privacy-implication API additions to enable this:
For obvious reasons, this is an extension to the AudioContext interface, not BaseAudioContext, as overload detection is not applicable to OfflineAudioContext processing. Having a dedicated event for this purpose avoids the need for a polling check.
It is up to implementations to decide how exactly it is determined whether the audio thread is considered overloaded or not, optionally taking the AudioContext latencyHint setting into consideration.
This would enable Web Audio API-based apps to let the user know about high audio thread load, and to display possible options or hints at steps the user can take to avoid audio glitching.
Privacy implications
Exposing this information should have little to no privacy implications, as (1) it is rarely clear why exactly the audio thread is overloaded (it could be due to low device capabilities, or high CPU use by other processes), and (2) it does not provide a more accurate way to determine device capabilities than what is already possible with a simple scripted benchmark.