WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Support "interrupted" state in AudioContext #2392

Open rtoy opened 3 years ago

rtoy commented 3 years ago

Re-opening this issue. suspend() and resume() are indeed important parts of this proposal, but the "interrupted" state is important as well. During a phone call (or other period of exclusive use of the audio hardware), requests to resume() the AudioContext will always fail. An "interrupted" state signals to the web page that, as opposed to merely "suspended", all requests to resume() will reject.

Additionally, "interrupted" is a state provided by the UA, rather than script running in the page. A UA could move the state from "running" to "interrupted" to "running" again (if the interruption allows normal audio playback to resume afterwards), or from "running" to "interrupted" to "suspended" (if audio playback is not allowed to resume automatically). In the first case, the page is free to react to the "interrupted" state by pausing (e.g.) the video game, and un-pause when the state moves to "running" after the interruption ends. In the second case, when the state moves to "suspended", the page can provide UI to restart playback (and the game) after the interruption. In either case, the page can choose to continue silently during the interruption.
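The two scenarios above could be handled by a single statechange listener. A minimal sketch, assuming "interrupted" is exposed as a state value; pauseGame, resumeGame and showResumeButton are placeholder application callbacks, not part of any API:

```javascript
// Hypothetical sketch of how a page could react to the proposed
// "interrupted" state. The callbacks are placeholder app functions.
function makeStateHandler({ pauseGame, resumeGame, showResumeButton }) {
  return function onStateChange(ctx) {
    switch (ctx.state) {
      case "interrupted":
        // Exclusive use of the audio hardware (e.g. a phone call):
        // resume() would reject, so pause the game instead.
        pauseGame();
        break;
      case "running":
        // The UA ended the interruption and restarted audio itself.
        resumeGame();
        break;
      case "suspended":
        // Playback is not allowed to resume automatically;
        // offer UI so the user can restart it.
        showResumeButton();
        break;
    }
  };
}

// In a real page this would be wired up roughly as:
//   const ctx = new AudioContext();
//   const onStateChange = makeStateHandler({ pauseGame, resumeGame, showResumeButton });
//   ctx.onstatechange = () => onStateChange(ctx);
```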

"closed" is not a substitute here, as there is no mechanism for moving from "closed" to "running". "suspended" is not a substitute because there's no indication to the page why the state moved to "suspended" (did the user pause audio through UA provided hardware controls?) nor is there an indication when playback would be allowed again. The promise returned by "resume" is not a substitute, because if not immediately rejected, it would resolve at the exact same time as the UA would automatically move the state from "interrupted" to "running" in the first scenario, and would reject in the second without indicating to the page that, if they just tried again, things would work fine.

Originally posted by @jernoble in https://github.com/WebAudio/web-audio-api/issues/72#issuecomment-717368157

rtoy commented 3 years ago

See also the followup comments:

Summary: General agreement that we need this, but there are quite a few questions on what the state transitions are and what causes the transitions. These need to be finalized before we can update the spec.

jernoble commented 3 years ago

@jernoble Can you clarify the state transitions here?

Assuming we're running, the only transitions are running -> interrupted -> running if allowed. And running -> interrupted -> suspended if the interruption is over, but we're not allowed to resume playing?

That's correct.
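Under these assumptions, the allowed transitions (the existing suspend()/resume()/close() edges plus the proposed "interrupted" state) can be tabulated. This is a sketch of the thread's discussion, not normative spec text:

```javascript
// Sketch of the transition table discussed in this thread (not normative).
// "interrupted" is the proposed new state; the rest are existing
// AudioContextState values. Per the discussion, "interrupted" is only
// entered from "running", and "closed" is terminal.
const TRANSITIONS = {
  running: ["suspended", "interrupted", "closed"],
  interrupted: ["running", "suspended", "closed"], // back to running if allowed
  suspended: ["running", "closed"],
  closed: [], // no way back to running
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}
```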

Calling suspend or resume in the interrupted state causes these promises to be rejected?

resume() would reject, but suspend() would (all other things being equal) resolve. There's some wiggle room here; suspend() could also just not resolve until the interruption was over, and then move the state to "suspended", regardless of whether the system allowed playback to resume or not. Or suspend() could resolve immediately and move the state to "suspended", and resume() could block until the interruption was over. I think there's a valid developer usability question here.
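Whichever option is chosen, pages would likely wrap resume() defensively. A minimal sketch, assuming the variant where resume() rejects while the context is interrupted:

```javascript
// Minimal sketch, assuming resume() rejects while the context is
// interrupted (one of the options discussed above).
async function tryResume(ctx) {
  try {
    await ctx.resume();
    return true; // playback restarted
  } catch (e) {
    // Rejected: the interruption is still in progress. The page can
    // retry on the next statechange event, or show UI instead.
    return false;
  }
}
```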

So once we're in the interrupted state, there's no way for the user to get out of that except wait for the OS to finish the interruption?

Please keep in mind that it's actually the user that's responsible for the interruption. The user has received a phone call and the user is talking on the phone. The user has switched to another app and the user is using the audio hardware there. The user is recording a video and saving it to the user's photo library.

So to put it another way, the page must wait until the user allows the page to resume audio, mediated by the User Agent.

Presumably, we can still call close to close the context in the interrupted state?

Yes, and this would cause the state to move to "closed".

What happens if we're suspended? Can we transition to interrupted by the OS? Seems like we might want to do that to match what running->interrupted does. I guess that means when the interruption ends, we go back to suspended instead of resuming?

Or maybe we don't go to the interrupted state. When the page calls resume(), we immediately go to interrupted if the OS says we were interrupted? Then the rest of the behavior is defined by the running -> interrupted transitions above.

I'd hate to freely expose to every web page the current state of whether the user is on the phone or not, so I'd argue to only move to the "interrupted" state from "running".

guest271314 commented 3 years ago

Please keep in mind that it's actually the user that's responsible for the interruption.

I do not gather how "interrupted" state adds functionality.

The user has received a phone call and the user is talking on the phone. The user has switched to another app and the user is using the audio hardware there. The user is recording a video and saving it to the user's photo library.

GainNode can be used to reduce volume; MediaStreamTrack.enabled can be set to false; FN+F11 can pause the current tab's audio output while fielding the telephone call in a different tab; Chromium has implemented "Global Media Controls", which provides a means to pause a specific tab's audio output (except for window.speechSynthesis.speak() output, which is not routed to the "Playback" or "AudioStream" playback devices of Chrome and Firefox, respectively); or the volume of a specific device can be set directly using pactl.
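The GainNode workaround mentioned above amounts to muting in script, with no context state change. A sketch; the gainNode argument is assumed to sit between a source and ctx.destination:

```javascript
// Sketch of the GainNode mute workaround: silence output in script,
// without any AudioContext state change. Wiring (assumed):
//   const gainNode = new GainNode(ctx);
//   source.connect(gainNode).connect(ctx.destination);
function setMuted(gainNode, muted) {
  gainNode.gain.value = muted ? 0 : 1;
}
```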

guest271314 commented 3 years ago

For the use case, which is evidently for mobile ("smart") phones alone

The user has received a phone call and the user is talking on the phone. The user has switched to another app and the user is using the audio hardware there. The user is recording a video and saving it to the user's photo library.

what really appears to be needed is the ability to mute all current and future AudioContext instances in a given tab, or just to mute the tab entirely, including HTMLMediaElement playback and the Web Speech API, similar to https://www.blog.google/products/chrome/manage-audio-and-video-in-chrome/.

A given application could listen for one AudioContext's "interrupted" state and then start a different AudioContext in the same tab, thwarting the use case of taking a call in a different tab.

jernoble commented 3 years ago

GainNode can be used to reduce volume; MediaStreamTrack.enabled can be set to false; FN+F11 can pause the current tab's audio output while fielding the telephone call in a different tab; Chromium has implemented "Global Media Controls", which provides a means to pause a specific tab's audio output (except for window.speechSynthesis.speak() output, which is not routed to the "Playback" or "AudioStream" playback devices of Chrome and Firefox, respectively); or the volume of a specific device can be set directly using pactl.

You seem to be suggesting that sites which care could implement some kind of "mute" behavior themselves. That's not the intent of this API; it's not the expectation that the User Agent will indicate to the page that the user is doing something else and the page should kindly mute itself. Rather, this is telling the page that audio playback will not continue, and it can take some action in response, like pausing the game being played, suspending an animation, etc.

A given application could listen for one AudioContext's "interrupted" state and then start a different AudioContext in the same tab, thwarting the use case of taking a call in a different tab.

On the platforms I am personally familiar with, this will not succeed. That new context will also enter the "interrupted" state, as that second context cannot get access to the audio hardware.

guest271314 commented 3 years ago

You seem to be suggesting that sites which care could implement some kind of "mute" behavior themselves.

No. I am stating that I do not trust the term of art "User Agent", which suggests some algorithm written by humans will dictate when the audio from a specific application is muted or not.

The user should define that, not the "User Agent".

Since we are discussing output, or playback, AudioOutputContext describes what you describe, yet takes the process a step further to provide an API for users to absolutely control which playback or output streams do whatever they do.

Rather, this is telling the page that audio playback will not continue, and it can take some action in response, like pausing the game being played, suspending an animation, etc.

You cannot guarantee that, as I indicated with the trivial Web Speech API exemption example.

On the platforms I am personally familiar with, this will not succeed. That new context will also enter the "interrupted" state, as that second context cannot get access to the audio hardware.

I can work around whatever you think you might be specifying. Just provide the complete API: user controlled capability to mute any and all audio output, which is possible using pavucontrol GUI or pactl and pacmd on Linux.

guest271314 commented 3 years ago

There is no telling what kind of audio configuration I might cobble together for testing or experimentation. I do not want a "User Agent" (arbitrary implementation) interpreting a "will not continue" signal. I may have several APIs interacting and do not want the "User Agent" to do anything but play the audio, or whatever I tell it to do, including moving streams between devices (https://github.com/guest271314/setUserMediaAudioSource), and I do not want to be set into an irretrievable state based on some gaming-on-a-handheld requirement. Just list all streams, in a GUI if necessary; then the user can mute, increase or decrease the volume, change stream sources and outputs, et al., without any algorithm language necessary besides access to the devices via the OS or third-party modules capable of achieving that goal, e.g., https://github.com/Siot/PaWebControl.

"interrupt" does not go far enough. Just expose what is actually occurring at the OS level and skip the middle "interrupt" signal. Either mute, unmute, increase volume, decrease volume, move streams, create virtual streams, etc., directly by user action, not by "a site"; the "User Agent" just implements the API and a consistent GUI for total control of audio. Instead of piecemeal, "smart"-phone, handheld, battery-usage-centric APIs, specify OS-comparable control in the browser for audio playback, recording, input, output, and virtual configurations, and be done with the matter.

rtoy commented 3 years ago

Teleconf: set priority to 2 for now.

rtoy commented 3 years ago

Virtual F2F 2021: We agree that having an "interrupted" state makes sense. However, we need more details on exactly how this is intended to work to be able to specify this properly.

Christoph Guttandin mentioned that Safari sets the interrupted state if an AudioContext is silent and you switch tabs. When you switch back, it resumes. But if the context is playing, this doesn't happen. This all needs to be clarified.

hoch commented 1 year ago

2023 TPAC Audio WG Discussion: The group has not reached a conclusion yet, but we will check what WebKit does for the interrupted state and consider finding common ground with the AudioSession API proposal.

chrisguttandin commented 12 months ago

I built a very simple demo which plays a sine wave and logs the state of the AudioContext. It also allows selecting the AudioSessionType (if supported) to check whether that makes any difference.

https://stackblitz.com/edit/js-3w4eqh

As far as I can tell, the audio stops in any case on iOS 16.6 when the screen gets switched off. The state of the AudioContext will then be 'interrupted'.

However, there are subtle differences. When the type of the AudioSession is 'auto' or 'playback', the AudioContext gets resumed automatically when the screen is turned back on. That doesn't seem to happen when the type of the AudioSession is 'transient'. In that case the state of the AudioContext is 'suspended'.
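The AudioSession part of such a demo can be sketched as follows. Hedged: the AudioSession API is still a proposal, only available in some WebKit builds, and the helper takes the navigator-like object as a parameter purely so it can be exercised outside a browser:

```javascript
// Sketch of applying an AudioSession type before creating an
// AudioContext. The AudioSession API is a proposal; feature-detect it.
// `nav` is injected so the helper can run outside a browser.
function applyAudioSessionType(nav, type) {
  if (nav.audioSession) {
    nav.audioSession.type = type; // e.g. "auto", "playback", "transient"
    return true;
  }
  return false; // unsupported: the UA keeps its default behavior
}

// In the browser:
//   applyAudioSessionType(navigator, "playback");
//   const ctx = new AudioContext();
```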

@jyavenard Is that expected or should I file a bug for that?

jyavenard commented 12 months ago

It sounds like https://bugs.webkit.org/show_bug.cgi?id=261554

and yes, there’s an issue as the behaviour isn’t consistent between macOS and iOS.

But only the 'playback' type will allow audio to be played while the page is suspended, similar to how audio/video elements work.

guest271314 commented 12 months ago

For clarity, is this issue about the behaviour of AudioContext on mobile devices, and is it focused on Apple (WebKit) devices?

gabrielsanbrito commented 3 weeks ago

Hello! I have written an explainer to make progress on the "interrupted" state proposal. I tried to incorporate as much of this issue's discussion into the document as possible. Moreover, I am aware that WebKit has implemented this, but I am not sure to what degree the proposal is interoperable with that implementation. I appreciate any feedback you may have. Thanks!