@jernoble, you clarified a few things in https://github.com/whatwg/media-keys/issues/1#issuecomment-73090289, but some more pedantic detail on the iOS coupling of these issues would be helpful. Specifically: `AVAudioSession` is a per-app singleton. Would this make it difficult to have something more fine-grained than a per-tab audio focus/session concept? There are probably more nuances I'm not aware of; basically, I'd like to learn what kind of Web platform API would be possible to implement in iOS Safari, assuming no private APIs that other apps don't have.
@foolip
(It's not clear to me if “Now Playing” means having an active audio session, or also having a playing media player.)
The former. The higher-level APIs like AVPlayer and MPMoviePlayerController will set up and activate an audio session on your behalf.
- Is it possible to activate and deactivate an audio session (thus interrupting other apps) without playing any audio?
Yes, modulo priority. A music-playing app will not be allowed to interrupt a phone call, for example.
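For concreteness, activating and deactivating a session without ever playing anything looks roughly like this (a sketch using the modern Swift `AVAudioSession` API, which postdates this thread; treat it as illustrative only):

```swift
import AVFoundation

// Take audio focus without producing any audio.
func takeAudioFocus() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback)
    try session.setActive(true)   // interrupts other apps' audio, subject to priority
}

// Give focus back, letting interrupted apps know they may resume.
func giveUpAudioFocus() throws {
    try AVAudioSession.sharedInstance().setActive(false, options: .notifyOthersOnDeactivation)
}
```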
- When an audio session is interrupted, is any audio automatically paused?
If you interact with an audio session through an AVPlayer or MPMoviePlayerController, then yes, they will pause when they receive an audio session interruption notification. If you interact with an audio session through an AVAudioSession, it's up to you to implement a "pause" behavior when you receive said notification.
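In other words, the low-level case is roughly this (a sketch; `myLowLevelPlayer` is a hypothetical wrapper around a low-level audio API):

```swift
import AVFoundation

// With a bare AVAudioSession, nothing is paused for us; we have to
// observe the interruption notification and pause ourselves.
let token = NotificationCenter.default.addObserver(
    forName: AVAudioSession.interruptionNotification,
    object: AVAudioSession.sharedInstance(),
    queue: .main
) { notification in
    guard let raw = notification.userInfo?[AVAudioSessionInterruptionTypeKey] as? UInt,
          let type = AVAudioSession.InterruptionType(rawValue: raw) else { return }
    switch type {
    case .began:
        myLowLevelPlayer.pause()    // our responsibility, not the system's
    case .ended:
        myLowLevelPlayer.resume()   // optionally, after checking the .shouldResume option
    @unknown default:
        break
    }
}
```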
- Is ducking automatic...
Yes, because ducking is controlled by the interrupting session. E.g., a navigation app will cause the audio from a music app to be ducked when speaking turn-by-turn directions.
... and does an app know when its audio is being ducked?
No, I don't believe so.
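So the *interrupting* app asks for ducking via its own session's category options, roughly like this (a sketch of a hypothetical turn-by-turn prompt, using the modern Swift API):

```swift
import AVFoundation

// Opt into ducking other apps' audio rather than interrupting it outright.
func speakPrompt() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback, options: .duckOthers)
    try session.setActive(true)    // other apps' audio ducks
    // ... play the short prompt ...
    try session.setActive(false, options: .notifyOthersOnDeactivation)  // their volume is restored
}
```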
- Is an active audio session enough to receive remote control events, or must there also be audio playback?
No, I think an active audio session is enough, though I haven't verified this.
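If that's right, the registration side would be player-independent, along these lines (a sketch using `MPRemoteCommandCenter`, which is newer than the APIs discussed here):

```swift
import MediaPlayer

// Remote control handlers are registered on a shared object, not on any
// particular player, which fits the "active session is enough" model.
func listenForRemoteCommands() {
    let center = MPRemoteCommandCenter.shared()
    center.playCommand.addTarget { _ in
        // start or resume playback here
        return .success
    }
    center.pauseCommand.addTarget { _ in
        // pause playback here
        return .success
    }
}
```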
- Does starting audio playback implicitly activate an audio session?
See the first comment; only if you use a high-level playback API. If your app plays audio via a low-level audio API without activating an audio session manually, not only will that app not get remote control events, but no audio will be produced by the hardware.
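The contrast with the high-level path is stark; per the first answer, the session is set up and activated on the app's behalf (a sketch with a hypothetical URL):

```swift
import AVFoundation

// High-level playback: no explicit AVAudioSession calls needed.
let player = AVPlayer(url: URL(string: "https://example.com/track.mp3")!)
player.play()
```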
- It looks like AVAudioSession is a per-app singleton. Would this make it difficult to have something more fine-grained than a per-tab audio focus/session concept?
So this is no longer about iOS in general and instead is about WebKit on iOS specifically: I haven't seen a difficulty in implementing (non-web exposed) remote control events for
When an audio session is interrupted, is any audio automatically paused?
If you interact with an audio session through an AVAudioSession, it's up to you to implement a "pause" behavior when you receive said notification.
Does this mean that it's technically possible to continue producing audio even when your session becomes inactive? I assume there's forced muting for phone calls, but what about when one music player is interrupted by another? (Assuming they both use the low-level audio APIs.)
Is an active audio session enough to receive remote control events, or must there also be audio playback?
No, I think an active audio session is enough, though I haven't verified this.
This makes me optimistic. Do you think a similar model would work for a Web-exposed API, where one can get remote control events by activating a session, which doesn't need to be tied to an `HTMLMediaElement` or `AudioContext` internally? (Good defaults for existing uses of `HTMLMediaElement` are important, of course.)
I neglected lock screen controls in my first comment. Here's what I found:
- On Android, it used to be handled by `RemoteControlClient`, which has been deprecated in favor of `MediaSession`.
- On iOS, it's part of Now Playing Information, which appears to be tied to Remote Control Events.
I'm not certain, but it looks like coupling lock screen controls to remote control events is the way to go. As a user, I'd certainly appreciate it if any app that's playing while the screen is locked can be stopped from the lock screen.
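For reference, the iOS side of those lock screen controls is fed through `MPNowPlayingInfoCenter`, roughly like this (a sketch with hypothetical metadata):

```swift
import MediaPlayer

// Metadata shown on the lock screen and in Control Center, alongside
// the remote control events discussed above.
MPNowPlayingInfoCenter.default().nowPlayingInfo = [
    MPMediaItemPropertyTitle: "Some Track",
    MPMediaItemPropertyArtist: "Some Artist",
    MPNowPlayingInfoPropertyPlaybackRate: 1.0
]
```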
Some more rambling ideas in https://github.com/whatwg/media-keys/blob/766130421a85e6a101c99e63570a31b5ce300323/MediaSession.md#integration-with-audiocontext-and-htmlmediaelement
@jernoble, have you been able to figure out precisely what the restrictions are on iOS? To my mind, the "an active MediaSession is required in order to start playing audio" approach seems somewhat reasonable, but it appears to be the reverse of what iOS does.
Closing this now. The spec as it stands answers these questions as follows: to activate a media session you need to attempt to play a media element, and only an active media session of kind "content" will get audio focus and media key events. (The transient kinds will get transient audio focus but no UI or events.)
What we've gathered when testing this on iOS (using AVAudioSession and AVAudioPlayer):
The last point may force us to special-case metadata handling if we ever separate media session activation from media playback in a web-exposed way.
Additional observations:

- Android: `registerMediaButtonEventReceiver` and a newer `MediaSession` API allow apps to handle media buttons. Both appear to be orthogonal to audio focus and audio playback.
- iOS:

This has big implications for the shape of the API.
CC @sicking, @jernoble, @richtr, @marcoscaceres. Anyone else?