w3c / mediasession

Media Session API
https://w3c.github.io/mediasession/

Figure out the coupling between audio focus/session, audio playback and remote control events #9

Closed foolip closed 9 years ago

foolip commented 9 years ago

This has big implications for the shape of the API.

Android:

iOS:

CC @sicking, @jernoble, @richtr, @marcoscaceres. Anyone else?

foolip commented 9 years ago

@jernoble, you clarified this in https://github.com/whatwg/media-keys/issues/1#issuecomment-73090289, and some more pedantic detail on the iOS coupling of these issues would be helpful. Specifically:

  1. Is it possible to activate and deactivate an audio session (thus interrupting other apps) without playing any audio?
  2. When an audio session is interrupted, is any audio automatically paused?
  3. Is ducking automatic and does an app know when its audio is being ducked?
  4. Is an active audio session enough to receive remote control events, or must there also be audio playback?
  5. Does starting audio playback implicitly activate an audio session?
  6. It looks like AVAudioSession is a per-app singleton. Would this make it difficult to have something more fine-grained than a per-tab audio focus/session concept?

There are probably more nuances I'm not aware of. Basically, I'd like to learn which kind of Web platform API would be possible to implement in iOS Safari, assuming it uses no private APIs that are unavailable to other apps.

jernoble commented 9 years ago

@foolip

(It's not clear to me if “Now Playing” means having an active audio session, or also having a playing media player.)

The former. The higher-level APIs like AVPlayer and MPMoviePlayerController will set up and activate an audio session on your behalf.

  1. Is it possible to activate and deactivate an audio session (thus interrupting other apps) without playing any audio?

Yes, modulo priority. A music-playing app will not be allowed to interrupt a phone call, for example.

  2. When an audio session is interrupted, is any audio automatically paused?

If you interact with an audio session through an AVPlayer or MPMoviePlayerController, then yes, they will pause when they receive an audio session interruption notification. If you interact with an audio session through an AVAudioSession, it's up to you to implement a "pause" behavior when you receive said notification.
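The difference between the two paths described above can be sketched as a toy model. This is TypeScript, not the iOS API: `ModelAudioSession`, `HighLevelPlayer`, and `LowLevelPlayback` are hypothetical names invented for illustration. The sketch only tracks what each client believes its own state is; it makes no claim about whether the system would still let audio through after an interruption.

```typescript
// Toy model only: every name here is hypothetical and is NOT the iOS API.
type InterruptionListener = () => void;

class ModelAudioSession {
  private listeners: InterruptionListener[] = [];

  // Register a callback for the "session interrupted" notification.
  onInterruption(listener: InterruptionListener): void {
    this.listeners.push(listener);
  }

  // Simulate the system interrupting this session, e.g. another app takes over.
  interrupt(): void {
    for (const listener of this.listeners) {
      listener();
    }
  }
}

// Stands in for a high-level player: it subscribes and pauses on its own.
class HighLevelPlayer {
  playing = false;

  constructor(session: ModelAudioSession) {
    session.onInterruption(() => {
      this.playing = false;
    });
  }

  play(): void {
    this.playing = true;
  }
}

// Stands in for low-level playback: nothing pauses it unless the app
// registers its own interruption handler.
class LowLevelPlayback {
  playing = false;

  start(): void {
    this.playing = true;
  }

  pause(): void {
    this.playing = false;
  }
}
```

In this model, a `LowLevelPlayback` that never calls `session.onInterruption(...)` still reports `playing === true` after `session.interrupt()`, which is the "it's up to you" case.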

  3. Is ducking automatic...

Yes, because ducking is controlled by the interrupting session. E.g., a navigation app will cause the audio from a music app to be ducked when speaking turn-by-turn directions.

... and does an app know when its audio is being ducked?

No, I don't believe so.
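A toy model of the ducking behaviour as described in this answer, in TypeScript. `ModelApp` and `ModelMixer` are hypothetical names, the gain value is arbitrary, and the "ducked app is not told" part encodes the unverified belief stated above rather than a confirmed fact about iOS.

```typescript
// Toy model only: hypothetical names, not the iOS API.
class ModelApp {
  name: string;
  outputGain = 1.0; // gain applied by the system mixer, not by the app
  notifications: string[] = []; // everything the app itself has been told

  constructor(name: string) {
    this.name = name;
  }
}

class ModelMixer {
  private apps: ModelApp[] = [];

  register(app: ModelApp): void {
    this.apps.push(app);
  }

  // The interrupting app is the one that causes others to be ducked.
  beginDucking(by: ModelApp): void {
    for (const app of this.apps) {
      if (app !== by) {
        app.outputGain = 0.25; // arbitrary illustrative value
        // Assumption from the answer above: no notification is delivered.
      }
    }
  }

  // Restore the other apps once the interrupting app is done.
  endDucking(by: ModelApp): void {
    for (const app of this.apps) {
      if (app !== by) {
        app.outputGain = 1.0;
      }
    }
  }
}
```

With a music app and a navigation app registered, calling `mixer.beginDucking(navigation)` lowers the music app's `outputGain` while its `notifications` list stays empty.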

  4. Is an active audio session enough to receive remote control events, or must there also be audio playback?

No, I think an active audio session is enough, though I haven't verified this.

  5. Does starting audio playback implicitly activate an audio session?

See the first comment; only if you use a high-level playback API. If your app plays audio via a low-level audio API without activating an audio session manually, not only will that app not get remote control events, but no audio will be produced in hardware.

  6. It looks like AVAudioSession is a per-app singleton. Would this make it difficult to have something more fine-grained than a per-tab audio focus/session concept?

So this is no longer about iOS in general and instead is about WebKit on iOS specifically: I haven't seen a difficulty in implementing (non-web exposed) remote control events for

foolip commented 9 years ago

When an audio session is interrupted, is any audio automatically paused?

If you interact with an audio session through an AVAudioSession, it's up to you to implement a "pause" behavior when you receive said notification.

Does this mean that it's technically possible to continue producing audio even when your session becomes inactive? I assume that there's forced muting for phone calls, but what about when one music player is interrupted by another? (Assuming they both use the low-level audio APIs.)

Is an active audio session enough to receive remote control events, or must there also be audio playback?

No, I think an active audio session is enough, though I haven't verified this.

This makes me optimistic. Do you think a similar model for a Web-exposed API would work, where one can get remote control events by activating a session, which doesn't need to be tied to an HTMLMediaElement or AudioContext internally? (Good defaults for existing uses of HTMLMediaElement are important, of course.)

foolip commented 9 years ago

I neglected lock screen controls in my first comment. Here's what I found:

On Android, it used to be handled by RemoteControlClient, which has been deprecated in favor of MediaSession.

On iOS, it's part of Now Playing Information which appears to be tied to Remote Control Events.

I'm not certain, but it looks like coupling lock screen controls to remote control events is the way to go. As a user, I'd certainly appreciate it if any app that's playing while the screen is locked could be stopped from the lock screen.

foolip commented 9 years ago

Some more rambling ideas in https://github.com/whatwg/media-keys/blob/766130421a85e6a101c99e63570a31b5ce300323/MediaSession.md#integration-with-audiocontext-and-htmlmediaelement

@jernoble, have you been able to figure out precisely what the restrictions are on iOS? In my mind, the "An active MediaSession is required in order to start playing audio" way seems somewhat reasonable but it seems to be the reverse of what iOS does.

foolip commented 9 years ago

Closing this now. The spec as it stands answers these questions as follows: to activate a media session you need to attempt to play a media element, and only an active media session of kind "content" will get audio focus and media key events. (The transient kinds will get transient audio focus but no UI or events.)
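The rules in this closing comment can be sketched as a toy model in TypeScript. This is not the Media Session API itself: `ModelMediaSession`, `ModelMediaElement`, and `dispatchMediaKey` are hypothetical names, and the several transient kinds are collapsed into a single `"transient"` value as a simplifying assumption.

```typescript
// Toy model only: hypothetical names, not the Media Session API itself.
type SessionKind = "content" | "transient";
type Focus = "none" | "full" | "transient";

class ModelMediaSession {
  kind: SessionKind;
  active = false;
  receivedKeys: string[] = [];

  constructor(kind: SessionKind) {
    this.kind = kind;
  }

  // An inactive session has no focus; the kind decides which focus it gets.
  focus(): Focus {
    if (!this.active) {
      return "none";
    }
    return this.kind === "content" ? "full" : "transient";
  }
}

class ModelMediaElement {
  session: ModelMediaSession;

  constructor(session: ModelMediaSession) {
    this.session = session;
  }

  // Attempting playback is the only path that activates the session.
  play(): void {
    this.session.active = true;
  }
}

// Deliver a media key only to an active session of kind "content".
// Returns true if some session received the key.
function dispatchMediaKey(sessions: ModelMediaSession[], key: string): boolean {
  for (const session of sessions) {
    if (session.active && session.kind === "content") {
      session.receivedKeys.push(key);
      return true;
    }
  }
  return false;
}
```

In this model a session that has never had `play()` attempted on one of its elements reports `focus() === "none"`, and an active `"transient"` session reports transient focus but never appears in the key dispatch path.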

doomdavve commented 9 years ago

What we've gathered when testing this on iOS (using AVAudioSession and AVAudioPlayer):

The last point may force us to special case meta-data handling if we ever separate media session activation from media playback in a web exposed way.

doomdavve commented 9 years ago

Additional observations: