w3c / mediasession

Media Session API
https://w3c.github.io/mediasession/

What is the relationship between toggle microphone/camera actions and MediaStreamTrack mute/unmute events? #307

Closed youennf closed 9 months ago

youennf commented 12 months ago

When the toggle capture action is executed, MediaStreamTrack muted state will change, which will fire mute or unmute events. It would help to define whether the mute/unmute events are fired before or after the corresponding MediaSession action callback.
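One way to check the ordering in a given browser is to log both signals side by side. The following is only a sketch: `observeToggleOrdering` is a hypothetical helper name, and it assumes the browser supports the `togglemicrophone` action and actually mutes the track when the action runs, which may not be true of any current implementation.

```javascript
// Sketch only: records the order in which a page sees the "togglemicrophone"
// action callback and the track's mute/unmute events.
// `observeToggleOrdering` is a hypothetical helper, not part of any spec.
function observeToggleOrdering(mediaSession, track) {
  const log = [];
  // Capture the track's muted state at the moment each signal is observed.
  const record = (source) => {
    log.push({ source, muted: track.muted });
  };

  mediaSession.setActionHandler("togglemicrophone", () => record("action"));
  track.addEventListener("mute", () => record("mute"));
  track.addEventListener("unmute", () => record("unmute"));
  return log;
}

// In a browser, usage might look like:
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   const [audioTrack] = stream.getAudioTracks();
//   const log = observeToggleOrdering(navigator.mediaSession, audioTrack);
```

If the action callback runs first, the first log entry would have `source: "action"` and would still carry the old `muted` value.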

youennf commented 12 months ago

This came up as part of https://github.com/w3c/mediacapture-extensions/issues/39.

youennf commented 12 months ago

To me, it makes more sense that the action callback is executed first and the mute/unmute events second. If these are not done synchronously, this would mean the MediaStreamTrack muted state is the old one.

Maybe it would help for MediaSessionActionDetails to contain more information, something like:

The first bullet is also related to the question we have about how setMicrophoneActive and setCameraActive are expected to be used.

jan-ivar commented 10 months ago

There is currently no relationship AFAIK.

> When the toggle capture action is executed, MediaStreamTrack muted state will change, which will fire mute or unmute events.

I don't think this describes any browser today.

I've confirmed this with this fiddle, which reacts to the mic mute and cam mute buttons in Picture-in-Picture mode (which end-users might be surprised to learn are chrome buttons).

Tracks are never muted in Chrome (easily observed by commenting out the JS that clears audioTrack.enabled and videoTrack.enabled). IOW, these chrome buttons are 100% webpage-controlled.

Chrome is the only browser to implement togglemicrophone and togglecamera, so I was unable to test Safari or Firefox.
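As a rough illustration of the webpage-controlled behavior described above, a page might wire the two actions to `enabled` on its own tracks and report the result back through `setMicrophoneActive` and `setCameraActive`. This is a sketch, not the fiddle's code: `installPageControlledToggles` is a hypothetical name, and treating "active" as equivalent to `track.enabled` is an assumption.

```javascript
// Sketch only: the page, not the UA, decides what the toggle buttons do.
// `installPageControlledToggles` is a hypothetical helper name.
function installPageControlledToggles(mediaSession, audioTrack, videoTrack) {
  mediaSession.setActionHandler("togglemicrophone", () => {
    // The page flips `enabled`; the track's `muted` state is untouched.
    audioTrack.enabled = !audioTrack.enabled;
    // Assumption: "active" is reported as the track's enabled state.
    mediaSession.setMicrophoneActive(audioTrack.enabled);
  });

  mediaSession.setActionHandler("togglecamera", () => {
    videoTrack.enabled = !videoTrack.enabled;
    mediaSession.setCameraActive(videoTrack.enabled);
  });
}

// In a browser, usage might look like:
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
//   installPageControlledToggles(
//     navigator.mediaSession,
//     stream.getAudioTracks()[0],
//     stream.getVideoTracks()[0]
//   );
```

Removing the two `enabled` assignments would leave the buttons with no effect on the tracks, which matches the observation that the tracks are never muted by the browser.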

jan-ivar commented 10 months ago

That said, a UA COULD mute tracks here (since it can mute at any time), as I mention in https://github.com/w3c/mediasession/issues/279#issuecomment-1846023701.

The first decision such a UA would need to make would be whether to trust the webpage to maintain the mic muted and cam muted states, and simply enforce them with mute/unmute.

Not trusting invites the double-mute problem.

Trust doesn't seem like a big deal in the PiP example, but that might be deceiving, as many end-users might not even consider them chrome buttons in the first place.

If we instead consider Safari's pause feature in the URL bar:

image

...then it seems less obvious that the webpage should control it. E.g. Safari might wish to A) keep it as a double-mute, or B) use heuristics on unmute based on how the user muted (from the web page or the chrome).

To that end, it might be useful to fail the setters and return a promise to allow prompting. E.g. (proposal):

  // API modification proposal
  try {
    await navigator.mediaSession.setMicrophoneActive(true);
    // unmute succeeded
  } catch (e) {
    if (e.name !== "NotAllowedError") throw e;
    // unmute denied
  }

UAs might also take transient activation into account.
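To make the double-mute problem mentioned earlier concrete: a track only delivers media when the page has it enabled and the UA has not muted it, so undoing one of the two is not enough. The sketch below classifies the combinations; `describeMuteState` and its string labels are hypothetical names chosen for illustration.

```javascript
// Sketch only: classifies the two independent mute layers on a track.
// `enabled` is controlled by the page; `muted` is controlled by the UA.
// `describeMuteState` and the returned labels are hypothetical.
function describeMuteState(track) {
  if (!track.enabled && track.muted) return "double-muted";
  if (!track.enabled) return "page-muted";
  if (track.muted) return "ua-muted";
  return "live";
}

// A double-muted track stays silent until both layers are cleared:
//   describeMuteState({ enabled: false, muted: true })  -> "double-muted"
//   describeMuteState({ enabled: true,  muted: true })  -> "ua-muted"
//   describeMuteState({ enabled: true,  muted: false }) -> "live"
```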

jan-ivar commented 10 months ago

> To me, it makes more sense that the action callback is executed first and the mute/unmute events second. If these are not done synchronously, this would mean the MediaStreamTrack muted state is the old one.

That seems the most deterministic, since mute/unmute may fire for other reasons.

> Maybe it would help for MediaSessionActionDetails to contain ... something like: ...whether the action is about muting or unmuting

This might be helpful for a webpage that has gotten out of sync, but also seems redundant until I understand how they can get out of sync.

youennf commented 10 months ago

@jan-ivar, I think we are mostly aligned, basically:

@steimelchrome, this is not exactly how Chrome is implementing these APIs. I am hoping it does not depart from Chrome's behavior so much that it becomes a problem. Thoughts?

> This might be helpful for a webpage that has gotten out of sync

I do not think this is mandatory to settle this particular point. We can discuss it as a follow-up once we agree on the interaction between track muted and the media session API.

Difficult-to-sync or out-of-sync scenarios might happen with tracks being stopped in workers, setActive being asynchronous, and getUserMedia being called concurrently with these two. It might have been better to reuse the play/pause model instead of a single toggle, but this might be too late.

youennf commented 10 months ago

Discussed in today's media WG meeting. Plan is to:

chrisn commented 10 months ago

Minutes from 9 January 2024 Media WG meeting: https://www.w3.org/2024/01/09-mediawg-minutes.html

guidou commented 9 months ago

For VC applications it is necessary to know/specify what microphone or camera the event/action refers to. Otherwise, the API is useful only in systems that have no more than one microphone and no more than one camera.

guidou commented 9 months ago

> For VC applications it is necessary to know/specify what microphone or camera the event/action refers to. Otherwise, the API is useful only in systems that have no more than one microphone and no more than one camera.

Filed issue #317 to track this.

youennf commented 9 months ago

Fixed by https://github.com/w3c/mediasession/pull/313 and https://github.com/w3c/mediasession/pull/312