Closed — youennf closed this issue 9 months ago
This came up as part of https://github.com/w3c/mediacapture-extensions/issues/39.
To me, it makes more sense that the action callback is executed first and the mute/unmute events second. If these are not done synchronously, this would mean the MediaStreamTrack muted state is the old one.
Maybe it would help for MediaSessionActionDetails to contain more information, something like:
The first bullet is also related to the question we have about how `setMicrophoneActive` and `setCameraActive` are expected to be used.
There is currently no relationship AFAIK.
> When the toggle capture action is executed, MediaStreamTrack muted state will change, which will fire mute or unmute events.
I don't think this describes any browser today.
I've confirmed this with this fiddle which reacts to the buttons in Picture-in-Picture mode (which end-users might be surprised to learn are chrome buttons).
Tracks are never muted in Chrome (easily observed by commenting out the JS that clears `audioTrack.enabled` and `videoTrack.enabled`). IOW, these chrome buttons are 100% webpage-controlled.
Chrome is the only browser to implement `togglemicrophone` and `togglecamera`, so I was unable to test Safari or Firefox.
That said, a UA COULD mute tracks here (since it can mute at any time), as I mention in https://github.com/w3c/mediasession/issues/279#issuecomment-1846023701.
The first decision such a UA would need to make would be whether to trust the webpage to maintain the states, and simply enforce them with mute/unmute.
Not trusting invites the double-mute problem.
Trust doesn't seem like a big deal in the PiP example, but that might be deceiving, as many end-users might not even consider them chrome buttons in the first place.
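If a UA did mute tracks and trusted the webpage to maintain the states, the page would keep its own state in sync by listening for the resulting events. A minimal sketch under that assumption (`watchMuted` is a made-up helper):

```js
// Hypothetical sketch: a page trusting the UA to enforce mute/unmute keeps
// its own state by observing the events MediaStreamTrack fires when the
// UA changes track.muted.
function watchMuted(track, onChange) {
  track.addEventListener("mute", () => onChange(true));
  track.addEventListener("unmute", () => onChange(false));
}
```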
If we instead consider Safari's pause feature in the URL bar:
...then it seems less obvious that the webpage should control it. E.g. Safari might wish to A) keep it as a double-mute, or B) use heuristics on unmute based on how the user muted (from the web page or the chrome).
To that end, it might be useful to fail the setters and return a promise to allow prompting. E.g. (proposal):

```js
// API modification proposal
try {
  await navigator.mediaSession.setMicrophoneActive(true);
  // unmute succeeded
} catch (e) {
  if (e.name != "NotAllowedError") throw e;
  // unmute denied
}
```
UAs might also take transient activation into account.
> To me, it makes more sense that the action callback is executed first and the mute/unmute events second. If these are not done synchronously, this would mean the MediaStreamTrack muted state is the old one.
That seems the most deterministic, since mute/unmute may fire for other reasons.
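To illustrate why callback-first ordering is deterministic: inside the action handler, `track.muted` would still reflect the pre-action state, so the page can infer the direction of the toggle without extra action details. A sketch under that assumption (`intendedDirection` is a hypothetical helper):

```js
// Hypothetical sketch: with callback-first ordering, track.muted inside a
// togglemicrophone handler is still the old state, so a muted track means
// this toggle is an unmute, and vice versa.
function intendedDirection(track) {
  return track.muted ? "unmute" : "mute";
}
```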
> Maybe it would help for MediaSessionActionDetails to contain ... something like: ...whether the action is about muting or unmuting
This might be helpful for a webpage that has gotten out of sync, but also seems redundant until I understand how they can get out of sync.
@jan-ivar, I think we are mostly aligned, basically:
`setMicrophoneActive`/`setCameraActive` to try muting/unmuting tracks, with UA-specific privacy mitigation heuristics (transient activation, prompts...).

@steimelchrome, this is not exactly how Chrome is implementing these APIs. I am hoping it is not departing too much and that this is ok. Thoughts?
> This might be helpful for a webpage that has gotten out of sync
I do not think this is mandatory to settle this particular point; we can discuss it as a follow-up once we agree on the interaction between track muted state and the Media Session API.
Out-of-sync scenarios might happen with tracks being stopped in workers, `setActive` being asynchronous, and `getUserMedia` being called concurrently with these two. It might have been better to reuse the play/pause model instead of a single toggle, but this might be too late.
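One way a page could shrink such out-of-sync windows, as a sketch under my own assumptions rather than anything specified, is to mirror local track transitions into the session as soon as they happen (`bindMicToSession` is a made-up helper):

```js
// Hypothetical sketch: report the microphone inactive as soon as the track
// ends (e.g. stopped from a worker), rather than waiting for the next
// togglemicrophone action to discover the stale state.
function bindMicToSession(track, session) {
  track.addEventListener("ended", () => session.setMicrophoneActive(false));
}
```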
Discussed in today's Media WG meeting. Plan is to:
Minutes from 9 January 2024 Media WG meeting: https://www.w3.org/2024/01/09-mediawg-minutes.html
For VC applications it is necessary to know/specify what microphone or camera the event/action refers to. Otherwise, the API is useful only in systems that have no more than one microphone and no more than one camera.
> For VC applications it is necessary to know/specify what microphone or camera the event/action refers to. Otherwise, the API is useful only in systems that have no more than one microphone and no more than one camera.
Filed issue #317 to track this.
> When the toggle capture action is executed, MediaStreamTrack muted state will change, which will fire mute or unmute events. It would help to define whether the mute/unmute events are fired before or after the corresponding MediaSession action callback.