w3c / mediacapture-output

API to manage the rendering of audio on any audio output device
https://w3c.github.io/mediacapture-output/

Switch to transient activation. #118

Closed jan-ivar closed 3 years ago

jan-ivar commented 3 years ago

Fixes https://github.com/w3c/mediacapture-output/issues/107. cc @domenic for the link to activation notification to see if that's right.



domenic commented 3 years ago

Hmm, this seems pretty irregular... the phrasing in https://html.spec.whatwg.org/multipage/interaction.html#activation-notification seems to imply that activation notification is for "activation triggering input events" and not for specs to fire at will.

Is this how things are implemented? Is the intention that the user choosing an output device will automatically allow the page to do other activation-gated things like window.open() or fileInput.click()?

youennf commented 3 years ago

The idea is as follows:

domenic commented 3 years ago

It sounds like in that case you need a different mechanism.

E.g.: give each Window a "is finishing select audio output" boolean, separate from user activation. And set that to true after the user makes their selection, and false after some number of tasks and/or milliseconds. Then, in any place that plays audio on the platform, allow that boolean to override the user activation requirement.
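To make the suggestion above concrete, here is a minimal sketch of such a per-Window flag. It is only an illustration: `WindowModel`, its method names, and the grace period parameter are hypothetical and do not come from any spec or from this PR.

```typescript
// Hypothetical model of the proposed per-Window flag; no name here is from a spec.
class WindowModel {
  // Stands in for the page's ordinary transient user activation.
  hasTransientActivation = false;

  // Time (ms) until which "is finishing select audio output" counts as true.
  private finishingSelectUntilMs = Number.NEGATIVE_INFINITY;

  // Called when the user makes a selection in the output picker.
  // graceMs is an assumed, user-agent-chosen window.
  noteAudioOutputSelected(nowMs: number, graceMs: number): void {
    this.finishingSelectUntilMs = nowMs + graceMs;
  }

  isFinishingSelectAudioOutput(nowMs: number): boolean {
    return nowMs < this.finishingSelectUntilMs;
  }

  // Audio playback is allowed by either mechanism.
  mayPlayAudio(nowMs: number): boolean {
    return this.hasTransientActivation || this.isFinishingSelectAudioOutput(nowMs);
  }

  // Other activation-gated features, e.g. window.open(), ignore the flag.
  mayOpenPopup(): boolean {
    return this.hasTransientActivation;
  }
}
```

The point of keeping the flag separate is that choosing an output device unlocks audio playback only, not window.open() or fileInput.click().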

jan-ivar commented 3 years ago

> "activation triggering input events"

@domenic This is a promise resolved by user interaction. Is that not covered?

I wonder if there's a general problem here with (permission) prompts interfering with transient activation, since it takes time for users to answer prompts. Was this considered when we switched to a time-based activation model?

We might want to avoid creating situations where things may work fine as long as permission is persisted, but break if there's a prompt. E.g. This might make it impossible to put an existing feature behind permission in a web extension.
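As a rough illustration of the concern, the sketch below models time-based activation the way the HTML Standard describes it: sticky activation holds once the user has interacted at all, and transient activation holds only for a limited window after the last interaction. The 5000 ms duration and the function names are assumptions for illustration; the actual duration is user-agent defined.

```typescript
// Assumed value; the HTML Standard leaves the duration to the user agent.
const TRANSIENT_ACTIVATION_DURATION_MS = 5000;

// lastActivationMs is the time of the last activation-triggering input event,
// or Infinity if the user has never interacted with the page.
function hasStickyActivation(nowMs: number, lastActivationMs: number): boolean {
  return nowMs >= lastActivationMs;
}

function hasTransientActivation(nowMs: number, lastActivationMs: number): boolean {
  return (
    nowMs >= lastActivationMs &&
    nowMs < lastActivationMs + TRANSIENT_ACTIVATION_DURATION_MS
  );
}

// Whether activation-gated work is still allowed once a request settles.
// promptDelayMs is 0 for a persisted permission and larger when the user
// has to answer a prompt first.
function stillTransientlyActiveAfter(clickMs: number, promptDelayMs: number): boolean {
  return hasTransientActivation(clickMs + promptDelayMs, clickMs);
}
```

Under these assumptions, a request that resolves immediately stays inside the window, while one that waits on a prompt for longer than the duration does not, even though sticky activation would still hold.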

> E.g.: give each Window a "is finishing select audio output" boolean, separate from user activation.

Sure, but that sounds like recreating a transient activation mechanism for media. We'd probably want a broader name, and reuse this for getDisplayMedia, as well as getUserMedia pending https://github.com/w3c/mediacapture-extensions/issues/11.

domenic commented 3 years ago

> @domenic This is a promise resolved by user interaction. Is that not covered?

No; as you can see from the relevant sections of the HTML Standard, there is no special treatment for promises.

> I wonder if there's a general problem here with (permission) prompts interfering with transient activation, since it takes time for users to answer prompts. Was this considered when we switched to a time-based activation model?

Yes. Existing APIs usually handle this in a few ways:

You seem to have a new special-case situation where you want user activation required both for showing the permission request and for using the resulting granted permission afterward. If you want to share the user activation across both things, then a special-case mechanism like I described above makes the most sense.

> Sure, but that sounds like recreating a transient activation mechanism for media. We'd probably want a broader name, and reuse this for getDisplayMedia, as well as getUserMedia pending w3c/mediacapture-extensions#11.

Maybe, if those things also require a second user activation to start playing the media. I'd be surprised if they did; I suspect that if the user has granted you screen sharing or video call permissions or whatever, that the browser isn't going to need a further user activation restriction on actually using those. But this is outside my domain and you would know better.

jan-ivar commented 3 years ago

> You seem to have a new special-case situation where you want user activation required both for showing the permission request and for using the resulting granted permission afterward.

It's not a special case, but one that arises naturally from connecting a source that requires transient activation to obtain, to a sink that requires autoplay. The latter isn't standardized, so we don't have a way to define interactions with it.

> ... getDisplayMedia, as well as getUserMedia ...

> Maybe, if those things also require a second user activation to start playing the media. I'd be surprised if they did; I suspect that if the user has granted you screen sharing or video call permissions or whatever, that the browser isn't going to need a further user activation restriction on actually using those.

Yes, all browsers appear to waive the need to click on the page to play audio if the page has been granted camera or microphone permission. But it's not written down anywhere, and it's not obvious (it lets me play any sound, not just "those").

Since users shouldn't need to grant mic permission to switch speaker outputs, this spec defines a new in-browser picker API that requires transient activation to invoke, and it is trying to say that autoplay must work afterwards.
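For readers less familiar with the flow being discussed, here is a minimal sketch of it: pick an output, route an element to it with setSinkId(), then play. `OutputPicker`, `AudioSink`, and `pickOutputAndPlay` are hypothetical names introduced so the sketch does not depend on DOM typings; only `selectAudioOutput()`, `setSinkId()`, and `play()` are real API names.

```typescript
// Minimal structural types standing in for MediaDevices and HTMLMediaElement.
interface OutputPicker {
  selectAudioOutput(): Promise<{ deviceId: string }>;
}

interface AudioSink {
  setSinkId(sinkId: string): Promise<void>;
  play(): Promise<void>;
}

// Meant to be called from a click handler, since the picker requires
// transient activation to invoke.
async function pickOutputAndPlay(
  picker: OutputPicker,
  audio: AudioSink
): Promise<string> {
  // May show a prompt, so this can take longer than the activation window.
  const device = await picker.selectAudioOutput();
  await audio.setSinkId(device.deviceId);
  // Whether this play() succeeds without a further user gesture is the
  // open autoplay question in this thread.
  await audio.play();
  return device.deviceId;
}
```

In a browser that implements the API, this would presumably be called as `pickOutputAndPlay(navigator.mediaDevices, audioElement)` from a click handler; TypeScript's DOM typings may not declare `selectAudioOutput()`, in which case a cast is needed.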

We don't know if autoplay is sticky, and we don't know whether the transient activation duration is long enough for a user to have time to respond to a prompt.

I suggest we remove the problematic sentence for now and leave a note about autoplay.

youennf commented 3 years ago

> This is a promise resolved by user interaction.

Not always; the UA may decide to resolve the promise itself. Even without a user prompt, we still want media elements to be able to start playing.

I would prefer to not rely on activation duration which can be fragile/flaky and instead define precise semantics that always work.

jan-ivar commented 3 years ago

> I would prefer to not rely on activation duration which can be fragile/flaky and instead define precise semantics that always work.

We don't know that we are relying on it. It seems premature to be precise since we're guessing how autoplay is implemented.

Autoplay may very well be defined to be sticky, which appears to be the case in all implementations I've tested, in which case we don't need to say anything. It might be better to note our intent here for whoever ends up specifying autoplay.

But we do know we want this API under transient activation, so I'd like to unblock this PR since we're working on code to that effect in Firefox.

youennf commented 3 years ago

> which appears to be the case in all implementations I've tested

In Safari, autoplay is not sticky in general. We did carve out an exception when the document is capturing audio and/or video.

> But we do know we want this API under transient activation, so I'd like to unblock this PR since we're working on code to that effect in Firefox.

Sure. I would prefer to stick with some normative wording about allowing autoplay, even imprecise. A note could acknowledge the wording is imprecise. For instance, we could refer to https://html.spec.whatwg.org/multipage/media.html#allowed-to-play.

jan-ivar commented 3 years ago

> I would prefer to stick with some normative wording about allowing autoplay, even imprecise. A note could acknowledge the wording is imprecise. For instance, we could refer to https://html.spec.whatwg.org/multipage/media.html#allowed-to-play.

@youennf Ok, I've updated it to do that. PTAL. I added a similar reference to web audio. cc @padenot to see if that works.

domenic commented 3 years ago

Looks good from my perspective, FWIW.