Enforcing user gesture for getUserMedia

youennf commented 4 years ago

Ideally getUserMedia would require a user gesture, as getDisplayMedia does. This is not web-compatible, as many pages call getUserMedia on page load or shortly after. It would still be nice to define web-compatible heuristics under which a user gesture could be enforced.

youennf commented 4 years ago

One case where we could enforce user gesture is the following:

User denied access to getUserMedia for a page, for instance at page load time or when entering a chat room
User wants to grant microphone access and clicks a button on the web page
getUserMedia is called as part of the button click handler.

We could try to enforce that a user gesture is required to call getUserMedia whenever the user previously denied getUserMedia access for a given page.

jan-ivar commented 4 years ago

We could try to enforce that a user gesture is required to call getUserMedia whenever the user previously denied getUserMedia access for a given page.

This (is permission-model specific, but) would be more permissive than today, where Safari and Firefox block temporarily, while Chrome blocks permanently.

Why would we remove the only disincentive to spamming on pageload?

To me, the case for user activation has to do with navigating to a room on a site already granted persistent permission. An interim solution there would be to ignore persistent grants when user activation is not used (always prompt). This would let sites migrate on their own time.

jan-ivar commented 4 years ago

Complication: I'm sensitive to the number of clicks it takes to get into a call (one vs two), and would like to not penalize privacy sensitive users more (or browsers with privacy sensitive defaults).

How about something like:

"The user agent MUST ignore persisted permissions unless the method was triggered by user activation"?

This would mitigate the live cam room dive, but also be highly backwards compatible (more prompts, but interestingly the same number of clicks).

alvestrand commented 4 years ago

If the browser supports querying for whether you have permission or not, the page can tell that it's not getting the device because permission is "denied" - at least when it's for all devices. In that case, it can point to browser UI (sigh) to reenable camera.

(Without enumerateDevices, the page can't find the ID to figure out which specific devices to ask about, so if an individual device is denied, the page has a Hard Problem.)

If any page action allows the page to prompt after being initially denied, this offers the possibility of trapping the user in "request until allowed" loops - which is a Bad Thing. No needs to mean no.
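The permission check described above could look roughly like the following. This is a minimal sketch, not a recommendation: the helper name `cameraPermissionState` and the "unknown" fallback value are assumptions made for illustration, and the `permissions` argument stands in for `navigator.permissions` so that the logic does not depend on a particular browser.

```javascript
// Hypothetical helper: ask the Permissions API about the camera and fall
// back to "unknown" when the browser cannot answer.
// `permissions` is expected to behave like navigator.permissions.
async function cameraPermissionState(permissions) {
  try {
    const status = await permissions.query({ name: "camera" });
    // status.state is "granted", "denied" or "prompt".
    return status.state;
  } catch (err) {
    // Browsers that do not support querying "camera" reject the query.
    // "unknown" is an assumed placeholder, not a value defined by any spec.
    return "unknown";
  }
}
```

In a page this would be called as `cameraPermissionState(navigator.permissions)`. A result of "denied" is the case where the page can only point the user to browser UI.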

dontcallmedom commented 4 years ago

VI discussion: this behavior is user-agent dependent at this stage; there is no interest in standardizing it for now.

youennf commented 4 years ago

Reopening issue. I think it is not a blocker for WebRTC 1.0, but working on adding such a restriction (for temporary permissions or in the room-dive-in case) seems beneficial.

youennf commented 4 years ago

"The user agent MUST ignore persisted permissions unless the method was triggered by user activation"?

We could restrict this to something like: "The user agent MUST ignore any persisted granted permission unless the method was triggered by user activation"
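One way to read the restricted wording is as a small decision table. The sketch below models the user agent's choice as a pure function. The function name, the state strings, and the assumption that a persisted denial keeps applying are all illustrative and are not spec text.

```javascript
// Hypothetical model of the proposed rule; names and values are assumptions.
// persisted: "granted" | "denied" | "none" (stored decision for the origin)
// hasUserActivation: whether getUserMedia was triggered by user activation
function decideOutcome(persisted, hasUserActivation) {
  if (persisted === "denied") {
    // Assumption: the restricted wording leaves persisted denials untouched.
    return "reject";
  }
  if (persisted === "granted" && hasUserActivation) {
    // A persisted grant is honored only when the call has user activation.
    return "grant";
  }
  // Either nothing is stored, or the stored grant is ignored because the
  // call was made without user activation.
  return "prompt";
}
```

Under this reading, a site that calls getUserMedia on page load with a persisted grant sees a prompt, while the same call made from a click handler is granted without one.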

youennf commented 4 years ago

If the browser supports querying for whether you have permission or not, the page can tell that it's not getting the device because permission is "denied" - at least when it's for all devices.

FWIW, we are contemplating allowing a web page to know whether calling getUserMedia will trigger a prompt or be granted. I am not sure we want to expose 'denied', as this could be a great fingerprinting vector; we would report 'prompt' for these pages instead.

alvestrand commented 4 years ago

My requirement is that if permission has been previously granted, it MUST be possible to enter a videoconference without a prompt. I don't know whether current VC products switch to a new page when starting or not, and whether the user gesture the user used to enter the conference is "used up" by the page switch or is still available to be consumed by the starting of the input devices. This needs to be clear.

youennf commented 4 years ago

My requirement is that if permission has been previously granted, it MUST be possible to enter a videoconference without a prompt.

That seems fine by me if we add the following words to your sentence: "as long as the user clicks on the page".

Is this fine with you?

I don't know whether current VC products switch to a new page when starting or not,

From my experience of websites supporting Safari, they do not switch to a new page. The reason might be that a navigation is time-consuming and briefly shows a blank page. In Safari, this would also reprompt users. It is usually best practice to request access to privileged resources as close as possible to the time they will actually be used.

Several websites supporting Safari call getUserMedia when loading the page, or shortly after, without a user gesture. This usually ends up triggering a prompt in Safari, unless users opt in to always grant. To remain unprompted when the user grants persistent access, these websites would have to adapt, for instance by using one of the following flows:

Call getUserMedia at the time the user clicks the 'enter call' button.
Start the call with camera/microphone muted and have some UI giving an incentive for the user to share mic/camera.

Some websites like whereby.com would have nothing to change.
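The first flow above can be sketched as follows. This is only an illustration under stated assumptions: `wireEnterCallButton`, `onStream` and `onError` are hypothetical names, and `button` and `mediaDevices` are passed in as parameters (in a page they would be a DOM element and `navigator.mediaDevices`).

```javascript
// Hypothetical helper: request capture inside the click handler of the
// 'enter call' button, so the request is triggered by a user gesture.
function wireEnterCallButton(button, mediaDevices, onStream, onError) {
  button.addEventListener("click", async () => {
    try {
      const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
      onStream(stream);
    } catch (err) {
      // For example NotAllowedError when the user or user agent denies access.
      onError(err);
    }
  });
}
```

Nothing is requested at page load here; getUserMedia runs only once the user clicks.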

This needs to be clear.

As for determining whether a user gesture is still available after a page switch, I am not sure that the definition of 'user gesture' is consistent across browsers, so I am not sure we can be very precise one way or the other. Safari's implementation would not allow this, for instance. I don't know what other browsers do. I know some ideas (maybe implementations as well) have been discussed to pass a user gesture through postMessage.

youennf commented 4 years ago

@jan-ivar mentioned the potential issue of websites like jsfiddle getting persistent camera access, which would let any application potentially get camera access by navigating to a specific jsfiddle. This seems especially easy if the Permissions API returns 'granted'.

youennf commented 3 years ago

Iceboxed PR: https://github.com/w3c/mediacapture-main/pull/666

q-alex-zhao commented 3 years ago

Question:

The user agent MUST ignore any persisted granted permission unless the method was triggered by user activation.

How should this interact with the Permissions API? I think the user agent should not resolve "granted" via Permissions.query() if it's ignoring persisted granted permission, right? Just making sure...

jan-ivar commented 3 years ago

@q-alex-zhao Using Permissions.query() to prime users works poorly in any browser but Chrome. Using a cookie is a more reliable way to detect users who need priming on how prompts work. See https://github.com/mozilla/standards-positions/issues/19#issuecomment-370087341

How should this interact with the Permissions API? I think the user agent should not resolve "granted" via Permissions.query() if it's ignoring persisted granted permission, right? Just making sure...

In Firefox, we do not implement query for camera and microphone for the reason I gave. If we end up doing so, our plan is to return "granted" if the user has granted access to this site in the recent past, regardless of whether there will be a prompt or not.

By default, our prompts do not persist access, or permanently block your site if the user declines, and users who have granted access in the past are much less likely to block permission to sites, or be confused about how prompts work.

q-alex-zhao commented 3 years ago

our plan is to return "granted" if the user has granted access to this site in the recent past, regardless of whether there will be a prompt or not

Wouldn't that be counter-intuitive, when the Permissions API says one thing and the actual browser behavior is the opposite?

Or maybe the user gesture requirement should work similarly to the focus requirement, i.e. have getUserMedia wait for user gesture?

jan-ivar commented 3 years ago

Wouldn't that be counter-intuitive, when the Permissions API says one thing and the actual browser behavior is the opposite?

@q-alex-zhao The opposite of "granted" is not "prompt". I've opened https://github.com/w3c/permissions/issues/230 to clarify.

Or maybe ... have getUserMedia wait for user gesture?

That's an interesting idea, but it might not be obvious to users that others cannot see them, or that the fix is to click anywhere on the page.
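For illustration, a page can approximate "wait for user gesture" itself by deferring the call until the next click. This is a page-side sketch, not the user-agent behavior being discussed; `getUserMediaOnNextGesture` is a hypothetical name, and `target` and `mediaDevices` are passed in (in a page, for example `document` and `navigator.mediaDevices`).

```javascript
// Hypothetical helper: defer getUserMedia until the next click on `target`.
// Returns a promise that settles with the outcome of getUserMedia.
function getUserMediaOnNextGesture(target, mediaDevices, constraints) {
  return new Promise((resolve, reject) => {
    const onClick = () => {
      // Listen only once, then request capture from inside the click handler.
      target.removeEventListener("click", onClick);
      mediaDevices.getUserMedia(constraints).then(resolve, reject);
    };
    target.addEventListener("click", onClick);
  });
}
```

The concern raised above still applies to this sketch: until the click happens the promise stays pending, so the page needs visible UI telling the user that capture has not started.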

q-alex-zhao commented 3 years ago

Another edge case came up:

Suppose the page has already started capturing media from a Camera1 device.
After some time the user decides to unplug Camera1.
There is also a Camera2 present on the system.
If the application were to call getUserMedia to continue capture from Camera2, should this prompt? There's no "user gesture" in the traditional desktop usage sense, and the unplugging of Camera1 may not be originating from an actual user action, since maybe the wiring is faulty...

Thanks in advance for the clarification.

jan-ivar commented 2 years ago

After some time the user decides to unplug Camera1.

There is also a Camera2 present on the system.

If the application were to call getUserMedia to continue capture from Camera2, should this prompt?

@q-alex-zhao Any gesture requirement isn't tied to prompts, but to the request (which might be granted without prompt).

So when you ask "should this prompt?", do you mean "should this prompt or automatically grant capture from Camera2, depending on whether the site has permission"? So far, only Firefox has per-device permissions, so the latter would be common.

We don't know why the user unplugged Camera1, but it may be a strong signal to stop capture. So requiring user gesture sounds like it would be an improvement in that case (to avoid someone thinking they can't be seen when they can, because they didn't unplug all their cameras).
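From the application side, the unplug case could be handled along these lines. This is a sketch under assumptions: `watchForCameraLoss` and `showSwitchCameraPrompt` are hypothetical names, and the code relies only on the `ended` event that a MediaStreamTrack fires when its source goes away.

```javascript
// Hypothetical helper: when the captured track ends (for example because
// the camera was unplugged), surface UI rather than silently requesting
// another camera.
function watchForCameraLoss(track, showSwitchCameraPrompt) {
  track.addEventListener("ended", () => {
    // Deliberately no getUserMedia call here: the follow-up request is
    // expected to come from a click on the UI shown by this callback.
    showSwitchCameraPrompt();
  });
}
```

The click on that UI then provides the user gesture for the follow-up getUserMedia call targeting Camera2.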

w3c / mediacapture-extensions

Enforcing user gesture for getUserMedia #11