w3c / mediacapture-extensions

Extensions to Media Capture and Streams by the WebRTC Working Group
https://w3c.github.io/mediacapture-extensions/

Detecting user-actionable camera issues (e.g., camera shutters) #140

Open guidou opened 1 year ago

guidou commented 1 year ago

Devices sometimes allow users to disable a camera via various means (e.g., physical camera shutters). In many cases users want to use the camera but have forgotten that they disabled it, for example with a shutter. This is often detectable at the system level via some APIs, but on the Web it is not visible except as black frames or perhaps as a muted track. Since tracks can be muted for any reason, an application cannot reliably give the user an explanation and possible ways to fix the problem. Black frames may also be produced for other, non-system-level reasons.

It would be good to address this use case on the Web platform. A possible way would be to add an optional field to the mute event with a reason for the mute. Another possibility is to change the muted property from bool to an enum, or add a new property or event to avoid breaking compatibility.
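
A minimal sketch of the "new property" option, to make the idea concrete. Everything here is an assumption for illustration: `MuteReason`, its values, `muteReason`, and `muteHint` are hypothetical names, and no such API exists in Media Capture and Streams today. The track is modelled as a plain object so the snippet runs without a browser.

```typescript
// Hypothetical: none of these names exist in any spec today.
type MuteReason = "unknown" | "camera-shutter" | "os-setting";

// Minimal stand-in for the part of a track the app would read.
interface TrackMuteState {
  muted: boolean;
  muteReason?: MuteReason; // optional, so a UA may decline to expose it
}

// Returns a hint for the user, or null when the track is not muted.
function muteHint(state: TrackMuteState): string | null {
  if (!state.muted) return null;
  switch (state.muteReason) {
    case "camera-shutter":
      return "Your camera shutter appears to be closed. Open it to be seen.";
    case "os-setting":
      return "Your camera is disabled in the system settings.";
    default:
      // Covers "unknown" and UAs that expose no reason at all.
      return "Your camera is not sending video.";
  }
}

console.log(muteHint({ muted: true, muteReason: "camera-shutter" }));
```

Keeping the reason optional and separate from `muted` means existing code that only checks the boolean keeps working, which is the compatibility concern raised above.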

eladalon1983 commented 1 year ago

This might be relevant: https://github.com/w3c/mediacapture-region/issues/9#issuecomment-1022334881 TL;DR: Another use for a MuteReason/MuteCause/mute-as-enum.

Also, this issue could benefit from a mute-cause if the approach is reshaped slightly.

eladalon1983 commented 1 year ago

Your thoughts about multiple concurrent reasons, btw? (See first link.)
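
To illustrate why concurrent reasons matter, here is a rough sketch in which a track stays muted until every active cause is cleared. `MuteCause`, its values, and `MuteState` are hypothetical names invented for this example; nothing here is specified anywhere.

```typescript
// Hypothetical cause names, for illustration only.
type MuteCause = "camera-shutter" | "os-setting";

// Tracks the set of causes currently muting one track.
class MuteState {
  private causes: MuteCause[] = [];

  add(cause: MuteCause): void {
    if (this.causes.indexOf(cause) === -1) this.causes.push(cause);
  }

  remove(cause: MuteCause): void {
    this.causes = this.causes.filter((c) => c !== cause);
  }

  // The track is muted while at least one cause remains.
  isMuted(): boolean {
    return this.causes.length > 0;
  }

  // Returns a copy so callers cannot mutate internal state.
  activeCauses(): MuteCause[] {
    return this.causes.slice();
  }
}

const state = new MuteState();
state.add("camera-shutter");
state.add("os-setting");
state.remove("camera-shutter"); // shutter opened, OS setting still active
console.log(state.isMuted(), state.activeCauses());
```

With a single-valued enum, the app could tell the user to open the shutter and still end up with a muted track; a list of causes avoids that.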

youennf commented 1 year ago

From the user's perspective, the main issue is probably when the user starts capturing with a camera that is shuttered. In that case, couldn't the UA provide the warning to the user? That would seem more robust than expecting every website to handle that case.

If the user closes the shutter in the middle of a capture, they will probably not forget to open it again when needed. I also wonder whether the user would want websites to know they activated the shutter in that case.

eladalon1983 commented 1 year ago

The difference between the UA and the app is in information.

The UA always needs to be careful to not overcommunicate to the user. Should the user be informed that a shutter is active? How prominent should this warning be? A sane warning will invariably be discreet enough to be missed by some users.

The app, on the other hand, knows if camera-interaction is key to the current workflow. It can afford to communicate the presence of the shutter loudly and intrusively, in a way that cannot be missed.

youennf commented 1 year ago

It really depends on the flow and API shape we expose. Let's assume the following:

  1. If camera is shuttered, track is muted.
  2. UA exposes the requestUnmute method we talked about (it might be good to make progress there).

Let's say that a web page is capturing but the camera track is muted. The user clicks on the unmute button in the web page which calls requestUnmute.

In that particular case, the UA could tell the user what to do to unmute the track (unshutter the camera or provide appropriate information for that particular user setup).

I am not sure it is always best to expose muted reasons given they are open ended and/or OS specific. It might make it hard for web pages to handle all cases/all platforms properly.
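
The flow described above could be sketched roughly as follows. This is a toy model under stated assumptions: `requestUnmute` has only been discussed, not specified, so its signature and return value here are guesses; `UaMuteCause`, `PageTrack`, `FakeUaTrack`, and `onUnmuteClick` are hypothetical names. A real method would presumably be asynchronous; it is synchronous here to keep the sketch short.

```typescript
// Hypothetical causes known to the user agent only; never handed to the page.
type UaMuteCause = "camera-shutter" | "ua-paused";

// What the page sees: the muted flag and a hypothetical requestUnmute().
interface PageTrack {
  readonly muted: boolean;
  requestUnmute(): boolean; // assumption: true if the track ended up unmuted
}

// Toy user-agent model standing in for the browser.
class FakeUaTrack implements PageTrack {
  muted: boolean;
  prompts: string[] = []; // stands in for UI rendered by the UA itself

  constructor(private cause: UaMuteCause | null) {
    this.muted = cause !== null;
  }

  requestUnmute(): boolean {
    if (this.cause === "camera-shutter") {
      // The UA cannot open a physical shutter, so it instructs the user.
      this.prompts.push("Open your camera shutter to continue.");
      return false;
    }
    // A mute applied by the UA itself can be lifted directly.
    this.cause = null;
    this.muted = false;
    return true;
  }
}

// Page-side click handler: it learns the outcome, not the cause.
function onUnmuteClick(track: PageTrack): string {
  if (!track.muted) return "camera already on";
  return track.requestUnmute() ? "camera on" : "camera still off";
}

const shuttered = new FakeUaTrack("camera-shutter");
console.log(onUnmuteClick(shuttered), shuttered.prompts);
```

In this shape the platform-specific explanation stays inside the UA, and the page only needs to handle success or failure.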

jan-ivar commented 1 year ago

Part of the privacy appeal (for me) of physical camera shutters is that they're not known to apps, hence apps can't refuse to work until I remove them. Exposing a JS API to the very app the shutter is designed to block seems like it would undermine that. It therefore seems superior to me to let the user agent handle this, without exposing it to the app.

jan-ivar commented 1 year ago

> UA exposes the requestUnmute method we talked about (it might be good to make progress there).

Is this https://github.com/w3c/mediacapture-extensions/issues/39? I'd love to make progress there.

eladalon1983 commented 1 year ago

> apps can't refuse to work until I remove them.

Do you have a concrete example of such an app? (Other than patently malicious apps, where you should not allow cam/mic access to begin with.)

guidou commented 1 year ago

I would say that a shutter that provides an API intends applications to know that state. Real-world experience suggests that VC applications knowing that state would work to the benefit of the user.

Also, if the idea is blocking applications that want to use the camera, we have permissions for that.

jan-ivar commented 5 months ago

> Do you have a concrete example of such an app?

No, because this isn't possible today, and I would like to keep it that way. 😉

If you're asking more generally about examples of apps refusing to work: many video conferencing sites refuse to let users into meetings unless they give up permission to their camera and microphone, even though they don't plan to actively participate in the meeting.

This gives me low confidence that apps will respect privacy screens if made aware of them.

> (Other than patently malicious apps, where you should not allow cam/mic access to begin with.)

I reject the idea of dividing apps into patently malicious ones vs everything else, as this overlooks the role of the user agent to negotiate inherent conflicts between the goals of end-users and those of web applications. There's a lot of gray here. E.g. persisting permission early simplifies lots of things at a cost to privacy, vs. dealing with it later is more costly in complexity.

fippo commented 5 months ago

> many video conferencing sites refuse to let users into meetings unless they give up permission to their camera and microphone, even though they don't plan to actively participate in the meeting.

Anecdotally (I don't think anyone ever backed this with data), most users in a video conference turn on their camera and microphone at some point. After all, the goal of both the end-user and the web app is to facilitate a conversation. This is notably distinct from use-cases like webinars, where you have a large group of silent listeners.

Permission is asked upfront because doing this in the middle of a call is disruptive and goes against the user's desire to speak now. Mind you, the cost of a meeting is determined by the number of users and the number of minutes spent, which is another reason to do this upfront. We are probably talking about at least 15 seconds. Data like that from the old https://medium.com/@fippo/getusermedia-prompts-ea912fba9e5d underestimates the time it takes, since it only looks at the actual getUserMedia call, not the whole UX flow.

We have evidence in public data that users already failed to get the desired result from the existing dialogs (see the relatively large difference between getUserMedia and methods to add a track to the peerconnection during the first half of 2020).

Exposing muted-because-of-shutter will support the end user's goal ("say something"), so 👍