Allow apps to avoid riskier display-surface types

eladalon1983 commented 1 year ago

Some video-conferencing tools offer admin settings that allow tuning what the user may do. For example, some admins allow sharing of screens when all participants are employees, but restrict to window-sharing in calls that have external participants (under the assumption that accidental leaks would be more costly in such scenarios).

Native apps allow such fine-grained control. Web apps... almost. Web apps can call getDisplayMedia() and pray. If the user chooses to share a monitor, these apps can stop the capture and chide the user for making an improper choice - a bad experience for all involved (remote participants included, as they sit and listen to the user complain about the Web app).
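For illustration, the "call and pray" flow described above might look roughly like the following TypeScript sketch. It relies on the existing displaySurface track setting; the function name enforceNoMonitor and the small structural interfaces are hypothetical, introduced only so that the sketch does not depend on DOM typings.

```typescript
// Minimal structural types (hypothetical) standing in for MediaStream / MediaStreamTrack.
interface VideoTrackLike {
  getSettings(): { displaySurface?: string };
  stop(): void;
}
interface StreamLike {
  getVideoTracks(): VideoTrackLike[];
  getTracks(): { stop(): void }[];
}

// Returns true if capture may continue. If the user picked a monitor,
// stops every track of the stream and returns false.
function enforceNoMonitor(stream: StreamLike): boolean {
  const [track] = stream.getVideoTracks();
  const surface = track?.getSettings().displaySurface;
  if (surface === "monitor") {
    stream.getTracks().forEach((t) => t.stop());
    return false;
  }
  return true;
}

// In a browser, this would be used along these lines:
//   const stream = await navigator.mediaDevices.getDisplayMedia({ video: true });
//   if (!enforceNoMonitor(stream)) { /* tell the user to pick a window or tab */ }
```

The user only learns about the restriction after making a choice, which is the poor experience this issue is trying to avoid.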

It would be better if we allowed Web apps to remove the monitor option from what is offered to the user. We could go with an API shape that's similar to the existing selfBrowserSurface option.

enum MonitorTypeSurfacesEnum {
  "include",
  "exclude"
};

dictionary DisplayMediaStreamOptions {
  MonitorTypeSurfacesEnum monitorTypeSurfaces;
};

Note that I propose a shape that only allows removing monitors, which is the riskiest option. Whether windows should be exclude-able is its own discussion. I think that if we ever decide that these too should be exclude-able, we could specify that one cannot exclude windows without also excluding monitors, lest the user be nudged towards monitors.
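As a rough sketch of how an app might use the proposed option, assuming the shape above is adopted as written: the helper name displayMediaOptionsFor and the ProposedDisplayMediaOptions interface are hypothetical, and the "external participants" rule is just the admin-policy example from the opening paragraph.

```typescript
// Mirrors the proposed MonitorTypeSurfacesEnum.
type MonitorTypeSurfaces = "include" | "exclude";

// Hypothetical local view of DisplayMediaStreamOptions with the proposed member.
interface ProposedDisplayMediaOptions {
  video: boolean;
  monitorTypeSurfaces?: MonitorTypeSurfaces;
}

// Ask the browser not to offer monitors when the call has external participants.
function displayMediaOptionsFor(hasExternalParticipants: boolean): ProposedDisplayMediaOptions {
  return {
    video: true,
    monitorTypeSurfaces: hasExternalParticipants ? "exclude" : "include",
  };
}

// In a browser that supports the option, the result would be passed straight through:
//   navigator.mediaDevices.getDisplayMedia(displayMediaOptionsFor(true));
```

A browser that does not recognise the member would simply ignore it, so the app would fall back to today's behavior.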

If we also specify getSupportedOptions(), apps would be able to know ahead of time whether such shaping is possible, btw.

CC @Coread, who is interested in such a mechanism. (He might have different opinions about the API shape and some finer points, though. Please let us know.)

It should be mentioned that discussions in this vein have arisen fairly often. I believe this particular variant has not been rejected before. The latest discussion on an adjacent topic was #209, which culminated in the aforementioned selfBrowserSurface being specified - a success story which I hope to repeat.

eladalon1983 commented 1 year ago

What do you think, @jan-ivar and @youennf?

Coread commented 1 year ago

It should be noted that this use case is already provided for, in a non-dynamic way, in Chrome with the policy ScreenCaptureAllowedByOrigin. This addition would simply allow this to be dynamic, decided by the application.

A question came up in the WebRTC working group. How do we make this work with the UI such that the user can easily understand what is going on? i.e. in other places they can share their screen, but now they can't. Simply hiding the Entire screen tab might not indicate what is going on. However, disabling the tab, with a tooltip indicating the reason <site> is not allowing you to share your screen would probably provide enough context to the user.

eladalon1983 commented 1 year ago

How do we make this work with the UI such that the user can easily understand what is going on?

This concern was raised by Tim (@steely-glint). While I think that Colin's (@Coread) proposal is reasonable, there's no way for us as spec authors to decide that for any implementation. I think that we need to take a step back and discuss whether UX is even a reasonable question to debate.

Specs levy certain requirements on UX. For example, our spec mandates:

For the newly created MediaStreamTrack, the user agent MUST NOT capture the prompt that was shown to the user.

So I acknowledge that UX is somewhat within the realm of specs, but I don't think the concern Tim raised falls within that realm.

Further:

If a developer is concerned that the UX employed by a browser is confusing, then that developer can avoid invoking the new API and file a bug.
If a browser implementer is unable to come up with what they believe is a reasonable UX, they are free to not implement this preference, which is intentionally structured as an optional hint, as was selfBrowserSurface.

@jan-ivar, you raised the question of default value. I think we should specify it as "include", which is the default behavior on all existing implementations. I'd also be open to not specifying a default, as we have done with a few similar preferences.

@youennf, you raised the question of dynamic switching in macOS. I think it'll be beneficial for Safari to have the flexibility to block dynamic switching from a window to a screen if the app requests that. But you're not compelled to, as this is a preference/hint. Wdyt?

dontcallmedom-bot commented 1 year ago

This issue was discussed in WebRTC June 2023 meeting – 27 June 2023 (Issue #261 Allow apps to avoid riskier display-surface types)

steely-glint commented 1 year ago

I think UX is within the realm of the spec to this extent: I need to be convinced that it is possible to create a UX that is not confusing to users and developers. I don't need the UX to be in the spec, but I need to be sure it is at least possible - which I am not yet.

We have examples of 'unpredictable' APIs which depend on complex layered rules that turn out badly for the user.

I am specifically thinking of the way that Chrome does autoplay here - the rules are so opaque that it is a continuous surprise to users and developers when/if it works. Which (of course) disadvantages sites with lower footfall.

eladalon1983 commented 1 year ago

I need to be convinced that it is possible to create a UX that is not confusing to users and developers. I don't need the UX to be in the spec, but I need to be sure it is at least possible - which I am not yet.

Chrome could show the monitor-picking tab, but remove the actual monitor thumbnails, replacing them with some clarifying text.

But the fact that this is possible does not mean that Chrome would choose to go with this option, let alone that other browsers would find an analog. Such guarantees cannot be made to spec authors.

I think it's evident that something clear is possible, and whether something better could be accomplished, or whether the right trade-off with other considerations is achieved, is out of scope for us. Let each browser choose whether and how to implement this preference, and let each developer decide whether and when to invoke it.

steely-glint commented 1 year ago

I think that is exactly the sort of UX that will prompt a flood of calls to the support desk saying "screen share stopped working". It gives the user no clue why this isn't working the way it did 5 mins ago (and the reason is that a new staff member has joined the call but isn't on IT's list of staff yet).

eladalon1983 commented 1 year ago

It's up to the Web application to wield new APIs intelligently.

youennf commented 1 year ago

Looking at the issue, there seem to be two concerns:

Users may pick the wrong surface. My understanding is that a positive preference (prefer window, prefer tab) will often be sufficient. We talked about this in the past, maybe we should revisit this?
Users may dynamically switch to the wrong surface. Autopause is probably sufficient here?

At least on macOS, the OS is more and more in charge of the sharing UX. This is good since it means there will be the same UX whether using a native app or a web page. This gives consistency and is an answer to @steely-glint's concerns. But this is relatively new territory. In general, it is easier/safer for us to design the right APIs once OS support is ironed out.

eladalon1983 commented 1 year ago

Users may pick the wrong surface. My understanding is that a positive preference (prefer window, prefer tab) will often be sufficient. We talked about this in the past, maybe we should revisit this?

That discussion was concluded and the PR was merged.

I don't think a positive preference for window indicates anything about willingness to also accept a monitor. I think we need a clear signal to exclude monitors, independent of other preferences.

Users may dynamically switch to the wrong surface. Autopause is probably sufficient here?

Why put obstacles in the users' path, by offering them an option that will be rejected by the Web application? Note that many users do not clearly distinguish the (a) Web application, (b) browser and (c) operating system. It's better when all work in concert on behalf of the user.

At least on macOS, the OS is more and more in charge of the sharing UX. This is good since it means there will be the same UX whether using a native app or a web page.

Without a mechanism by which an app could relay its preferences to the operating system through the browser, none of the changes made by macOS will...

...protect the user from the pitfall of choosing a source which the app will reject.
...protect companies from leakage of private information through employee-error.

But this is relatively new territory. In general, it is easier/safer for us to design the right APIs once OS support is ironed out.

Future OS APIs notwithstanding, there are the current macOS, Windows and Linux APIs, and the proposed preference would help with all of them.

jan-ivar commented 1 year ago

I think this proposal is a good idea. While I normally don't like restricting user choice, in this case, the restriction seems well-motivated to remove an unsafe choice, which I think outweighs my other concerns. So I'm supportive of this.

youennf commented 1 year ago

Given that the proposal is only a hint, web sites should in theory still need to check whether the surface is a screen or not. They will probably not do so if the browsers that ship this property first forbid users from selecting a screen (instead of preventing users).

I still prefer that we let websites express their interests positively, by stating that they are interested in 'window, tab', which means no interest in screen. UAs are then already allowed to put some additional brakes on the user selecting a screen.

If we are no longer talking about a hint but a requirement to remove screen, this is a different discussion.

eladalon1983 commented 1 year ago

They will probably not do so if the browsers that ship this property first forbid users from selecting a screen (instead of preventing users).

Could you please explain the distinction between "forbid" and "prevent" in this context?

I still prefer that we let websites express their interests positively, by stating that they are interested in 'window, tab', which means no interest in screen.

That API shape would mislead developers into expecting that they can influence the choice offered to the users beyond what they actually can. Assume valid options are ['browser', 'window', 'monitor'] and the developer specifies ['monitor']. This is too risky; Chrome and Firefox[*] will ignore it and offer the user tabs and windows too. Won't developers find this surprising and start filing bugs? My experience with the Chromium bug tracker says yes...

If we are no longer talking about a hint

We are most definitely talking about a hint, as in the case of the previous options we've introduced of the "include"/"exclude" shape, after which this is modelled. See for instance selfBrowserSurface, where the spec outright says: "The user agent MAY ignore this hint."

[*] Or am I wrong, @jan-ivar?

eladalon1983 commented 1 year ago

Hi @youennf, have you had time to consider this?

youennf commented 1 year ago

Could you please explain the distinction between "forbid" and "prevent" in this context?

"Forbid" is the UA not showing the user the option at all. "Prevent" would mean the web app stopping the track if the surface is not of the right type.

Won't developers find this surprising and start filing bugs?

We could make it clear that, since this is a hint, ['monitor'] should be ignored if it is the sole value.

youennf commented 1 year ago

I think it seems good to spec, provided there is clear enough guidance that web sites need to check the result as this is only a hint.
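To make the "check the result" guidance concrete, here is a minimal sketch of an app that passes the proposed hint and still verifies what was actually captured. It assumes the monitorTypeSurfaces shape proposed at the top of this issue; captureWithoutMonitor, CapturedStream and the injected getDisplayMedia parameter are hypothetical names used so that the sketch runs outside a browser.

```typescript
// Minimal structural types (hypothetical) standing in for MediaStream / MediaStreamTrack.
interface CapturedTrack {
  getSettings(): { displaySurface?: string };
  stop(): void;
}
interface CapturedStream {
  getVideoTracks(): CapturedTrack[];
  getTracks(): { stop(): void }[];
}
type GetDisplayMedia = (options: {
  video: boolean;
  monitorTypeSurfaces?: "include" | "exclude";
}) => Promise<CapturedStream>;

// Requests capture with the hint set, then checks the outcome,
// because a user agent is free to ignore the hint.
async function captureWithoutMonitor(
  getDisplayMedia: GetDisplayMedia
): Promise<CapturedStream | null> {
  const stream = await getDisplayMedia({ video: true, monitorTypeSurfaces: "exclude" });
  const surface = stream.getVideoTracks()[0]?.getSettings().displaySurface;
  if (surface === "monitor") {
    // The hint was not honored: stop the capture and report failure to the caller.
    stream.getTracks().forEach((t) => t.stop());
    return null;
  }
  return stream;
}

// In a browser, the real API would be injected:
//   captureWithoutMonitor((o) => navigator.mediaDevices.getDisplayMedia(o));
```

With browsers that honor the hint, the fallback branch should never run; it exists only for user agents that ignore the hint.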

w3c / mediacapture-screen-share
