Open jan-ivar opened 3 years ago
It is unclear to me why you'd want to limit ID-exposure to cross-origin isolated sites. Making it opt-in seems sufficient to me. Why not decouple your proposal for isolated-browser from that of a capture-ID?
I think they're coupled, because we haven't sufficiently isolated (no pun intended) the use case of sharing and controlling an app from that of sharing and controlling its tab and everything in it (including its bfcache and anywhere new you go).
This is oversharing. And it can bite people, because it deceptively feels like sharing an app in the moment, even to tech savvy users who should know better. Adding slide controls on top of this would further lull users into this misconception.
But this car has no seat-belts: hit the back button one too many times and you've accidentally shared what you were browsing right before the meeting to the entire meeting (and if it was a WG meeting, it is now on YouTube). Why is this acceptable?
To me, it seems we should be able to safely offer, say — and I'm just picking products out of a hat here — the ability to share a Google Slides presentation in Google Meet, without the risk of accidentally sharing anything else.
With this explained, I'll amend this proposal to consider freeze-and-prompt on "safe" cross-origin navigation as well.
controlling an app
I think it pays to be exact when discussing potentially-alarming concepts like "control" of an app. The capturing app does not "control" the captured app. It is sending it messages. The captured app is not compelled to react to these messages, or even to read them. When a captured tab is navigated, no power is coercing the newly loaded application to continue the collaboration that the previous app started. I think we should keep it crisp and clear in our minds that there is no concern of residual control surviving navigation of the captured tab.
We can't share an app, so we share the next best thing: its tab.
This is an entirely different, pre-existing issue, and I think it should be addressed separately.
Adding slide controls on top of this would further lull users into this misconception.
This is a valid concern, but I do not find it compelling. The reverse could also be argued - that by keeping the user away from the captured tab, unintended navigation by clicking back too often becomes far less likely. By establishing collaboration and keeping the user focused on the capturing tab, we emulate the properties of app-capture and reap its benefits. These benefits might be more firmly established once we also provide actual app-capture, but I see no reason to delay progress here.
and I'm just picking products out of a hat... Google Slides... Google Meet...
Google gives away too much swag. Our hats are everywhere. 😉
controlling an app
Apologies if I was unclear. I'm describing user experience. When everything works, users are controlling what appears to be a shared app, and will form a mental model around what they are sharing based on that (they click "Present", find their presentation by name or thumbnail, share "it", and control "it").
This is a desirable experience, and I don't fault services for wanting to provide it. But I'd fault us for providing it without first addressing the security misconceptions underlying it:
The security just isn't there to provide this experience safely just yet. The only tip-off to the lack of seat-belts is some tech prose in the prompt, and maybe some pause in a user's mind about why their choice seems buried in the picker. This is still cooking with a blowtorch, and I think it's premature to add a kitchen timer to it.
The reverse could also be argued - that by keeping the user away from the captured tab, unintended navigation by clicking back too often becomes far less likely.
An interesting argument, but this might also make users think the audience sees only the actions they make in the VC tab, not the target tab. Unless you're planning to freeze direct interaction with the target tab during capture, I think this is a net loss for addressing oversharing concerns.
Also, convergence around site-isolation and capture opt-in is far from certain, so I think it is good to be able to identify features that can drive adoption. Especially when they fit.
The security just isn't there to provide this experience safely just yet.
Citation needed.
An interesting argument, but this might also make users think the audience sees only the actions they make in the VC tab, not the target tab.
Using Capture Handle, VC applications could immediately detect when navigation occurs away from a trusted capture-target. VC applications could use this knowledge to pause remote-sharing and prompt the user to confirm they'd still like to share. This is a partial solution that could be delivered in 2021. You have presented two counter-proposals, getViewportMedia-based-Web (you estimated 2023) and this proposal (I am guessing >=2022 at the earliest). Out of concern for users' well-being, I think we should proceed with Capture Handle as quickly as possible. Surely you agree?
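To make the pause-and-prompt idea concrete, here is a minimal sketch of the decision a VC application might make when the capture handle changes. It assumes the handle shape from the Capture Handle proposal (an optional origin plus an opaque handle string, read through getCaptureHandle() on the track). The names decideOnHandleChange, onCaptureHandleChange, TrackLike, trustedOrigins and promptUser are hypothetical and exist only for illustration, and the event wiring in a real browser is omitted.

```typescript
// Shape of the handle as described in the Capture Handle proposal: the
// captured page's origin (if it chose to expose it) and an opaque string.
interface CaptureHandle {
  origin?: string;
  handle?: string;
}

type Decision = "continue" | "pause-and-prompt";

// Hypothetical helper: keep sharing only while the captured page still
// identifies itself with an origin the VC app trusts.
function decideOnHandleChange(
  trustedOrigins: ReadonlySet<string>,
  current: CaptureHandle | null,
): Decision {
  if (
    current !== null &&
    current.origin !== undefined &&
    trustedOrigins.has(current.origin)
  ) {
    return "continue";
  }
  // No handle, or an unknown origin: the tab navigated somewhere else.
  return "pause-and-prompt";
}

// Minimal track-like surface (an assumption) so the sketch runs outside a browser.
interface TrackLike {
  enabled: boolean;
  getCaptureHandle(): CaptureHandle | null;
}

// Hypothetical handler, meant to be called whenever the handle changes.
function onCaptureHandleChange(
  track: TrackLike,
  trustedOrigins: ReadonlySet<string>,
  promptUser: () => void,
): Decision {
  const decision = decideOnHandleChange(trustedOrigins, track.getCaptureHandle());
  if (decision === "pause-and-prompt") {
    track.enabled = false; // a disabled video track stops sending real frames
    promptUser(); // ask whether the user still wants to share
  }
  return decision;
}
```

Note that this runs in the capturing site, so it only protects users of sites that implement it, and only after the handle change is observed.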
Using Capture Handle ... VC applications could use this knowledge to pause remote-sharing and prompt the user to confirm they'd still like to share. ... This is a partial solution ...
That's interesting, but I'd like to see this solved in the user agent, and not leave it up to individual sites to get right.
What if the browser did this instead? Would you be on board with that?
(you estimated 2023)
I did not. I picked 2023 in my crystal ball slide as a frame of reference far enough out to avoid arguments over claims like "All major presentation websites use site-isolation and opt into html capture". The point of such slides is to induce hindsight: imagine this has happened, then review whether interim proposals were steps in the right direction or shortsighted or even counter-productive.
I feel we've made good progress on defining site-isolation and capture opt-in over the last few months, so I'm bullish on it being spec'able and implementable long before then, if we can agree on where to apply it.
That's interesting, but I'd like to see this solved in the user agent, and not leave it up to individual sites to get right.
What if the browser did this instead? Would you be on board with that?
I've proposed some similar things in the past. For example, I've suggested a pauseCaptureOnNavigation constraint. So I would be interested in discussing such solutions. But the devil is in the details. Importantly, since we're not going to deprecate vanilla tab-capture through getDisplayMedia, I am still interested in solutions that target that flow. Even partial solutions, since this will be the common tab-capturing use-case for users for some time to come, not isolated-browser.
I am on board with discussing both short- and long-term solutions, so long as the short-term is not ignored on account of aspirational goals for the long-term. I am on board with decoupling Capture Handle from isolated-browser.
I feel we've made good progress on defining site-isolation and capture opt-in over the last few months
We've been discussing getViewportMedia since 2020-10-07. Today marks 8 months. Yet we still have some non-trivial outstanding issues on which the discussion is progressing very slowly. I am concerned 2023 might be an optimistic estimate. Honestly.
Use case
A participant in a video conference (website A) can safely present slides in a presentation-website B open in another tab or window, without exposure to privacy risks from oversharing, or threats to the web’s same-origin security model. This presenter can advance (next/previous) slides from within A or B.
Safety must be the default, and to the extent a user agent still offers unsafe choices (to pick an unsafe source or follow an unsafe link in a safe presentation) they are clearly distinguished with warnings from the user agent that they carry additional risks over and above the safe ones. Unsafe choices do not contain integration features.
Proposal
Add a new display-surface:
“isolated-browser” display surface is the rendered form of a browsing context where the current top-level document is site-isolated and has opted into capture. Capture survives navigation to other pages that are also site-isolated and opted into capture. But upon any other navigation, the user agent MUST freeze capture on the last safe frame, and MAY prompt the user with a warning and option to allow capture of the unsafe content. Capture will resume once the browsing context is navigated back to safety, unless the user answered affirmatively to the prompt, in which case the source turns into a “browser” display surface.
Sources of this type MUST be given preferential placement in getDisplayMedia()’s picker over their unsafe counterparts.
Site-isolated pages that have opted into capture, can get an id that matches the id exposed on “isolated-browser” tracks (but unlike the 🔮 slide, there's no requirement to register for preferential placement). This is to ensure that sites opt into building web-integrated capture-related features on this safer foundation, instead of the unsafe one we have today.
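The freeze-and-prompt behavior described above can be read as a small state machine. The following is only a sketch of that reading, not proposed spec text: the type and function names (Capture, CaptureEvent, step) are hypothetical, and "safe" stands in for "site-isolated and opted into capture".

```typescript
type Surface = "isolated-browser" | "browser";
type CaptureState = "live" | "frozen";

interface Capture {
  surface: Surface;
  state: CaptureState;
}

type CaptureEvent =
  // safe = the new top-level document is site-isolated and opted into capture
  | { kind: "navigate"; safe: boolean }
  // the user answered affirmatively to the user agent's warning prompt
  | { kind: "user-allows-unsafe" };

// Hypothetical transition function for one capture session.
function step(c: Capture, e: CaptureEvent): Capture {
  // Once the source has turned into a plain "browser" surface, it behaves
  // like ordinary tab capture and is no longer frozen by navigation.
  if (c.surface === "browser") {
    return c;
  }
  switch (e.kind) {
    case "navigate":
      // Safe navigation keeps (or resumes) capture; any other navigation
      // freezes on the last safe frame.
      return { surface: "isolated-browser", state: e.safe ? "live" : "frozen" };
    case "user-allows-unsafe":
      // The prompt is only meaningful while capture is frozen.
      return c.state === "frozen" ? { surface: "browser", state: "live" } : c;
  }
}
```

For example, starting from a live “isolated-browser” capture, an unsafe navigation yields a frozen capture; navigating back to a safe page resumes it, while accepting the prompt instead turns the source into a “browser” surface.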