w3c / picture-in-picture

Picture-in-Picture (PiP)
https://w3c.github.io/picture-in-picture

Should include / require spoofing protections #182

Open pes10k opened 4 years ago

pes10k commented 4 years ago

As implemented in Chromium browsers, there is currently no browser chrome to cleanly distinguish a PiP window from other, non-browser windows. Further, because the PiP window floats over everything else, it looks privileged and is not obviously tied to a browser / frame / context.

This opens up the ability for websites to spoof privileged system dialogs, along with other non-browser applications, which the user may trust to a different degree than a browser page.

The current restriction to <video> mitigates the worst cases (e.g., "MacOS needs your password" style forms), but 1) it seems possible to spoof / confuse with just video events, and 2) the spec says that arbitrary HTML may be forthcoming.

The spec should require implementors to add unambiguous, un-page-reachable decoration to the PiP window, so that users can clearly know 1) that it's a PiP window and not any other kind of dialog, and 2) which browsing context owns it.

Apologies if there are already such affordances in the text, though on two read-throughs I don't see anything in this area.

beaufortfrancois commented 4 years ago

Thanks for your feedback!

The only text we have is quite short indeed.

The API applies only to HTMLVideoElement in order to start on a minimal viable product that has limited security issues. Source: https://w3c.github.io/picture-in-picture/#security-considerations

@mounirlamouri We've talked a lot about spoofing when thinking about arbitrary HTML in Picture-in-Picture. What are your thoughts on elaborating on this topic in the current spec?

pes10k commented 4 years ago

Sounds good. Also, I appreciate that limiting to

mounirlamouri commented 4 years ago

I don't think at the moment these concerns are likely to lead to something concrete. The website wouldn't receive mouse events, hover, focus, or blur on the PiP window, thus making a lot of potential abuse very hard to do. When the user hovers over the UI, the UA would show its own media controls, making it very hard for a website to make it look like something it isn't.

We can add language to detail why we think this is entirely fine, but I would rather not add any requirements like attributions that do not feel needed at the moment.
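To make the limited surface described above concrete, here is a minimal sketch of what a page can observe about a PiP window: its size, plus a `resize` event. `PipWindowLike`, `PipObservation`, and `observePipWindow` are hypothetical names introduced only for illustration; the structural type mirrors just the members of the browser's PiP window object that the sketch touches.

```typescript
// Hypothetical structural type: only the members this sketch uses.
// In a browser, the object that video.requestPictureInPicture()
// resolves with is expected to have this shape.
interface PipWindowLike {
  readonly width: number;
  readonly height: number;
  addEventListener(type: "resize", listener: () => void): void;
}

// Hypothetical record of one size the page was able to read.
interface PipObservation {
  width: number;
  height: number;
}

// Records every size the page gets to see for the PiP window.
// No click, hover, focus, or blur listener is attached here, because,
// per the comment above, the page does not receive those events
// for the PiP window.
function observePipWindow(win: PipWindowLike): PipObservation[] {
  const seen: PipObservation[] = [{ width: win.width, height: win.height }];
  win.addEventListener("resize", () => {
    seen.push({ width: win.width, height: win.height });
  });
  return seen;
}
```

In a page, the argument would be the value resolved by `video.requestPictureInPicture()`; the returned array then grows each time the user resizes the window.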

guest271314 commented 3 years ago

A PiP window can be spoofed by capturing it with navigator.mediaDevices.getDisplayMedia({video: true}) as an "Application", then programmatically creating a new window with the scroll, title, and navigation UI unset, placing focus on that new window, and displaying the captured stream of the original PiP window in a <video>. Both the new window and the <video> element can receive events.

@pes10k

One ironic "fingerprinting" vector of PiP in Chromium is its implementation of the recommendation of a maximum PiP window size, which means that an "attacker" can determine the maximum screen width and height of the device by expanding the PiP window to its maximum dimensions. Perhaps that is not a concern, as that is possible using other means, yet it is still a way to "fingerprint" the device's maximum screen dimensions using PiP. That particular method of "fingerprinting" is not necessarily possible in Mozilla's implementation of PiP, which does not follow that recommendation.

mounirlamouri commented 3 years ago

screen.height and screen.width will give you the screen size. I don't think the Picture-in-Picture API is exposing anything new.
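The argument can be sketched as a simple comparison. This is an illustration only, assuming a PiP window can never be larger than the screen it is on; `Size` and `pipAddsScreenInfo` are hypothetical names, and the inputs would come from `screen.width` / `screen.height` and from the largest PiP window size the page has observed.

```typescript
// Hypothetical value type for a pair of pixel dimensions.
interface Size {
  width: number;
  height: number;
}

// Returns true only when the largest PiP size the page observed exceeds
// what the screen dimensions already report, i.e. when the PiP window
// would tell the page something the reported screen size did not.
// Assumption: a PiP window cannot be larger than the physical screen.
function pipAddsScreenInfo(reportedScreen: Size, largestPipSeen: Size): boolean {
  return (
    largestPipSeen.width > reportedScreen.width ||
    largestPipSeen.height > reportedScreen.height
  );
}
```

Under that assumption the function returns false whenever the reported screen size is accurate, which matches the point that the API exposes nothing new in that case.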

guest271314 commented 3 years ago

@mounirlamouri Yes, that is true, as indicated

Perhaps that is not a concern, as that is possible using other means, yet still a way to "fingerprint" device screen maximum dimension using PiP.

It is also true that the PiP controls distinguish the PiP window, which is a potential guard against the PiP window being spoofed, unless the "Back to tab", "Play", "Pause", "Close", and "Play from the beginning" controls are themselves spoofed. In that case, clicking on the spoofed controls could be an attack vector: only one click is needed to launch the exploit, so by the time the user clicks the spoofed controls it is too late.

I am not sure how a PiP window UI could be constructed such that it could not be spoofed.

Chromium and Firefox each provide their own UI notification for navigator.mediaDevices.getUserMedia() and navigator.mediaDevices.getDisplayMedia(). The getDisplayMedia() notifications are conspicuous in both browsers as long as fullscreen is not entered; once it is, how conspicuous the notifications remain can be inconsistent across implementations.

If the PiP controls themselves are consistent and reliable, then this issue should not be a concern. However, controls are not necessarily reliable in all cases (https://bugs.chromium.org/p/chromium/issues/detail?id=1107027), and specifications do not ordinarily mandate specific UI. If the controls were in fact considered reliable, there would be no need for this issue or for the recommendation to restrict PiP window dimensions. Since no reason has ever been given for that language in the specification, the only conclusion that can rationally be drawn is that those concerns are the same as the ones raised in this issue, along with perhaps the PiP window becoming unresponsive when covering the entire screen, a state that can be exited and returned to with CTRL+ALT+F2 and CTRL+ALT+F7 on Linux.

One option to address the concerns at OP would be to specify that a unique watermark - that is not recorded if the PiP window is captured by MediaRecorder - be placed in a corner of PiP window, which scales when the PiP window size changes so that the user is always aware that PiP window is open and genuinely the PiP window the user opened.

The watermark can be as simple as the user-agent text or timestamp of the initiated PiP session using user-selected font for disambiguation and certainty; a user-supplied image settable at Settings; or default icon in the configuration folder of the browser.
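A rough sketch of that watermark idea, under stated assumptions: nothing here is specified or implemented by any UA, `WatermarkSpec` and `buildWatermark` are hypothetical names, and the scaling rule (1/25 of the window width, with a 10px floor) is an arbitrary choice made only to show the label scaling with the window.

```typescript
// Hypothetical description of the watermark a UA might draw.
interface WatermarkSpec {
  text: string;
  fontSizePx: number;
}

// Builds a watermark from a user-agent label and the time the PiP
// session started, sized relative to the current PiP window width.
// The 1/25 ratio and the 10px minimum are illustrative assumptions.
function buildWatermark(
  userAgentLabel: string,
  sessionStart: Date,
  windowWidthPx: number
): WatermarkSpec {
  const text = `${userAgentLabel} PiP ${sessionStart.toISOString()}`;
  const fontSizePx = Math.max(10, Math.round(windowWidthPx / 25));
  return { text, fontSizePx };
}
```

A UA would call this again on every resize so the label tracks the window size; keeping the watermark out of any captured stream, as suggested above, is a separate problem that this sketch does not address.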

If this issue is not really a concern, then neither should a restriction on PiP window size be a concern, as the same rationale either applies to both or applies to neither.

marcoscaceres commented 3 months ago

So, it can be made to confuse users, but it's not interactive in any meaningful way. To @mounirlamouri's points above, it seems like adding such UI requirements would be too extreme. That's not to say that a browser couldn't put such mitigations in place to protect its own users, but it seems onerous to impose that on every UA.

I'm personally not sure there is anything actionable here. We can mention that it could be used to spoof things, but it's not interactive so there's not much an attacker could do.

chrisn commented 3 months ago

the spec says that arbitrary HTML may be forthcoming

This won't be in scope for this API, so we'll update the spec and/or explainer as needed to clarify.

pes10k commented 3 months ago

the spec says that arbitrary HTML may be forthcoming

I agree that if this capability is removed (and not forward promised) then most of the concern goes away here, and I'm fine closing the issue once the change is in.