Closed rsolomakhin closed 4 years ago
The answer to
Does this specification enable new script execution/loading mechanisms?
is wrong. If you're creating a new kind of browsing context, you are in fact doing that.
Hi, just noting that I've not had a chance to review the proposal yet... still digging myself out from under a ton of TPAC email (so probably not accurate to have me as an author :)) ... just something we threw around during TPAC. Might be too early to be asking for TAG review.
@annevk is right, it does look like the answer to that question should be fixed.
@cynthia @annevk : I've modified the answer:
- Does this specification enable new script execution/loading mechanisms?
Yes, the specification enables opening a new top level context, similar to a popup.
@marcoscaceres you did the initial hard yards to get this started so credited you as an author (I think it's mostly still your proposal plus some extra content on previous efforts and justifications).
I'm happy to help as author... but... just... need... time... best for us to discuss this somewhere else :) Let's pick this up over on @adrianhopebailie's repo. Give me a few days tho :(
Proposed to WICG: https://discourse.wicg.io/t/proposal-modal-window/3982
Hi,
We discussed it during telecon today. It seems based on the questionnaire that security/privacy assessment process is ongoing (thanks for marking it so evidently, by the way, this is appreciated).
We wonder if you plan on updating the document anytime soon (particularly points 2, 5)?
@marcoscaceres : You wrote "Maybe" and "Unsure" for points 2 and 5. Do you have any new thoughts on those two points?
Some privacy concerns (re-raising from email conversations with authors):
1) The spec seems very susceptible to spoofing (I believe the authors were going to revise…). Being able to position the modal at the bottom of the window seems to make a "overlapping with the toolbar / URL bar" spoofing defense impossible. Even more so with the ability to go fullscreen, and for the context being able to resize the modal window.
2) The same issue raised against similar functionality in the Payment Handler API: The spec says that the modal window is a 1p context. This cuts against the privacy improvements being pushed by partitioning storage, Safari's ITP, Brave's storage and cookie, as it allows peer communication between 1ps, and would enable cross site tracking. The modal context should be 3p, and a nested context of the triggering page.
Hope this helps
@snyderp wrt to your first point, the explainer does say:
The browsing context (not the opener) controls the dimensions.
I think we can sharpen this up to be more explicit that neither the opener or child can control the size and position, only the browser. Would that alleviate your concerns, at least regarding your first point?
@adrianhopebailie re the first issue, im looking at item 3. Just want to make sure I'm reading the right text, since that seems explicitly say the opposite (and not be a matter of ambiguous text). Is this the correct text to be reading?
@adrianhopebailie re the first issue, im looking at item 3. Just want to make sure I'm reading the right text, since that seems explicitly say the opposite (and not be a matter of ambiguous text). Is this the correct text to be reading?
Yes, you are reading the right text. We should reword that.
About 1p vs 3p, we should probably fix that too.
About 1p vs 3p, we should probably fix that too.
@marcoscaceres By "fixing", do you mean Modal Window should be treated as a 3P context embedded from the opening page?
I don't think that makes sense because a key value of introducing something like Modal Window is to provide a 1P context that still looks somewhat in context with the original page. The challenge is to find the tricky balance between making it "in-context" enough to not incur the full user friction of a redirect or popup, but distinct from the original page enough that bad actors cannot use it to silently track users. I believe the user activation requirement and adding a URL bar (possibly read-only) to the modal window set a correct balance.
I believe Modal Window will not be very useful if it's made to behave like a 3P iframe. @asolove-stripe voiced a similar opinion in a different thread and can speak about the use cases from Stripe.
I am very much not an expert on browsers and standards, but I do have a couple userland thoughts on why treating this as more similar to a top-level page are useful:
Obviously, I understand concerns about this API being used for problematic cases that trigger security and privacy concerns. But I would love to see if you all can creatively support many of the use-cases above, so that third-party systems in payments and authentication can get rid of the much-worse status quo systems and replace them with secure modal.
@hober and I discussed this. My thoughts:
Authentication seems like an important use case, but "access to third-party services" seems too general to be meaningful, and we weren't sure how the sharing use case would intersect with the Web Share API.
See below... but yes, they would need associated with some kind of action or intent (e.g., "authenticate").
How would we avoid a scenario where users are being disrupted by modal windows showing the moment they interact with the page in any meaningful way?
It could be similar to share (where a user picks a share target), in that options could be shown before the actual modal dialog is presented.
Except, in this case, it would be related to the particular action (e.g., "authenticate").
It might also be helpful to clarify what material is intended to be implemented directly based on this work, and what is intended to be used by other specifications and implemented based on that. And for the latter material, it may also be useful to provide advice to the other specifications that would use this.
Just as reference, privacy aspects of the near-identical functionality in the Payment Handler API are being discussed here. https://github.com/w3c/payment-handler/issues/351
Main point of discussion is 1p vs 3p storage context…
I'm concerned with your (apparent, implicit) design decision to populate native UI (the modal sheet) with arbitrary web content. My concern is similar to those expressed by Marcos, Nick, and Pete. Let me try restating it.
Native UI for existing payment methods (e.g. the Apple Pay sheet in Safari) is difficult to spoof with web content, partly because the native sheet overlaps native browser UI and the web content. Adding a feature to the web which allows sites to populate something that looks like such a sheet with arbitrary web content makes it far easier for web sites to spoof such UI.
Given this concern, I don't think your answer to question 11 of the security and privacy questionnaire is sufficient:
11. Does this specification allow an origin some measure of control over a user agent’s native UI?
Yes, to some degree. It allows a website to create a new modal window context which makes the calling context inaccessible until the modal is closed. To mitigate abuse of this capability it is recommended that only a single modal can be open per parent context.
Nor is your rationale for preferring your current proposal to pop-ups:
Pop-ups are generally locked down and difficult to invoke reliably due to the measures introduced by browsers to counter their abuse[…]
Given their modal nature, we can’t yet think of a good way to abuse modal windows. The assumption being that only a single modal window will be allowed at a time.
In both of these cases, the "spam the user with modals" attack is considered, but the "spoof native UI for phising purposes" is not. In particular, some of "the measures introduced by browsers to counter their abuse" are to counter the use of pop-ups for phishing. Don't you anticipate browsers having to take similar measures for modal windows? If not, why not?
@hober IMO concerns about phishing native UI are unfounded. It seems trivial for browsers to implement modal windows in such a way that they cannot be mistaken for native UI by doing simple things like showing an address bar at the top of the window.
To use @marcoscaceres example, a native UI would replace that bar at the top with something that can only be rendered by the platform.
What is stopping a website from using a pop-up today that renders as close as possible to an Apple Pay sheet? This is arguably even more confusing since the window.open
API provides the caller even more control over the look and feel that we would want to give a modal window caller.
I'm not sure how the answer to Q11 can be changed to provide any information that isn't already there? The proposal is to give the calling site the ability to create a new context but not have any control over how it is rendered. The influence the caller has is the modality of the new context, so by calling the API they effectively make their own context non-interactive.
With PH API the calling context has indirect control over the URL that is rendered inside that context (i.e. it specifies which payment methods are supported and the user selects a payment handler to invoke that supports that payment method) but has no control over the size or position of the window or what is rendered around it to flag to the user that it's a Payment Handler context.
Don't you anticipate browsers having to take similar measures for modal windows? If not, why not?
Not at the API level, no. The way browsers render modal windows should handle this as I describe above.
I'd also note that "native UI" is rare (the Apple Pay sheet is unique to Safari) so it would be useful to enumerate what native UI you consider at risk of being spoofed.
@dbaron my original thinking was that this would simply be a new platform API that would ultimately replace window.open
however it seems that limiting invocation of this feature to specific contexts (payment, auth etc) may be a good way to alleviate some of the privacy and security concerns.
Therefor, perhaps rather than making the API available to Window it could be limited to same-origin workers only in specific contexts.
This would require that APIs used for other use-cases adopt the same pattern as the payment APIs whereby a calling context calls a use-case a specific API (e.g. credentials API) and if the service behind the API is from another origin (as opposed to the UA itself) the worker for that service is invoked. If the worker requires some UI it is able to invoke a modal window.
i.e. As a reminder of how PR API and PH API interact:
@rsolomakhin @marcoscaceres are we on the same page?
@hober IMO concerns about phishing native UI are unfounded. It seems trivial for browsers to implement modal windows in such a way that they cannot be mistaken for native UI by doing simple things like showing an address bar at the top of the window.
[...]
What is stopping a website from using a pop-up today that renders as close as possible to an Apple Pay sheet? This is arguably even more confusing since the window.open API provides the caller even more control over the look and feel that we would want to give a modal window caller.
I'm skeptical of users' ability to make the sometimes-subtle distinctions needed here as to what is "connected" to the browser UI and what is entirely within the page. (I'd be interested to see studies as to how effective it is and whether users can detect good spoofs.) Thus I'm somewhat skeptical of the idea of teaching users that this with this sort of sub-urlbar are safe... although I could also certainly imagine UIs that I would suspect are more likely to be effective at this than the screenshots I've seen (for either modal window or apple pay).
Given that I'm not confident of the effectiveness, I'm a little hesitant to do things that center on teaching users that these things are safe to interact with. (Then again, users seem happy to fill out things like https://twitter.com/davidbaron/status/898389439101018113 that are completely unsafe.)
I'm skeptical of users' ability to make the sometimes-subtle distinctions needed here as to what is "connected" to the browser UI and what is entirely within the page.
I agree but that is not a new issue and this is an opportunity to make the differences less subtle. We already have pop-ups and they are already used for use cases like SSO. As you point out in the link worse solutions are also already widely used for payments.
This use case (login to your bank to authorize a payment/share data) is only going to grow in prevalence. It's the de-facto flow for Open Banking-based payments which are rolling out across the EU and growing in popularity elsewhere: https://www.bis.org/bcbs/publ/d486.pdf
Also, Secure Remote Commerce (SRC), the card industry's initiative to improve online payments is going to move to a profile/wallet based model requiring users to authenticate prior to making a payment.
All of these use case are moving from user supplied credentials on the RP site to cross-origin systems where the user is directed to an AS origin to authenticate and authorize a payment/login/other.
E.g. User will not provide a card number and CVV to the merchant, the merchant will direct the user to their digital wallet where they authorize the payment and be redirected back
E.g. User will not provide a username and password to the website, the website will direct the user to their IdP where they will authroize the sharing of their email address etc. and be redirected back.
These are now the de-facto auth flows for the web but the platform doesn't have the appropriate primitives to allow websites to do this securely for their users.
I see this as an opportunity to REPLACE what we have today (pop-ups) with something that is designed explicitly to solve for these cross-origin use cases AND in doing so consider challenges like phishing and window spam that were likely not considered appropriately when adding features like window.open
.
So I think there are two things that could make me more comfortable here (and I'm aware the second one is a substantial amount of work):
If this were a concept that specific specs (like Payment Handler) could use as a shared mechanism, but not a general purpose API that any website could use. (@alice will write more about this shortly.)
Demonstrating that it's possible to build a UI that is simultaneously (a) a polished and shippable quality UI and (b) where a reasonable portion of users are able to make the correct security distinctions based on it (compared to the best-possible fakes of it). (I think a "reasonable portion" probably at least means a similar portion to the Web's existing security indicators... although it would probably be desirable to do better! It probably also means including users using various accessibility mechanisms.)
[...] a concept that specific specs (like Payment Handler) could use as a shared mechanism, but not a general purpose API that any website could use.
For example, it could be a concept like "browser tab", which refers to something created by the UA, but which cannot be created directly by the web page.
It might be worth fleshing out some pre-requisites for an API to be able to use this mechanism, e.g. that the API must be able to create a list of trusted providers.
Thanks @dbaron and @alice for the feedback!
- If this were a concept that specific spec (like Payment Handler) could use as a shared mechanism, but not a general purpose API that any website could use. (@alice will write more about this shortly.)
This seems reasonable. @agektmr has made a similar suggestion elsewhere and I think it may be an effective tactic for us to validate the benefit of this proposal while buying some time to better understand the security / privacy / UX threats so they may addressed. I think a critical next step to check whether Modal Window API has legs is to find some concrete use cases, whether standalone or as part of API, where this proposal is clearly better than alternative solutions. It seems that a likely area may be cross-origin service coordination, but more investigation is needed to clearly articulate the user problems that are being solved.
- Demonstrating that it's possible to build a UI that is simultaneously (a) a polished and shippable quality UI and (b) where a reasonable portion of users are able to make the correct security distinctions based on it (compared to the best-possible fakes of it). (I think a "reasonable portion" probably at least means a similar portion to the Web's existing security indicators... although it would probably be desirable to do better! It probably also means including users using various accessibility mechanisms.)
These are very fair questions. They are similar to what we want to answer for Payment Handler API as well. Chrome is planning to conduct a UXR in the near future to answer these questions, but in the context of the current payment handler flow. We’ll share any transferable learnings on this thread.
[...] a concept that specific specs (like Payment Handler) could use as a shared mechanism, but not a general purpose API that any website could use.
For example, it could be a concept like "browser tab", which refers to something created by the UA, but which cannot be created directly by the web page.
It might be worth fleshing out some pre-requisites for an API to be able to use this mechanism, e.g. that the API must be able to create a list of trusted providers.
These make sense. Out of curiosity, @alice, did you have any APIs in mind that could benefit from such a "browser tab" concept, if we flesh out the prerequisites?
The Credentials Community Group is keeping an eye on this thread for potential use of a modal window API/feature for the Credential Handler API, a work item of that group.
A video of a demo of that API is here, polyfill code here, and a simple demo that includes a polyfilled UI (like the Payment Handler UI) that we would prefer to become a more phishing-safe modal window is here.
@danyao wrote:
Chrome is planning to conduct a UXR in the near future to answer these questions, but in the context of the current payment handler flow. We’ll share any transferable learnings on this thread.
Looking forward to it!
@danyao
I think a critical next step to check whether Modal Window API has legs is to find some concrete use cases, whether standalone or as part of API, where this proposal is clearly better than alternative solutions.
That sounds like a very promising avenue, and would certainly help provide context for reasoning about the security, privacy, and other UX implications.
It seems that a likely area may be cross-origin service coordination, but more investigation is needed to clearly articulate the user problems that are being solved.
Once again, completely agree!
... did you have any APIs in mind that could benefit from such a "browser tab" concept, if we flesh out the prerequisites?
Not specifically, but I think your point about cross-origin service coordination is extremely salient, and it would be worth looking for APIs covering the use case examples given in the explainer - e.g. sharing and authentication.
It seems like you have plenty of work to go on with, and we probably won't have a lot of feedback to offer until that's done, so we're going to close this for now.
Please comment here or file a new review linking back to this one when you're ready for another look.
Thank you for so thoughtfully considering the feedback we've had so far, and we look forward to more collaboration in future!
こんにちはTAG!
I'm requesting a TAG review of:
Further details:
We'd prefer the TAG provide feedback as: