Allow Browsing Contexts to maintain opener member across Browsing Context Groups.

hemeryar commented 2 years ago

We worked out in #6364 what became the COOP:Popups proposal. Implementing that requires preserving limited scripting capabilities between Browsing Context Groups, to avoid overwhelming the agent cluster keying with many new members (like top-level origin, window policy, etc.).

The following summary looks at how we'd achieve that:

Acronyms
BC: Browsing Context
BCG: Browsing Context Group
COOP: Cross-Origin-Opener-Policy

Related Browsing Context Groups

What makes a BCG hold all the possible scripting links is simply the fact that creating a new BCG does not update the new BC’s “opener BC” member. This is only done when “creating an auxiliary browsing context”. Therefore the opener getter returns null. There is no fundamental restriction to cross-BCG scripting other than that. From the window.open caller side, we do not get a reference due to the window type “no opener” in the open algorithm step 13, or the immediately following navigation step that later swaps BCG.

Given that there is no fundamental limitation, one idea would be to have “connected” BCGs, for which we would keep the opener/openee relationship. When calling a WindowProxy getter we would allow complete access only for BCs in the same BCG, and only a subset otherwise.

What we need for that:

Add an “opener” parameter to “Create a new BCG”. If specified, we plug it into the newly created BC’s “opener BC” member.
Modify the “obtain a BC for navigation response” algorithm. Any new means of connecting BCGs would arrive here with some sort of parameter saying that we want to keep an opener. Then we simply plug it in the creation of the new top level BrowsingContext.
Update WindowProxy get and set, to verify that the browsing contexts accessing the properties are same-BCG, otherwise restrict to a list of authorized properties. Not super familiar with this part but probably something similar to COOP reporting checks (step 2).
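A rough way to picture the three steps above is the toy model below. It is only a sketch: the types, `createNewBCG`, `windowProxyGet`, and the allowlist are hypothetical names for illustration and do not mirror the HTML spec's actual data structures or algorithm steps. The allowlist of `closed` and `postMessage` is an assumption.

```typescript
// Toy model only; these shapes are hypothetical, not spec text.
interface BrowsingContextGroup {
  id: number;
}

interface BrowsingContext {
  group: BrowsingContextGroup;
  openerBC: BrowsingContext | null;
  // Stand-in for the properties a real Window would expose.
  props: Record<string, unknown>;
}

let nextGroupId = 0;

// "Create a new BCG" with the proposed optional opener parameter:
// if given, it is plugged into the new top-level BC's "opener BC" member.
function createNewBCG(openerBC: BrowsingContext | null = null): BrowsingContext {
  const group: BrowsingContextGroup = { id: nextGroupId++ };
  return { group, openerBC, props: { closed: false, windowName: "" } };
}

// Assumed list of properties that stay reachable across BCGs.
const CROSS_BCG_ALLOWED: ReadonlySet<string> = new Set(["closed", "postMessage"]);

// Sketch of the extra same-BCG check in WindowProxy [[Get]].
function windowProxyGet(
  accessor: BrowsingContext,
  target: BrowsingContext,
  prop: string,
): unknown {
  if (accessor.group !== target.group && !CROSS_BCG_ALLOWED.has(prop)) {
    throw new Error(`SecurityError: "${prop}" is not accessible across BCGs`);
  }
  return target.props[prop];
}
```

With this model, a popup created through `createNewBCG(openerBC)` keeps a non-null opener but can only read the allowlisted properties on it.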

Key spec links
BCG definition
Create a new BCG
Obtain a BC for a navigation response
Choose a BC by name (what window.open uses)

Quick audit of BCG references
https://docs.google.com/document/d/1tQihYjvkp9IqztlvHn-wYbiiSW8R04-ZP5RPlPgWBkM/edit#

annevk commented 2 years ago

To stress, the idea here is that this is the only "direct" relationship these BCGs would have. Named targeting, agent clusters, and other things scoped to a BCG would continue to be scoped to it. (Of course, if unrelated BCGs host same-origin documents those could communicate through storage-key-based APIs and that would not change.) Additionally, a BCG with this policy would have exactly one TLBC for its lifetime. All created popups result in a new BCG, potentially with a "direct" relationship.

(I personally rather like the idea of making this Cross-Origin-Opener-Policy: popup. With the value meaning that you can act as a popup for someone else, but also get to open popups yourself. And any incoming and outgoing "direct" relationships would be limited to closed and postMessage() communication.)

Edit: note that if we have to swap a BCG with opener, we have to move that opener to the new BCG (unless COOP prevents it). Otherwise it likely wouldn't be viable for a popup service to adopt this policy since it might have to redirect to other sites that haven't yet migrated.

hemeryar commented 2 years ago

Update on this, we're going to go for the COOP:Popups variant. I have drafted an explainer and will begin working on a spec PR. Updated the main post.

annevk commented 2 years ago

Hey @hemeryar, thanks for putting a bunch of work into this! I had a rather open-ended discussion with @mystor and @smaug---- about implementation feasibility and potential alternative approaches which I'd like to summarize here and get your perspective on.

One thing that came up a few times was whether there is a way to solve this without relying on WindowProxy objects. For that it would be helpful to better understand what popup endpoints are relying upon today and what code they could conceivably change. E.g., are they in control of code that pops them up? If they go through a chain of several origins, do the other origins use self.opener.postMessage(), self.opener.closed, and self.onmessage, or is communication predominantly through URL parameters or some such? Who controls that code in the other origins?

Now apart from that, a scenario @mystor brought up that deserves further scrutiny is this:

site.example embeds embed.example and opens auxiliary.example as a popup.
auxiliary.example has the policy and thus creates a new browsing context group. auxiliary.example also embeds embed.example.
auxiliary.example navigates to embed.example, which does not have the policy.

We end up with a situation where neither site.example nor (popup) embed.example have the policy, but they are also in separate agents. And thus site.example's embed.example and popup embed.example do not have synchronous script access to each other, whereas if auxiliary.example had not adopted the policy they would have had that. Is that desirable?

The alternative solution here is what we discussed earlier (and I argued against as it's also messy, but I didn't consider the above scenario), in that instead we introduce a boolean on the top-level browsing context which when true indicates that agent cluster lookup happens on a different map bound to that top-level browsing context for the duration of the boolean being true. That boolean would also be checked by WindowProxy and Location members to suitably restrict them (as well as discard incoming proxied messages). And it would be checked by named targeting which would bypass such top-level browsing contexts. With this kind of setup only embed.example when embedded through auxiliary.example would end up in its own agent. The remaining two (embed.example embedded by site.example and popup embed.example) would end up in the same agent and have script access.
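To make the lookup rule concrete, here is a minimal sketch of the boolean-based alternative. All names (`TopLevelBC`, `BCGroup`, `obtainAgentCluster`, the `isolated` flag) are hypothetical and the model keys agent clusters by site only, which is a simplification.

```typescript
// Toy model only; names and shapes are hypothetical, not spec text.
type AgentCluster = { key: string };
type AgentClusterMap = Map<string, AgentCluster>;

interface TopLevelBC {
  isolated: boolean;              // the proposed boolean
  ownClusterMap: AgentClusterMap; // map bound to this top-level BC
}

interface BCGroup {
  clusterMap: AgentClusterMap;    // the usual group-wide map
}

// While the boolean is true, lookup uses the map bound to the top-level BC.
function obtainAgentCluster(group: BCGroup, tlbc: TopLevelBC, site: string): AgentCluster {
  const map = tlbc.isolated ? tlbc.ownClusterMap : group.clusterMap;
  let cluster = map.get(site);
  if (cluster === undefined) {
    cluster = { key: site };
    map.set(site, cluster);
  }
  return cluster;
}

// The scenario from this comment, replayed against the model.
const sharedGroup: BCGroup = { clusterMap: new Map() };
const sitePage: TopLevelBC = { isolated: false, ownClusterMap: new Map() };
const popupPage: TopLevelBC = { isolated: true, ownClusterMap: new Map() }; // auxiliary.example

const embedInSite = obtainAgentCluster(sharedGroup, sitePage, "embed.example");
const embedInAuxiliary = obtainAgentCluster(sharedGroup, popupPage, "embed.example");

// The popup navigates to embed.example, which does not have the policy.
popupPage.isolated = false;
const popupEmbed = obtainAgentCluster(sharedGroup, popupPage, "embed.example");
```

In this model `embedInSite` and `popupEmbed` resolve to the same agent cluster, while `embedInAuxiliary` gets its own.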

As I understand it this would be "simpler" to achieve, but it also would be more web compatible as only sites that opt-in notice the impact from the policy change.

hemeryar commented 2 years ago

Thanks for the comments @annevk! I'll delegate to @ddworken, who probably has a better idea of how OAuth flows are used in practice, on the first topic.

On the second, let's call that having "dynamic clustering", that is indeed the intended behavior. I think it is consistent with other COOP values, where for example the opener is not restored if you have:

Initial page A
Opens a popup to A + COOP: same-origin
The popup navigates to A

Similarly redirects also explicitly consider COOP. I am not sure which practical cases would rely on such behavior.

Regarding the proposed solution, we'd get "dynamic clustering" for unsafe-none pages only, and not policy setting pages. For example:

Initial page A with COOP: Popups
Opens a popup to A
This popup navigates to A with COOP: Popups

The initial page and the popup have the boolean set to isolate clustering, when they could actually be in the same agent. That might end up being surprising.

I think both have their logic, although I still think since we're going for a COOP policy, keeping the same type of behavior makes sense.

Let me know what you think!

ddworken commented 2 years ago

re: dynamic clustering: I agree with @hemeryar. While the behavior is a bit confusing, I agree it is similar to the general trickiness around redirects with COOP which is something we've generally found to be surmountable when rolling out COOP. And at least to me, it would be surprising if this new COOP value behaved differently in this respect.

> For that it would be helpful to better understand what popup endpoints are relying upon today and what code they could conceivably change. E.g., are they in control of code that pops them up? If they go through a chain of several origins, do the other origins use self.opener.postMessage(), self.opener.closed, and self.onmessage, or is communication predominantly through URL parameters or some such? Who controls that code in the other origins?

Speaking very broadly, I know of 3 common use cases for interacting with popups:

Auth popups (sign-in-with-x and OAuth)
- In some cases, the auth provider does control the code that opens the popup. For example, Sign In With Google is done via a JS library that Google provides. In my experience, these libraries tend to support receiving credentials via either a callback (which is implemented with postMessage) or via query parameters (which is implemented via redirecting to an endpoint that the integrating site has to create). So while Google does control the library and can change it, it doesn't have full control since it has to support the existing way of receiving credentials in a client-side callback[^1].
- In other cases, the auth provider does not control the code. For example, since OAuth is standardized, many OAuth integrations do not use the auth provider's library on the client side. The good news is that most auth flows do not support postMessage, but at least Google and CloudKit support postMessage OAuth flows. These would be very difficult to change (basically requiring the auth provider to reach out to every customer using that flow to ask them to migrate).
Payment popups
- For example, PayPal's checkout integration involves opening a popup. From what I can tell, these tend to use a library controlled by the payments provider so the provider could change the library[^1]. I believe that popup-based 3DS flows also fall into this category.
Opening another page without losing the current one
- Oftentimes sites open a cross-origin popup to a link that the user clicks on. This allows the opener to not lose any of its current state and for it to react once the popup has been closed. In this case the openee has no control over this interaction, and is often left in the tricky position of deciding whether this is something they intend to support or not. This is a surprisingly common pattern I've run into with COOP deployments at Google (oftentimes even with one Google product opening a popup to another cross-origin Google product).

So to summarize:

In some (but not all) cases, the openee does control the JS library
In the cases where the openee controls the JS library, they could in theory change it, but it would be very difficult to do so
In the cases where the openee doesn't control the JS that is opening them, it is even harder

[^1]: Conceivably, the provider could change the library to avoid using postMessage, but it would be a lot more complex. Specifically, they could have the library provide the popup with a unique ID when it is opened. The popup could then go through the auth/payments flow and send an HTTP request with the data and the unique ID to an endpoint the provider controls. The JS running on the integrator's site could then continuously poll to see if the popup is done yet. They would also have to add a timeout/heartbeat of some kind to replace the closed attribute. But this is a pretty complex change that would be very hard to get right at scale (e.g. what about CSP policies blocking requests?).
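To give a feel for the moving parts in that footnote, here is a minimal in-memory sketch of the ID + poll + heartbeat scheme. `FlowStore` is a hypothetical stand-in for the provider-controlled endpoint, time is passed in explicitly instead of using real timers or HTTP, and nothing here reflects how any actual provider library works.

```typescript
// Hypothetical in-memory stand-in for the provider-controlled endpoint.
type PollOutcome =
  | { kind: "done"; data: string }
  | { kind: "pending" }
  | { kind: "timed-out" };

class FlowStore {
  private results = new Map<string, string>();
  private lastHeartbeat = new Map<string, number>();

  // Sent by the popup periodically; stands in for the opener reading `closed`.
  heartbeat(flowId: string, nowMs: number): void {
    this.lastHeartbeat.set(flowId, nowMs);
  }

  // Sent by the popup once the auth/payments flow completes.
  submit(flowId: string, data: string): void {
    this.results.set(flowId, data);
  }

  // Called by the opener-side library on every poll tick.
  poll(flowId: string, nowMs: number, heartbeatTimeoutMs: number): PollOutcome {
    const data = this.results.get(flowId);
    if (data !== undefined) return { kind: "done", data };
    const last = this.lastHeartbeat.get(flowId);
    // An unknown flow, or one whose popup went quiet, is treated as closed.
    if (last === undefined || nowMs - last > heartbeatTimeoutMs) {
      return { kind: "timed-out" };
    }
    return { kind: "pending" };
  }
}

const flowStore = new FlowStore();
flowStore.heartbeat("flow-1", 0); // popup opened with unique ID "flow-1"
const firstPoll = flowStore.poll("flow-1", 1000, 5000);
flowStore.submit("flow-1", "credential-token");
const secondPoll = flowStore.poll("flow-1", 2000, 5000);
```

Even in this reduced form, the opener needs a polling loop, a timeout policy, and a server round trip for what `postMessage()` and `closed` give it directly.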

annevk commented 2 years ago

@hemeryar I'm a bit confused by your reply. Perhaps we should try to talk it through in person again.

> The initial page and the popup have the boolean set to isolate clustering, when they could actually be in the same agent. That might end up being surprising.

I thought that was the plan regardless? That this new policy would always "isolate" you so popups could always run in parallel, even if everything in the popup was the same as you (origin, policy, ...).

Also, I thought a problem with COOP for popups was that it breaks these opener relationships. This would allow preserving them better, no? In particular, if your authentication flow involves multiple parties as it might with corporate customers, not all of those parties might have adopted the policy and getting them to all adopt the policy at once would be hard.

@ddworken thanks for the context. That does suggest to me we're stuck with WindowProxy in some fashion. Though maybe @smaug---- and @mystor have other thoughts. For scenario 3, the opener essentially polls openeeWindowProxy.closed?

hemeryar commented 2 years ago

Hi @annevk, I'm open to discussing in person again if this gets too confusing here :)

About the first point, I thought the initial plan was to key on origin, top-level origin, coi and policy value. So if two pages were same-origin AND same COOP:Popups, they could theoretically communicate synchronously without issue. I think we weren't on the same page here.

If we go for the isolation boolean, we go for something working quite differently from previous COOP values. In particular a website would more or less lose the ability to communicate same-origin with a popup with anything else than postMessage(), because if either the opener or the openee has COOP: Popups we lose that capability. So that could be confusing in that sense.

> Also, I thought a problem with COOP for popups was that it breaks these opener relationships. This would allow preserving them better, no? In particular, if your authentication flow involves multiple parties as it might with corporate customers, not all of those parties might have adopted the policy and getting them to all adopt the policy at once would be hard.

An advantage with this policy is that since it only restricts the opener, as long as the interactions happen via postMessage it will be fine. We don't have something like COOP: Same-origin in the middle completely breaking the link and ruining it for further navigations.

annevk commented 2 years ago

@hemeryar I don't really see how you thought that would be possible given they would have different BCGs (and thus different agents). That would be more of an option with a single BCG + TLBC flag, but I'd prefer giving the browser more flexibility in terms of process allocation if possible.

hemeryar commented 2 years ago

@annevk Yes I'm talking about the initial solution with everything in a single BCG and a WindowPolicy. We discussed back then having the same agent cluster for pages with similar policies and origin.

Regarding how easy the BCG opener vs the BC boolean would be to implement in Chrome, I'm not entirely sure. I've reached out to the Security Architecture team to discuss. So to sum up:

Option A, opener across BCG:

Spec is likely simpler, we do not need to change targeting, or any same-origin check. Already a PR.
In line with previous COOP values. Consistent behavior.
Redirect/navigation chains impact the final page's BCG.

Option B, an "isolate" boolean on BC:

Redirect/navigation chains that have the new COOP value do not impact the final page's isolation.
Spec is more complex, need to explicitly exclude BC's with the boolean from some operations. Same places as the original spreadsheet audit with the origin checks.
Behavior is inconsistent with other COOP values.

annevk commented 2 years ago

@hemeryar for A we do need to change IsPlatformObjectSameOrigin, right? To account for "same agent". That previously was not a possible scenario, but now it somewhat is. (In a more ideal setup we'd more fully explain the objects and thus you couldn't end up with this weird corner case, but we don't have that.)

I also don't understand what you mean with "(in)consistent with other COOP values". Those would break opener relationships across origin boundaries. Presumably that is not a thing we want here? E.g., when I use Google Accounts for my corporate account I end up using at least one cross-site-non-Google-controlled domain. Presumably it would be bad if that ended up breaking the relationship.

hemeryar commented 2 years ago

@annevk Regarding how to achieve the actual restriction of properties for A, in the PR I simply added a line in 7.4.7 [[Get]] ( P, Receiver ) and 7.4.8 [[Set]] ( P, V, Receiver ), verifying that we are in the same BCG. I think that's enough.

About the consistency between COOP values, what I meant is that today, setting COOP on any OAuth page does break the link forever as well, and COOP: Popups would do the same. On the other hand, COOP was not designed to be put on popups in the first place. So I guess it could make sense to have a different behavior here. If you think that would be a big blocker to deployment in the wild, I'm happy to change the spec to have the BC boolean implementation instead :)

I think one important thing to discuss is whether it would be expected to have two same-origin same-COOP page and popup not be able to have full access to each other. Maybe coming back to having another policy would be the better choice in that case? Doing exactly what's been described here, but having a different name to make it explicit that it behaves differently from other COOP values.

hemeryar commented 2 years ago

A bunch of discussions happened offline that I want to sum up:

Anne's mention of navigations in the popup breaking openers for other origins (A opens B, B navigates to C with COOP and back to B. A and B only have restricted opener access) was initially regarded as a minor issue by Google folks, but we missed that platforms like Auth0 rely on that behavior and would have to reach out to every client. That is not acceptable.
We're exploring other options, in particular Anne's proposal of having agent cluster partitioning for COOP: Popups pages.

In any case, I think we can close this particular proposal as it is not relevant anymore.

smaug---- commented 2 years ago

(Somewhat related to this discussion is something I discussed with annevk and mystor: an openerPort which would be preserved through new page loads. One could pass a MessagePort to window.open, yet keep using noopener. The opened window would have an openerPort property for communication with the opener. The two windows would still live in separate BCGs and could use whatever COOP they want. That would be a new thing and require a minor opt-in: communication through openerPort and not opener.)
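As a very rough illustration of the openerPort idea, the toy model below shows a port surviving navigations while the opener reference stays null. openerPort is only an idea floated in this comment, not an existing API; `ToyPort`, `toyOpen`, and `toyNavigate` are hypothetical, and the port is a synchronous stand-in for a real MessagePort, which delivers messages asynchronously.

```typescript
// Synchronous stand-in for one end of a MessageChannel (hypothetical).
class ToyPort {
  peer: ToyPort | null = null;
  readonly received: unknown[] = [];

  postMessage(message: unknown): void {
    this.peer?.received.push(message);
  }
}

function createToyChannel(): [ToyPort, ToyPort] {
  const a = new ToyPort();
  const b = new ToyPort();
  a.peer = b;
  b.peer = a;
  return [a, b];
}

interface ToyWindow {
  url: string;
  openerRef: null;     // noopener: the popup never gets a WindowProxy
  openerPort: ToyPort; // the proposed property
}

// Stand-in for window.open with noopener plus a transferred port.
function toyOpen(url: string, port: ToyPort): ToyWindow {
  return { url, openerRef: null, openerPort: port };
}

// Navigation replaces the document but deliberately leaves openerPort alone.
function toyNavigate(win: ToyWindow, url: string): void {
  win.url = url;
}

const [keptPort, handedPort] = createToyChannel();
const popupWin = toyOpen("https://auth.example/start", handedPort);
toyNavigate(popupWin, "https://idp.example/login");
popupWin.openerPort.postMessage({ status: "signed-in" });
```

After the navigation the message still reaches `keptPort` on the opener side, which is the property that makes the idea attractive for multi-origin flows.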

whatwg / html

Allow Browsing Contexts to maintain opener member across Browsing Context Groups. #7713