Consider allowing `document.domain = null`

whatwg / html

HTML Standard

https://html.spec.whatwg.org/multipage/

Other

7.88k stars 2.58k forks source link

Consider allowing `document.domain = null` #2757

Closed mikewest closed 2 years ago

mikewest commented 7 years ago

Folks generally use document.domain to relax the same-origin policy, allowing multiple documents in the same eTLD+1 to reach into each other's DOM and muck around. It might be reasonable to use the same mechanism to harden the policy, disallowing access even when it would otherwise be granted.

That is, something like document.domain = null could cause a document's origin to fail all "same origin-domain" checks that relied upon the domain check in step 2.1 of that algorithm.

I'm prototyping this in https://chromium-review.googlesource.com/c/535536/ to let some folks on Google's ads team play around with it to see if they can harden some of DoubleClick's ad frames. It might also be useful for other projects that run un- or quasi-trusted user content on shared domains.

WDYT?

/cc @mozfreddyb @ckerschb @dveditz / @johnwilander / @travisleithead

mozfreddyb commented 7 years ago

I understand why allowing website to move themselves to a unique origin is useful to you. But this without a reload seems brittle and error prone. To ensure proper isolation, you'd have to freeze the world and rewire all the things like document.cookie, IndexedDB etc., wouldn't you?

I also wonder what @bzbarsky and @annevk have to say about this.

mikewest commented 7 years ago

I understand why allowing website to move themselves to a unique origin is useful to you.

Moving pages into a unique origin is not what I'm suggesting; we can do that very well today with Content-Security-Policy: sandbox. I'm suggesting instead that we should allow documents to set their domain attribute to something that can't match any other document's domain.

But this without a reload seems brittle and error prone.

document.domain already produces a runtime check; Firefox is actually much better at processing its behavior than Chrome/WebKit due to the membrane architecture you're using for DOM access checks.

I agree with the implication that we wouldn't have introduced document.domain to the platform if we were designing it today. Since it already exists, allowing it to be used as a hardening tool (instead of its status-quo role as a weakening mechanism) seems like a reasonable extension.

To ensure proper isolation, you'd have to freeze the world and rewire all the things like document.cookie, IndexedDB etc., wouldn't you?

Yes. For the specific use cases I'm targeting, these aren't important, but I grant you that complete isolation would require significantly more work than the change proposed here. That's what @estark37 is working towards in https://wicg.github.io/isolation/.

This change is significantly smaller than that one will end up being, and does not encompass nearly the same set of threats. It blocks DOM access between same-origin resources that want to distrust each other. That seems like a reasonable primitive to provide.

mozfreddyb commented 7 years ago

OK, to clarify, you want to break same-originness with other webpages on the same origin but the origin should not really be unique? You want an origin that sort-of serializes to scheme://null:port which will be shared with...whom?

You said this will be DOM only, so fetches from the page towards its (previous) origin are supposed to remain same-origin? The line we try to draw seems a bit fuzzy to me. I understand you want the linked same origin-domain check to fail and everything that does a same origin check to succeed? Maybe we can explore the use case in a greater detail to get to the core of it.

mikewest commented 7 years ago

Maybe we can explore the use case in a greater detail to get to the core of it.

Here's a concrete case in which this would be helpful:

A embeds two frames from B, both of which include content that's not as trustworthy as it ought to be. One of the frames from B turns out to be malicious, oh noes! Today, it can takeover any other frames from B by walking the tree exposed through window.top.frames. It would be nice if B could make that more difficult by forcing the same origin-domain checks that enable DOM access, navigation, etc. to fail by executing document.domain = null; before loading in untrusted content.

The nice thing about building on top of document.domain is that it doesn't change the actual origin of the page. It still has storage and cookies (unlike sandbox), and it can still make requests to the web which have a non-opaque origin for the purposes of CORS.

bzbarsky commented 7 years ago

It doesn't sound like the proposal is to change anything about "same origin" checks, just "same origin-domain" ones, right? In particular, the serialization of the origin (which is not affected by document.domain right now) would not change in any way. This seems fairly simple to implement, fwiw.

That said, the protection it would give you would be pretty limited. Looking at the use cases listed in https://github.com/whatwg/html/issues/2757#issuecomment-308436651:

that enable DOM access

This part would in fact work. This is the strongest argument for doing this. That said, this is very much limited to takeovers of already-existing pages. The thing that did document.domain = null could still do XHR to any URI from B and exfiltrate arbitrary data. It would have full access to storage and cookies. It could also do a blob: load and it's not entirely clear to me what that actually does in UAs in terms of origins (i.e. whether the blob: is same origin-domain with the thing that loaded it or with other pages of B or what). The spec is worse than useless on that point; I filed https://github.com/whatwg/html/issues/2759 on that.

navigation

All the navigation checks are for "same origin", not "same origin-domain", so not affected by this.

etc.

I'd really like to know whether there are other things we're trying to protect from here.

Basically, implementation complexity is low, spec complexity is moderate (and given the trouble the spec has had with actually defining origins I'm very wary of adding complexity in that definition). Benefit seems pretty low to me so far, but I'm willing to be convinced that it's larger than it seems. It would just be unfortunate to add the complexity without gaining further security. And note that anything that allows a page to use a string to create another page that is same origin with it but not same origin-domain (i.e. acts like document.domain is not set) would effectively make the security benefit completely nonexistent. So we have the additional spec complexity of ensuring that nothing allows that, now or in the future. :(

domenic commented 7 years ago

Although I think addressing @bzbarsky's questions about expounding on the benefit/cost ratio here are more important, I want to push on the stack my initial reaction, which is that it seems better to tie this to other features (sandboxing? feature policy?) instead of using document.domain.

But let's leave that discussion until after we've agreed, or not, that this is a problem worth solving.

mikewest commented 7 years ago

Thanks for taking a look, @mozfreddyb, @bzbarsky, and @domenic.

That said, this is very much limited to takeovers of already-existing pages.

I agree that this is a limitation. However, the ecosystem has developed a few (anti-?)patterns that fit this model pretty well. Consider A and B again. Let's say A is cnn.com, and B is one of the many static, highly-cacheable "SafeFrame" variants (like http://tpc.googlesyndication.com/safeframe/1-0-9/html/container.html) that might be loaded as a frame a few times on a page to render distinct ads by dumping its window.name into itself.

(Put aside for the moment the question you're surely asking yourself about whether this is a sane mechanism. It's a widely used IAB standard, for good or for ill)

So. A embeds B1, B2, and B3. Each of these frames has access to each of the others. This is bad: B3 can takeover B1 and B2, generating clicks, reading ad content, rendering new content, navigating to new content, changing landing pages, etc.

"Sandbox!", you say, rightly. That can work in many scenarios, thanks to allow-popups-to-escape-sandbox! Unfortunately, some of these ads use plugins. Boo. And other ads want storage. Boo. And other ads want to make CORS requests. Boo.

"Subdomains!", you say, also rightly. This would have some negative performance impact, as calculating the correct suborigin for a given ad takes some time, as does the fact that clients would move from a cache hit to an additional network request for each "new" container.html.

We could do it via feature policy, I suppose (@cramforce suggested a particularly verbose option). That seems like more work for the ~same result, but it might be worth exploring.

We're also looking at addressing these concerns through Suborigins, which is also more work for a result that might or might not actually be what folks want here.

That's a long way of saying that we could design something more targeted, but document.domain already exists, seems to fit the use case, and, as @bzbarsky notes, the implementation side of this change is pretty simple.

All the navigation checks are for "same origin", not "same origin-domain", so not affected by this.

Hrm. You're right about the spec, which is unfortunate, because both Chrome and Safari block navigations if domain mismatches. See the test at https://github.com/w3c/web-platform-tests/pull/6243. That said, the behavior is all over the map. Chrome throws a SecurityError for the assignment to location.href, Safari doesn't throw, but logs a console error and blocks the navigation. Firefox navigates. I don't have an Edge installation at home, but perhaps someone can poke at it.

Perhaps we can agree on adopting that behavior? :)

So we have the additional spec complexity of ensuring that nothing allows that, now or in the future.

I think we should have that complexity today; document.domain already changes the set of other documents that a given document can talk to. Setting that property denies access to any document that hasn't set the property, for instance.

bzbarsky commented 7 years ago

For the Edge behavior, they had no checks at all on navigation (which is actually what the spec says, btw!) as of when https://github.com/w3c/html/issues/780 was filed. They were going to add some, but it wasn't clear to me exactly what. The lack of spec action on that part is pretty discouraging, by the way: we should really have it as a higher priority to fix security bugs in specs.

I can test a link you give me in Edge if you want, but w3c/web-platform-tests#6243 doesn't seem to have any links to where a live test lives. :(

document.domain already changes the set of other documents that a given document can talk to

For a very restricted meaning of "talk to". And in general it does not prevent you from creating things you control which can talk to those documents (but can't talk to you). Or at least I cannot prove that it prevents that. That was never part of the threat model, because it was always assumed that you were hostile you would simply do whatever you wanted to those other documents before setting document.domain. The new complication is that now the setter of document.domain and the hostile code are actually the same document.

But for your new "take over" use case you want to be able to prove exactly that.

mikewest commented 7 years ago

For the Edge behavior, they had no checks at all on navigation (which is actually what the spec says, btw!) as of when w3c/html#780 was filed.

Ah. I agree that this seems like something we should change. I'll comment on that bug.

document.domain already changes the set of other documents that a given document can talk to

For a very restricted meaning of "talk to".

Restricted in a way that seems to fit the contours of a class of problem, as I hope the use case above demonstrates. :)

And in general it does not prevent you from creating things you control which can talk to those documents (but can't talk to you). Or at least I cannot prove that it prevents that.

I'd be interested in walking through the things you're concerned about. I haven't validated any of this against the spec, but my expectation is that anywhere we inherit the origin of a document, we ought to inherit the entire origin, including domain. I can believe that that's not currently the case (as you noted above with the questions about blob:, but it seems like that ought to be the expectation we aim towards, regardless of whether we implement the feature we're discussing here.

But for your new "take over" use case you want to be able to prove exactly that.

I'd also note that the particular use case I'm thinking about would involve setting document.domain = null (however we end up spelling it) on both the victim and attacker frames as part of the loader script, which seems relatively robust against the potential permission leakage you're pointing to. Even if the malicious frame can generate a context that doesn't have domain set, it would be prevented from accessing the victim because the victim set domain.

bzbarsky commented 7 years ago

Restricted in a way that seems to fit the contours of a class of problem

I'm still not 100% convinced that the restrictions here are sufficient to actually close up all the gaps. And no, I don't have a specific attack scenario in mind so far; I just feel that the API surface involved is huge and I have seen no credible attempt to prove that the restrictions are enough.

In particular, an analysis of all the places in platform that check "same origin" vs "same origin-domain" and why they are not problems here would seem like an important thing to do before we decide that we're offering safety guarantees.

my expectation is that anywhere we inherit the origin of a document, we ought to inherit the entire origin, including domain.

That is my expectation too, but definitely not the case in the spec.

And again, anaysis of all the cases that rely solely on "same origin" is needed.

on both the victim and attacker frames

OK. So the safety guarantees from this feature are pretty carefully tailored to this specific use case...

One other thing (and this just occurred to me, sorry; shows you how easy it is to get tied up in implementation details and lose sight of the big picture). The spec and implementations have actively been removing cases that check for "same origin-domain" in favor of using "same origin" as much as possible, with an explicit goal of deprecating and eventually removing document.domain. Both have been willing to remove existing access barriers, or introduce new access restrictions, to achieve this goal. We have been adding APIs to allow more cross-site communication to remove the need to use document.domain for that purpose, so as to give sites a migration path off document.domain.

If we now start using document.domain and its underlying "same origin-domain" concept as a way to add new security primitives that pages are encouraged/expected to use, that seems like a significant step back in that process.

mikewest commented 7 years ago

I appreciate the feedback, and the time you've spend putting together responses. Thanks!

Before diving back into the line-by-line, I'm waiting for feedback from the folks at Google that are interested in addressing the problem I spelled out above. If it turns out that this mechanism doesn't actually solve their problem, then we can save ourselves some back-and-forth. If it seems effective, then I hope you'll indulge a bit more discussion on the details. :)

bzbarsky commented 7 years ago

So I've been thinking about the original use case described in https://github.com/whatwg/html/issues/2757#issuecomment-308436651 and I'm not sure why B can't address this entirely server-side by using different subdomains for differently trusted things. Cookies should work OK in that situation, iirc; is the issue separated storage?

annevk commented 2 years ago

Closing this as since this was raised we've added various ways to mitigate the impact of document.domain through HTTP headers. Feel free to comment/reopen if that assessment is wrong, though I suspect at this point a fresh issue with a revised proposal would be best.