Need to define how 'src' works with sandboxed frames

clelland commented 6 years ago

Branched from #227 -- 'src' is a reference to the declared origin of the frame, but this is poorly (or not) defined for sandboxed frames.

What I would like to say for that case is that 'src' in the container policy only matches the initial opaque origin of the document that is loaded into the frame.

If the frame navigates itself by setting document.location, then it has a new opaque origin, which will no longer match. (Ideally, if it navigates back, then the original opaque origin is restored, but I don't know if that's possible.)
If the parent document navigates the frame by setting it's src attribute, then the container policy will match the new opaque origin given to the newly-loaded document.

The goal here is that 'src' matches the parent document's intention for the origin of the page it expects to have been initially loaded into a frame. When the frame navigates away from that origin, 'src' should no longer match.

This leaves open the question of how to handle a srcdoc' attribute in a consistent manner.

bzbarsky commented 6 years ago

@bakulf @annevk

bzbarsky commented 6 years ago

@bakulf @annevk

annevk commented 6 years ago

In theory you should be able to preserve identity of an opaque origin, but I think thus far we've only done this when multiple objects have a reference to the same one. I.e., it hasn't survived the lifetime of the objects holding a reference to it. If I'm correct, I'd double check with all implementers to see about changing that invariant.

srcdoc presumably just takes the origin of the iframe element's node document when not sandboxed?

clelland commented 6 years ago

srcdoc presumably just takes the origin of the iframe element's node document when not sandboxed?

I think that's correct -- it seems to follow naturally from https://html.spec.whatwg.org/#origin:an-iframe-srcdoc-document

clelland commented 6 years ago

I've committed #231, which should address this now. The declared origin, used by 'src', should match the parent's expectation for the initially loaded document, even with sandboxing and srcdoc.

bakulf commented 6 years ago

The current declared origin algorithm seems to behave wrongly for sandboxed iframes. Here an example:

<iframe src="http://example.org" sandbox="allow-scripts" allow="fullscreen" />

For declared origin 2. we create an unique opaque origin A.
'fullscreen' policy means: 'fullscreen src', which means that we enable fullscreen for origin A.
iframe content is loaded and it has a unique origin B for https://html.spec.whatwg.org/#sandboxed-origin-browsing-context-flag

result: the iframe will not be able to use fullscreen because origin A doesn't match origin B. I don't think this is what we want here.

@clelland, feedback? Thanks

clelland commented 6 years ago

You mean that the unique origin returned in 2 doesn't match the actual unique origin for the document as created... You're right, and that's a problem. It doesn't matter for the JS API, I think, but by using this in Process feature policy attributes it doesn't work there.

The intention in that case is that it matches the actual opaque origin initially generated for the framed document (and not for the initial about:blank document, if we make one of those first in the sandbox case) -- I'm not certain how to best write that here, though.

bzbarsky commented 6 years ago

I suspect the right answer might be that 'src' should not necessarily be defined in terms of origins...

annevk commented 5 years ago

@clelland could you please point to the code that makes this work in Chrome? That'd help figuring out a model here, assuming it's somewhat correct. Perhaps it's as simple as @bzbarsky says and rather than comparing the origin of the global, the global's URL's origin is compared to the origin of the src attribute?

bzbarsky commented 5 years ago

That wasn't quite what I was proposing. I think the semantics we want is "src means same-origin with the thing that would have been loaded via processing the iframe attributes".

That means that if the document has a nonce origin it would match src only if it was loaded via processing the iframe attributes. If it has a URL origin then we can compare that URL to a URL that we derive from the "src" and "srcdoc" attributes as follows: If "srcdoc" is set, get URL from origin of parent document (if it has a URL origin), otherwise get URL from "src" attribute.

clelland commented 5 years ago

I'm seeing if I can define a "first loaded origin" concept in HTML, from the steps in Process a navigate fetch that can be used for this. Essentially storing, on the browsing context, the origin of the document which was initially loaded into it. Then we could use that as the origin to compare to, iff the document has an opaque origin and we want to see if it matches src.

bzbarsky commented 5 years ago

I don't think the concept of "first" captures what we want. If I have:

 <iframe sandbox allow="fullscreen" src="http://foo.com">

and then I do setAttribute("src", "http://bar.com") then I think the bar.com page should also be allowed to do fullscreen.

clelland commented 5 years ago

Absolutely agreed -- processing the iframe attributes should update the origin that we compare to. Navigations initiated from within the nested document should not.

mikewest commented 5 years ago

@andypaicu's work on CSP's 'self' and sandboxes might be relevant here: https://github.com/w3c/webappsec-csp/pull/362.

mikewest commented 5 years ago

On second look, this is discussing 'src', not 'self'. So. Ignore me. :)

clelland commented 5 years ago

So, what I'rd really like to do here is something like this:

In HTML:

Every browsing context should have an 'initial opaque origin', which is an origin or null.
"Process the iframe attributes" should reset this to null when called.
After "initialize the sandboxing for document", if the document's origin is opaque, and the browsing context's initial opaque origin is null, then set the initial opaque origin to the document's origin.

In Feature Policy:

"Process feature policy attributes" should use the browsing context's initial opaque origin as target origin when calling "Parse policy directive". That will make 'src' expand to that opaque origin, if encountered in a declaration.

@bzbarsky, do you think something like that would work?

clelland commented 5 years ago

I just uploaded https://github.com/whatwg/html/pull/4766 as a concrete suggestion for the HTML side.

annevk commented 5 years ago

I've been thinking about this some more and I think I'd actually vastly prefer we drastically simplify the allow attribute processing model (thanks to @triblondon). In particular, I don't think that A delegating a feature to B, but then not trusting B with navigation is a reasonable threat model. If B wants to leak data it'll have a lot of ways of doing so. B inadvertently being navigated to a random party is highly unlikely and more indicative of a bug than things working as intended.

The prior model of the allowfullscreen attribute & friends seems much more reasonable in comparison and I suggest we align with that. To keep compatibility we could split on ; and then split on whitespace and only look at the first token of that second split for comparison purposes. I think that's still a reasonable change to make at this stage given that deployment is not so widespread and it would reduce a lot of the complexity.

(I can put this comment in a new issue if that would be preferable.)

(The current model could be thought of as a far less capable variant of https://www.w3.org/TR/2015/WD-COWL-20151015/, which is what you'd really need if you want to do this kind of distrust-but-trust-with-some-capability properly I think.)

bzbarsky commented 5 years ago

B inadvertently being navigated to a random party is highly unlikely

Seems like this is the normal thing that would happen if B has links on their page, no?

clelland commented 5 years ago

allowfullscreen is currently shorthand for allow="fullscreen *" -- are you suggesting that we make that the only possible model with the allow attribute? I'm not convinced that's a great idea; though we might be able to make it the default.

I agree that there are many ways for a malicious B to exfiltrate information if it has access to a feature. That's why the delegation model explicitly allows B to further use the allow attribute to delegate the feature to any party that it trusts. If it wanted, it could even simulate navigation by creating an iframe that filled its content area and loading the target there.

If * is the only option, though, then you can't have outgoing links on any frame without also granting the feature to those pages. B not only has to not do anything malicious itself; it also has to ensure that it doesn't link anywhere off of it's own origin, even transitively. It can't host an 'about' link that leads to a 'contact' link that leads offsite, for instance. The bar for being able to trust B is much higher.

(Yeah, that should probably be a new issue in any case)

annevk commented 5 years ago

I'd love to see an example, as I would expect any such links to open in popups in actual products that deal with this kind of thing.

annevk commented 5 years ago

I filed #337 on this. It'd be great if we could iterate on that a bit as it's the main blocker for us at the moment when it comes to shipping the allow attribute.

annevk commented 5 years ago

An alternative that might work for us if we cannot come to agreement on #337 is that <iframe src=https://maps.example/ sandbox=allow-scripts allow="geolocation 'src'"> would simply not work since you allowed the https://maps.example origin, but the origin in the frame is opaque. In particular it's not clear to me that storing and keeping track of the opaque origin only for this use case is really justified.

w3c / webappsec-permissions-policy

Need to define how 'src' works with sandboxed frames #230