w3c / ServiceWorker

Service Workers
https://w3c.github.io/ServiceWorker/

"no-cors" CSS SOP violation #719

Open annevk opened 9 years ago

annevk commented 9 years ago

Per our current set of definitions a service worker reveals what resources a "no-cors" CSS stylesheet attached to a document loads. In particular this can leak confidential tokens in the URLs.
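A minimal sketch of the exposure described above, assuming a service worker that simply records every request it is asked to handle. The `RequestLike` and `FetchEventLike` interfaces, the `observedUrls` array and the `onFetch` name are hypothetical stand-ins so the sketch does not depend on service worker typings.

```typescript
// Minimal stand-ins for the parts of FetchEvent/Request used here.
interface RequestLike {
  url: string;
  destination: string; // e.g. "style", "image", "font"
}

interface FetchEventLike {
  request: RequestLike;
}

// Hypothetical log of every subresource URL the service worker sees.
const observedUrls: string[] = [];

function onFetch(event: FetchEventLike): void {
  // The full URL, including any token in the path or query string,
  // is visible to the handler.
  observedUrls.push(event.request.url);
}

// In a real service worker this would be registered with:
//   self.addEventListener("fetch", onFetch);
```

If a no-cors cross-origin stylesheet triggers a fetch for a URL carrying a token, that URL ends up in `observedUrls` like any other.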

Entered the public record here: http://krijnhoetmer.nl/irc-logs/whatwg/20150703#l-286

According to @jakearchibald resource timing (paging @igrigorik) did this first, in both Chrome and Firefox.

I think we should revert both, seems like bad precedent to cut more holes in SOP.

jakearchibald commented 9 years ago

Most requests by CSS are exposed through computed styles, and it's pretty trivial to iterate over all elements to find those. Things like font urls & @import cannot be detected through computed styles, but are exposed by SW & resource timing.
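A rough sketch of the computed-style route, assuming the page reads a property such as `background-image` from each element and pulls out the `url()` targets. The helper name `extractUrls` is hypothetical, and the parsing is deliberately simple (it does not handle escaped quotes).

```typescript
// Pull every url(...) target out of a computed-style value such as
// 'url("https://cdn.example/a.png"), none'.
function extractUrls(computedValue: string): string[] {
  const pattern = /url\(\s*(?:"([^"]*)"|'([^']*)'|([^)\s]*))\s*\)/g;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(computedValue)) !== null) {
    // Exactly one of the three capture groups is set per match.
    const quoted = match[1] !== undefined ? match[1] : match[2];
    urls.push(quoted !== undefined ? quoted : match[3]);
  }
  return urls;
}

// In a page, this could be applied to every element, for example:
//   for (const el of Array.from(document.querySelectorAll("*"))) {
//     extractUrls(getComputedStyle(el).backgroundImage);
//   }
```

This only finds URLs that end up in a computed value on some element, which is why font sources and @import URLs are not reachable this way.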

I see CSS more like script, although it's "opaque", it gives up some visibility when it makes requests within the context of the page. Although, as @annevk points out, script gives up visibility by using globals that can be modified, whereas CSS doesn't.

annevk commented 9 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=1180145 tracks this for Gecko.

jakearchibald commented 9 years ago

@davidben @mikewest what's your take here, are we (and resource priorities) breaking the web by exposing CSS-initiated fetches?

igrigorik commented 9 years ago

Opened a bug on Resource Timing to track this (https://github.com/w3c/resource-timing/issues/27).

jakearchibald commented 9 years ago

Ping @davidben @mikewest

We have two options here:

  1. Declare CSS-triggered fetches as "page visible", as they are with script-triggered fetches. However, this leaks new data when it comes to URLs that don't appear in computed styles, e.g. font sources and @import URLs.
  2. As Anne suggests, remove these from the resource timing API & bypass serviceworker for these requests.

Option 2 will trip up developers who host their CSS on a CDN, as it's less likely to have access-control-allow-origin, but it's not worth breaking the web over.

igrigorik commented 9 years ago

@jakearchibald silly question: if you hide CSS fetches from SW, how am I supposed to offline my app with background images and fonts? Wouldn't that effectively mean that "if you want to be SW-compatible, you should avoid CSS fetches"... because that would break the web. I don't see why we're making this distinction between JS and CSS (JS fetches are "computed fetches" also).

annevk commented 9 years ago

No-cors cross-origin CSS subresources would be hidden. Any other kind of CSS subresources would not be hidden. So there's still plenty of ways to do business. (JavaScript fetches are already observable, as explained upthread.)

jakearchibald commented 9 years ago

@igrigorik

if you hide CSS fetches from SW, how am I supposed to offline my app with background images and fonts?

CSS fetches would only be hidden from SW if the CSS is no-cors cross-origin. You'd be able to get them back with <link rel="stylesheet" href="…" crossorigin>, provided CORS headers.

igrigorik commented 9 years ago

CSS has no author-level equivalent for "crossorigin" for fetches declared via url() - i.e. I can't specify a CORS policy on resources specified within the CSS file. As such, all url() initiated fetches (read, almost all CSS fetches modulo some odd exceptions like web fonts), would become invisible? Unless I'm missing something here, that would break the web.

Why can SW see and intercept no-cors crossorigin <img> initiated fetches, but same use case then fails under CSS? This seems inconsistent and broken.

p.s. the whole crossorigin business is a total mess - see https://github.com/w3c/resource-hints/issues/32.

annevk commented 9 years ago

No, it depends on how the CSS resource was fetched. Not on how the CSS resource's subresources are fetched.

igrigorik commented 9 years ago

@annevk ah, ok.. So, coming back to @jakearchibald's earlier point, what exactly would be "hidden" here if most URLs are already observable through computed styles? Jake called out fonts earlier, but afaik, those are visible via http://dev.w3.org/csswg/css-font-loading/.

annevk commented 9 years ago

Those observable through computed styles are only exposed if you know where to look, which can be expensive. Very different characteristics from getting a list. And e.g. @import is not affected by that. And it might very well be that CSS font loading has a bug too if it's claiming to be available for no-cors CSS.

annevk commented 9 years ago

@tabatkins ^^

tabatkins commented 9 years ago

FontFace objects don't expose the url (or ArrayBuffer) they were loaded from. The FontFace itself is there as long as CSS has rights to it (font loading is CORS-restricted per spec), but you can't read anything from it.

annevk commented 9 years ago

@tabatkins no-cors CSS is applied to a page, but you can't access it through CSSOM (you can access computed styles). So the question is what the font API does for such a CSS resource (I suspect it shouldn't work, but that may not be defined).

davidben commented 9 years ago

Oh yuck. Yeah, I think I agree with Anne that we should remove these requests from SW and Resource Timing unless you add the crossorigin attribute. These kinds of "the contents are secret, but if they happen to parse as foo, you can execute it" security policies are super-hairy. We shouldn't add new ones.

In fact, cross-origin CSS has already bitten us in the past because the CSS parser is extremely error-tolerant. See https://www.linshunghuang.com/papers/css.pdf

davidben commented 9 years ago

Oh, and I believe this is the corresponding WebKit bug, since I don't see it linked in the paper: https://bugs.webkit.org/show_bug.cgi?id=29820

ETA: To clarify, this is the WebKit bug for the paper in the above comment that describes a similar issue, not the WebKit bug for the issue being discussed here.

tabatkins commented 9 years ago

Yeah, makes sense. I'll send a notification to www-style and make the change in Font Loading.

tabatkins commented 9 years ago

Actually, I ran into some questions about how to implement. I posted https://lists.w3.org/Archives/Public/www-style/2015Jul/0150.html with two possibilities for handling this.

The proposal I prefer still exposes the existence of FontFace objects from the tainted stylesheet, and their loading status and promise. Everything else is hidden, tho.

(If you know the name of the font, its loading status is already exposed to the page via layout/timing channels. This approach would further expose the number of fonts present in tainted sheets and the loading status/promises of the mystery fonts, which wasn't previously available, but any further restrictions seem kinda insane to spec/implement.)

(ETA: I could maybe further hide the load status/promise and only expose it when you retrieve the FontFace via one of the "by name" methods, but that adds a lot more complexity.) Or maybe just expose one opaque slab per stylesheet? Then the query methods can return a slab representing just the fonts actually used, so you can load them and view their load status in synchrony. This is kinda similar to how I'd like to expose local fonts at some point.

nattokirai commented 9 years ago

Creating new flavors of FontFace seems like overdesign to me. If style rules can't be exposed in the OM, I don't think FontFace objects should be exposed in the FontFaceSet. The ready() promise should include fonts from the inaccessible stylesheet.

This is kinda similar to how I'd like to expose local fonts at some point.

Allowing local fonts to be enumerated exposes users to fingerprinting attacks.

tabatkins commented 9 years ago

Creating new flavors of FontFace seems like overdesign to me. If style rules can't be exposed in the OM, I don't think FontFace objects should be exposed in the FontFaceSet. The ready() promise should include fonts from the inaccessible stylesheet.

I explained in the email why that's complicated and probably unworkable. Please comment on www-style for further discussion of FontFace/etc.

Allowing local fonts to be enumerated exposes users to fingerprinting attacks.

I didn't mention enumerating local fonts; my suggestion was for the exact opposite, actually.

And you can already enumerate local fonts fairly trivially through layout channels.

horo-t commented 9 years ago

https://crbug.com/532374 tracks this for Chromium.

wanderview commented 9 years ago

Some corner cases:

Consider:

  1. a.com/index.html loads stylesheet at b.com/foo.css as no-cors
  2. b.com/foo.css @imports stylesheet at a.com/bar.css
  3. a.com/bar.css loads background-image a.com/snafu.jpg

Should SW and performance see data for snafu.jpg? I think everything @imported under a tainted sheet should be hidden.

@igrigorik, note this is not really covered by the current language in the performance spec:

For each resource fetched by the current browsing context, excluding resources fetched by cross-origin stylesheets fetched with no-cors policy, perform the following steps:

This raises further issues like this:

  1. a.com/index.html loads stylesheet at b.com/foo.css as no-cors
  2. b.com/foo.css @imports stylesheet at a.com/bar.css
  3. a.com/bar.css loads background-image a.com/snafu.jpg
  4. a.com/index.html loads stylesheet at a.com/thepain.css
  5. a.com/thepain.css @imports stylesheet at a.com/bar.css
  6. a.com/bar.css loads background-image a.com/snafu.jpg

Here snafu.jpg should be hidden in step 3, but what happens to snafu.jpg in step 6? It seems the stylesheet should not be shared from tainted import with non-tainted import. I'm not sure if the image cache would prevent a network load in step 6, though.
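The corner cases above can be sketched as a taint check that walks the @import chain, assuming the rule "everything @imported under a tainted sheet is hidden". The `SheetLoad` shape and the `isTainted` name are hypothetical; origins are passed as plain strings to keep the sketch self-contained.

```typescript
// One stylesheet load, with a link to the sheet that @imported it (if any).
interface SheetLoad {
  url: string;
  origin: string;           // origin of the stylesheet's URL
  mode: "cors" | "no-cors"; // how this stylesheet was fetched
  importedBy?: SheetLoad;   // undefined when the document loaded it directly
}

// A sheet is tainted if any ancestor in its import chain is tainted, or if
// it was itself fetched no-cors from an origin other than the document's.
function isTainted(sheet: SheetLoad, documentOrigin: string): boolean {
  if (sheet.importedBy !== undefined && isTainted(sheet.importedBy, documentOrigin)) {
    return true;
  }
  return sheet.mode === "no-cors" && sheet.origin !== documentOrigin;
}

// Steps 1-3: bar.css reached through the no-cors cross-origin foo.css.
const foo: SheetLoad = { url: "b.com/foo.css", origin: "b.com", mode: "no-cors" };
const barViaFoo: SheetLoad = { url: "a.com/bar.css", origin: "a.com", mode: "no-cors", importedBy: foo };

// Steps 4-6: the same bar.css URL reached through the same-origin thepain.css.
const thepain: SheetLoad = { url: "a.com/thepain.css", origin: "a.com", mode: "no-cors" };
const barViaThepain: SheetLoad = { url: "a.com/bar.css", origin: "a.com", mode: "no-cors", importedBy: thepain };
```

Under this rule the snafu.jpg fetch is hidden in step 3 and visible in step 6, even though both come from the same bar.css URL, which is why sharing one stylesheet object between the two import chains is a problem.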

sicking commented 9 years ago

I don't think we should prevent requests coming from a no-cors stylesheet from going through a SW.

Yes. I understand that this exposes data that isn't currently exposed. And yes, I understand that there is some security risk involved in that. But I don't think that automatically means that we must not do this.

In short, I think the question is more complex and nuanced than that.

Currently, if I host a file on a webserver and serve it with a "text/css" mimetype, a lot of the data in that file can be read by third parties.

A third party can link to that stylesheet, create an element which matches one of the rules in the stylesheet, and then use getComputedStyle to figure out the properties that were set by the rule, and what those properties were set to.

You can relatively easily use a dictionary to scan probable class names to extract all rules that simply match on a class. You can even scan all possible ascii-classnames shorter than N characters.
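To give a sense of the search space, here is a small sketch that enumerates every candidate class name up to a given length over a given alphabet. The function name `candidateClassNames` and its parameters are hypothetical; a real probe would then create an element for each name and read its computed style.

```typescript
// Enumerate every string of length 1..maxLength over the given alphabet,
// shortest names first.
function candidateClassNames(alphabet: string, maxLength: number): string[] {
  const names: string[] = [];
  let previous: string[] = [""];
  for (let length = 1; length <= maxLength; length++) {
    const current: string[] = [];
    for (const prefix of previous) {
      for (let i = 0; i < alphabet.length; i++) {
        current.push(prefix + alphabet.charAt(i));
      }
    }
    for (const name of current) {
      names.push(name);
    }
    previous = current;
  }
  return names;
}
```

The number of candidates grows as the sum of `alphabet.length ** k` for k from 1 to `maxLength`, so this is only practical for short names; a dictionary of likely class names covers the longer ones.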

That means that, as an author, if I put a text/css resource on a webserver, I already need to count on most of the bytes in that file being readable by third parties.

It's also not as simple as "exposing more data" == "less safe for developers".

Trying to send developers the message that "these here bytes are leaked, but these other bytes, in the @import rule, those are safe, feel free to put sensitive data in those URLs" is significantly more complex than saying "the contents of the stylesheet can be read by third parties. Don't put sensitive data in stylesheets". Such complexity often leads to security bugs.

For example, the fact that background-image URL isn't exposed to a SW might make developers think that the URL is protected, when in fact .getComputedStyle leaks the very same information. I.e. it is very easy for developers to get a false sense of security. This is also made worse by the fact that developers can legitimately claim that some urls, like @import, are in fact safe.

Then there is of course the problem that the implementation complexity will most likely lead to a few security bugs here and there. We had a recent example of this where a developer had seen that CORS says that only certain values can be sent in the Content-Type header without triggering a preflight. So the developer built CSRF-protection by checking for a particular content type. However it turned out that different browsers apply slightly different rules to Content-Type parsing leading to the ability for an attacker to send a cross-site request in some browsers with a Content-Type that this website didn't expect.

The Content-Type rule in CORS is my fault. I think it was a bad decision and it has led to security problems.

What we're talking about here is orders of magnitude more complex, and so I feel quite confident that it will lead to orders of magnitude more problems. Both security problems and other problems.

And yes, I am aware that we block access to the CSSOM for cross-origin stylesheets that weren't fetched with CORS. But that was added because at the time we didn't have the restriction that the stylesheet had to be served as "text/css". The attacks that we saw were due to websites pointing at cross-origin HTML files and then taking advantage of the lax CSS parsing rules to extract information from the HTML.

So, all in all, I think the cost here is quite high to browser developers, as can be seen by all the edge cases that are being debated. I agree that there is some security value to developers, but I also think there is a very real security risk to developers (in addition to the cost of not being able to intercept CDN hosted stylesheet resources).

And again, I realize that more information is being exposed here. If your counter argument is simply that that means that we have to block these intercepts, then you might want to reread this comment.

I think we would be better served giving developers a real security tool here. I know that @annevk had a proposal some time back for a header which blocked cross-site reads of resources like images/stylesheets/scripts. Implementing something like that seems like a better way to protect the data debated here.

sicking commented 9 years ago

FWIW, what I think we should do is:

The second bullet has been discussed before, but now has the nice property that it would actually protect the information debated here on all UAs. Old UAs that don't support the property also don't support SW.

wanderview commented 9 years ago

Some IRC discussion:

http://logs.glob.uno/?c=freenode%23whatwg#c974181

jakearchibald commented 8 years ago

F2F: we need browser security people to weigh in on this

yoavweiss commented 8 years ago

Did sec folks comment on this?

wanderview commented 8 years ago

AFAIK, no they have not. It's been 6 months and this API has been released for 1+ years in one form or another. At this point I feel like we should probably just close this as WONTFIX. Of course, if security-knowledgeable people feel strongly otherwise, that feedback is quite welcome.

annevk commented 8 years ago

I still think poking holes in SOP is a bad idea.

sicking commented 8 years ago

A slightly larger nice round hole is better than an irregular jagged hole which in practice can't be depended on to be any smaller than the round hole.

jakearchibald commented 7 years ago

@mikewest we need help making a call here. This has been in the wild for a long time and doesn't seem to have caused a problem, but we're worried.

We should also add some use counters for this.

@domenic & @jakearchibald need to look through https://www.linshunghuang.com/papers/css.pdf

jakearchibald commented 7 years ago

We should have a VC about this, with security people.

hober commented 7 years ago

cc @johnwilander

johnwilander commented 7 years ago

Small note: WebKit does not CORS-restrict font loading (https://bugs.webkit.org/show_bug.cgi?id=86817).

tdresser commented 7 years ago

What's the next step here?

annevk commented 7 years ago

I suspect it's enshrining a security hole because nobody acted.

jakearchibald commented 7 years ago

Action: @slightlyoff is cornering security folks RIGHT NOW.

jakearchibald commented 7 years ago

F2F: We need to watch out for this if workers become subresources, and allow cross origin with no-cors.

slightlyoff commented 7 years ago

/cc @mikewest @estark37

shhnjk commented 6 years ago

  1. Add "css" (and "worker" if required) in request initiator.
  2. Add opaque-subresource request (near subresource request).

    An opaque-subresource request is a request whose initiator was fetched with request mode "no-cors" and response was cross-origin (should be replaced with better sentence).

  3. Add the following step between step 10 and step 11 of Handle Fetch:

    If request is an opaque-subresource request, then: Return null.

This should do. Let's fix this.
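The three steps above can be sketched as follows. This is an illustration under assumptions, not spec text: the `Initiator` shape and both function names are hypothetical, "cross-origin" is judged against the client's origin, and returning null from Handle Fetch is modelled here as skipping the fetch event so the request is handled without the service worker.

```typescript
// What is assumed to be known about the resource that initiated a request.
interface Initiator {
  url: string;
  mode: "cors" | "no-cors"; // request mode used to fetch the initiator
  origin: string;           // origin of the initiator's response
}

// An opaque-subresource request: its initiator was fetched with mode
// "no-cors" and the initiator's response was cross-origin to the client.
function isOpaqueSubresourceRequest(initiator: Initiator | null, clientOrigin: string): boolean {
  if (initiator === null) {
    return false; // requested directly by the document
  }
  return initiator.mode === "no-cors" && initiator.origin !== clientOrigin;
}

// Models the extra Handle Fetch step: opaque-subresource requests never
// reach the service worker's fetch event.
function handleFetchDecision(initiator: Initiator | null, clientOrigin: string): "return-null" | "dispatch-fetch-event" {
  return isOpaqueSubresourceRequest(initiator, clientOrigin) ? "return-null" : "dispatch-fetch-event";
}
```

For a document on foo.com, a font requested by a no-cors stylesheet from bar.com would yield "return-null", while the same font requested by a same-origin stylesheet would still go through the fetch event.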

wanderview commented 6 years ago

response was cross-origin

@shhnjk, what is this cross-origin compared to? The client initiating the request? Or the document/stylesheet initiating the request?

I'm just curious if we care about the case where we have:

  1. Document with origin foo.com loads stylesheet "foo.com/A.css"
  2. Stylesheet "foo.com/A.css" does @import("bar.com/B.css")
  3. Stylesheet "bar.com/B.css" does @import("foo.com/C.css")

Is the "foo.com/C.css" load considered same-origin or cross-origin for the purposes of this check? If it's considered same-origin then it seems some information about the cross-origin "bar.com/B.css" has been leaked.

The situation is similar to how we handle CORS and redirects. Once you redirect through a cross-origin that request is tainted as cross-origin even if it redirects back to same-origin.

shhnjk commented 6 years ago

what is this cross-origin compared to? The client initiating the request? Or the document/stylesheet initiating the request?

Client. We should not check same-origin/cross-origin based on stylesheet.

I'm just curious if we care about the case where we have:

  1. Document with origin foo.com loads stylesheet "foo.com/A.css"
  2. Stylesheet "foo.com/A.css" does @import("bar.com/B.css")
  3. Stylesheet "bar.com/B.css" does @import("foo.com/C.css")

Is the "foo.com/C.css" load considered same-origin or cross-origin for the purposes of this check? If it's considered same-origin then it seems some information about the cross-origin "bar.com/B.css" has been leaked.

Step 3:

  Initiator: "bar.com/B.css"
  Initiator was fetched with "no-cors": true
  Initiator was cross-origin: true
  return null

So this will not leak the info. From the SW's point of view, though, you might want to serve it from cache. My solution is based on security, not performance.

BTW, adding "css" (and "worker" if required) to the request initiator might not be required. We just need to check that the initiator's destination was "style" (or "worker").

yoavweiss commented 6 years ago

Any news on this? This is blocking https://github.com/w3c/resource-timing/issues/70

rniwa commented 6 years ago

WebKit is willing to change its behavior to avoid exposing resources fetched by non-CORS cross-origin CSS resources. We don't think it's okay to leak information about non-CORS cross-origin resources like this.

jakearchibald commented 6 years ago

My concerns about this:

It's difficult for developers to understand which parts of their CSS are protected when loaded cross-origin:

//cross-origin/user.css

@import 'user/styles/david-smith.css';

//cross-origin/user/styles/david-smith.css

/* This is David's favourite font */
@font-face {
  font-family: 'david-loves-consolas';
  src: url('/fonts/consolas-v123.woff2');
}

body {
  font-family: david-loves-consolas;
}

.user-avatar {
  background: url('/images/me-and-samantha.jpg');
}

.davids-mother-is-called-jane {
  background: green;
}

Assuming we make the change proposed in this issue, what's private?

We can look at the computed style for body, and determine that the user is called david, and they love consolas. Again, using computed styles, we can determine they have a friend or relative called samantha, who is displayed in the image. We could use guesswork to figure out David's mother is called Jane.

The bits of information that this PR protects are David's surname, as it's only in the @import, which isn't usually exposed, and the version of the font (I think? Or is this exposed elsewhere?).

My advice to developers would be "Don't put private data in CSS". I realise we do our best to hide the source, as we do with JS, but we don't hide fetches from JS.

annevk commented 6 years ago

@jakearchibald one of the reasons JavaScript modules requires CORS is precisely to not have to hide import fetches from service workers. We also do not provide CSSOM access to cross-origin style sheets for this reason. Would you try to argue we should? And yes, you should probably not put private information in CSS, but we should also not regress in what we protect.

yoavweiss commented 6 years ago

FWIW, https://jsbin.com/pigihubuxa/edit?html,output shows that at least Chrome and Safari do expose CSSOM (and the background image URL info) for default cross-origin CSS fetches. Firefox does not.

jakearchibald commented 6 years ago

@annevk

We also do not provide CSSOM access to cross-origin style sheets for this reason. Would you try to argue we should?

No, but "we expose some URLs from cross-origin CSS" should probably go one way or the other.

jakearchibald commented 6 years ago

As an aside, isn't script and CSS source highly vulnerable to meltdown/spectre?