Multi-author sites - GitHub issues

mnot commented 9 years ago

As discussed, https:// sites that have multiple authors may be surprised to discover that user a can now overwrite content for user b.

What's needed: 1) text (Security Considerations?) explaining the attack 2) evangelisation to warn sites 3) maybe a csp-ish opt-out (or -in, but @slightlyoff isn't hot on that)

Will do a pull for 1.

slightlyoff commented 9 years ago

There has been quite a lot of further discussion on this point. Looks like we're going to sacrifice this goat to accommodate sites which, frankly, are already broken. Le sigh.

We're likely going to go with a CSP-based OPT IN to enable Service Workers. The straw man is to require a CSP header for the on-origin SW script.

Thoughts?

KenjiBaheux commented 9 years ago

Filed crbug.com/423983 for tracking in Blink

annevk commented 9 years ago

The amount of developer hurdles keeps growing, but we have required opting into Danger Zone for less.

Are we still requiring that the SW is served with a correct JavaScript MIME type?

A CSP header on the SW script itself affects what the SW can do. It would make more sense to me if the opt-in came with the document or worker that wanted to use a SW.

slightlyoff commented 9 years ago

Yes, still requiring valid JS. The CSP header affecting what the SW can do seems good? We want more people setting CSP, no?

Do you have a straw-man for another way of doing this that you prefer?

annevk commented 9 years ago

A new token for Content-Security-Policy: serviceworker. To be set by global environment that is to register service workers rather than the service worker script.

inexorabletash commented 9 years ago

We need to make sure some spec says you can't set this new CSP token via meta tags.

(I'm a CSP n00b, but it looks like this is the first opt-in CSP directive? Are there any other places where the general opt-out design of CSP will require special handling?)

jakearchibald commented 9 years ago

I don't think page-based CSP is a good idea here, the impact of a SW is beyond the page. Also, CSP is so-far opt-out, and we're proposing opt-in.

I think we should reconsider the SW script location when it comes to scope. /~jakearchibald/my/app/sw.js may not be registered for scopes wider than /~jakearchibald/my/app/. There's been confusion about this when I've proposed this in the past, so I want to stress that it's the location of the serviceworker script that limits scope, and not the registering page.

The benefit of this is we don't break hosts that are safe, such as github.
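As a rough illustration of the script-location rule described above, the check could look something like the sketch below. The function name `scopeAllowedByPath` and the example.com host are made up for illustration; this is not taken from any spec text or implementation.

```javascript
// Sketch only: the directory that contains the service worker script is
// treated as the widest scope that script may be registered for.
// `scopeAllowedByPath` is a hypothetical helper, not a platform API.
function scopeAllowedByPath(scriptURL, requestedScope) {
  const script = new URL(scriptURL);
  // Resolve the requested scope against the script URL so relative
  // scopes and ".." segments are normalised before comparing.
  const scope = new URL(requestedScope, script);
  // Directory of the script, e.g. /~jakearchibald/my/app/
  const maxScope = new URL('./', script);
  return scope.origin === maxScope.origin &&
         scope.pathname.startsWith(maxScope.pathname);
}

const sw = 'https://example.com/~jakearchibald/my/app/sw.js';
console.log(scopeAllowedByPath(sw, '/~jakearchibald/my/app/'));     // true
console.log(scopeAllowedByPath(sw, '/~jakearchibald/my/app/sub/')); // true
console.log(scopeAllowedByPath(sw, '/~jakearchibald/'));            // false
console.log(scopeAllowedByPath(sw, '/'));                           // false
```

Note that the registering page does not appear anywhere in the check; only the script URL and the requested scope matter.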

If we must go for an opt-in solution, I'd rather we went for a content type like application/serviceworker+javascript, or application/javascript;serviceworker, or something along those lines. But I much prefer a solution that doesn't break hosts that happily give you your own origin.

dominiccooney commented 9 years ago

Implementer feedback: While we haven't looked at this in great depth yet, our plan at this point is to implement the CSP thing (token or header) on the Service Worker script. Our reasoning is:

If an attacker "takes a site offline" with a Service Worker, headers on pages are ineffective redress. The only requests the site will see will be for Service Worker scripts.
Having the header on the script ameliorates script reflection (eg JSONP) + XSS attacks for sites that do use Service Workers.
Having the metadata on the script and not the page avoids an issue with meta tags, if there is one. We didn't look into that.

The solutions in @jakearchibald's previous comment would be easier for us, frankly.

domenic commented 9 years ago

I really like the content-type idea, mainly because then static content servers could come configured out of the box with a .swjs -> application/javascript+serviceworker mapping, and then once that gets rolled out to GitHub pages we could use service workers there. (Whereas, with the CSP solution, we're unlikely to ever get a generic host like GitHub pages to work with service workers.)
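A minimal sketch of what such an out-of-the-box mapping might look like on a static server, together with the check a browser could apply. The `.swjs` extension and the `application/javascript+serviceworker` type come from the comment above and were never standardised; the helper names are hypothetical.

```javascript
// Sketch only: extension-to-MIME mapping a static host might ship with.
// Both the .swjs extension and the +serviceworker type are proposals
// from this thread, not registered types.
const MIME_BY_EXTENSION = {
  '.js': 'application/javascript',
  '.swjs': 'application/javascript+serviceworker',
};

function contentTypeFor(filePath) {
  const name = filePath.slice(filePath.lastIndexOf('/') + 1);
  const dot = name.lastIndexOf('.');
  const ext = dot > 0 ? name.slice(dot).toLowerCase() : '';
  return MIME_BY_EXTENSION[ext] || 'application/octet-stream';
}

// What the opt-in check might look like on the receiving side:
// ignore parameters such as charset and compare the bare type.
function isServiceWorkerContentType(contentType) {
  const essence = contentType.split(';')[0].trim().toLowerCase();
  return essence === 'application/javascript+serviceworker';
}

console.log(contentTypeFor('/app/worker.swjs'));
console.log(isServiceWorkerContentType(contentTypeFor('/js/main.js')));
```

Under this scheme an ordinary `.js` upload would not qualify as a service worker unless the host chose to map it to the special type.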

jakearchibald commented 9 years ago

@domenic what are your thoughts on the path-based method? It has the benefit of working securely without changing servers. Also, appcache uses the same method for restricting how FALLBACK works.

domenic commented 9 years ago

@jakearchibald sounds pretty good. A tiny bit ugly since I like to keep my JS nice and tidy inside a /js or /assets folder, but I could make an exception for my service worker.

jakearchibald commented 9 years ago

@annevk @slightlyoff can I persuade you to reconsider the service-worker script path approach?

Failing that, a special content-type. This avoids the messiness of a CSP opt in, and confusions around which client's CSP it applies to.

annevk commented 9 years ago

1) A special path seems rather magical. We have a few of those (e.g. /favicon.ico) and they are not liked much.

2) A MIME type works, but once servers start deploying that in the default configuration you're lost again, no (also, if anything it should be something like text/sw+javascript)?

jakearchibald commented 9 years ago

1) A special path seems rather magical. We have a few of those (e.g. /favicon.ico) and they are not liked much.

I think having to add special headers is liked less.

annevk commented 9 years ago

That is fair and this is less constrained than /favicon.ico and /robots.txt. You can still pick the name, as long as it does not contain a slash.

slightlyoff commented 9 years ago

So a few thoughts:

I dislike the notion that it's CSP on the installing page that controls this. That doesn't comport with the design constraint imposed by the folks who have driven this that any random page not be able to install a SW. That means you need to be able to show affirmative control over the server in some way, and a CSP header that's different in some way to what you can set in the <meta> tag variant is...bad.
I want this to be fused to the script
Paths are bad, bad, bad. They don't actually deal with all of John's examples and I'm dubious that they are valuable.

This leads me to suggesting a header on the script file and, if we can't agree on CSP for the script file, thinking we should do something like Service-Worker-Allowed: true.

Reactions?
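For concreteness, the `Service-Worker-Allowed: true` straw man could be checked roughly as follows. This is a sketch under assumptions: the helper names are invented, the headers are modelled as a plain object, and the list of accepted JavaScript MIME types is illustrative rather than the list any browser uses.

```javascript
// Sketch only: opt-in check on the response for the service worker script.
// Header names are case-insensitive, so look them up that way.
function headerValue(headers, name) {
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === wanted) return String(value).trim();
  }
  return null;
}

// Illustrative list; the real set of accepted JS types is an assumption here.
const JS_TYPES = ['application/javascript', 'text/javascript', 'application/x-javascript'];

function mayInstallServiceWorker(responseHeaders) {
  const type = (headerValue(responseHeaders, 'Content-Type') || '')
    .split(';')[0].trim().toLowerCase();
  const optedIn = headerValue(responseHeaders, 'Service-Worker-Allowed') === 'true';
  return JS_TYPES.includes(type) && optedIn;
}

console.log(mayInstallServiceWorker({
  'content-type': 'application/javascript',
  'service-worker-allowed': 'true',
})); // true
console.log(mayInstallServiceWorker({
  'Content-Type': 'application/javascript',
})); // false
```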

annevk commented 9 years ago

Pointer to John's examples? And an explanation of sorts why paths are bad and don't address the examples if that's not self-evident.

jakearchibald commented 9 years ago

CSP is opt out, and CSP blocks requests happening, not responses being used. Since we're talking about something that's opt-in and blocks on response, I can't understand why we think CSP is the answer here.

The path solution means github pages, other github raw viewers, and tilde-based sites just work. If you can put a script there, you control it.

jsbin.com allowed you to put a script in the root, which is highly unusual, and @remy already fixed that.

johnmellor commented 9 years ago

My examples of common sites that miss out on Same-origin policy protection (because they put content from non-mutually-trusted users on a single origin) were:

Most universities (https://www.stanford.edu/~username/, https://www.princeton.edu/~username/, etc)
https://cdn.rawgit.com/user/repo/branch/file
https://jsbin.com/foo
probably some intranets (using a similar model to universities)
probably some CDNs or ad networks

Of these jsbin.com was the only one which didn't use paths to separate content from different users, and that's now fixed. It's likely there are more such sites (maybe some sites upload everything to the root with a hashed filename, or use paths within the query string, like /index.php?path=/~username) but these seem rather rarer compared to sites which use paths to separate users.

inexorabletash commented 9 years ago

http://www.w3.org/2001/tag/issues.html#siteData-36 should be considered.

From Tim Berners-Lee:

The architecture of the web is that the space of identifiers on an http web site is owned by the owner of the domain name. The owner, "publisher", is free to allocate identifiers and define how they are served.

Any variation from this breaks the web.

Path-based restrictions run afoul of this, albeit not as badly as /favicon.ico or /robots.txt, since we wouldn't be saying "you MUST use this identifier", merely "if you want to do XYZ, you MUST structure your identifiers as..."

jakearchibald commented 9 years ago

We're trying to fix cases that already run afoul of that. Cases where part of an origin are pseudo-owned by someone else.

remy commented 9 years ago

Just to chime in, I didn't fix it by separating out paths for users. http://jsbin.com/foo.js is still valid. We simply serve scripts identifying themselves as service workers: https://github.com/jsbin/jsbin/commit/ce53bb2218564d85e1620945a048662f98943ad2

jsbin.com allowed you to put a script in the root, which is highly unusual

The internet is a big place, "highly unusual" is going to be a much bigger number than you anticipated when this thing is shipped.

If you take the path scoping approach, I suspect (aka gut feeling) that in years to come, it'll catch out new devs, mostly because web devs tend to throw something together/copy & blindly paste before reading the fine print specs. And by catch out, I mean some will use path scoping and accidentally protect themselves, others won't and it could be too late.

jakearchibald commented 9 years ago

@johnmellor

maybe some sites upload everything to the root with a hashed filename, or use paths within the query string, like /index.php?path=/~username

I wonder how many of these sites would serve /index.php?path=/~username/sw.js with a js content-type.

@remy

If you take the path scoping approach, I suspect (aka gut feeling) that in years to come, it'll catch out new devs

If a site allows you to host anything at the root, you effectively own the origin anyway, you can compromise security in severe ways through stuff like crossdomain.xml. jsbin is unusual in that it allows you to host js on the root, but not much else. Are there any other sites that allow stuff like this (and have security expectations)?
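To make the query-string case concrete, here is a small sketch that computes the directory of a script URL with standard URL resolution. It assumes, purely for illustration, that a path-based rule would ignore the query string; `scriptDirectory` is a hypothetical helper and the sw.js file names are invented.

```javascript
// Sketch only: directory of the script URL, ignoring query and fragment.
function scriptDirectory(scriptURL) {
  return new URL('./', scriptURL).pathname;
}

console.log(scriptDirectory('https://www.stanford.edu/~username/sw.js'));
// "/~username/"
console.log(scriptDirectory('https://cdn.rawgit.com/user/repo/branch/sw.js'));
// "/user/repo/branch/"
console.log(scriptDirectory('https://example.com/index.php?path=/~username/sw.js'));
// "/"
console.log(scriptDirectory('https://jsbin.com/foo.js'));
// "/"
```

Under that assumption, the first two hosts get per-user separation for free, while the query-string and root-upload cases both resolve to the origin root, so they rely on the script not being served with a JS content type.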

jakearchibald commented 9 years ago

@Hixie You went with path-based security for appcache manifests. We're considering doing the same for service workers. Are you happy with it? Any issues with doing the same for SW js?

slightlyoff commented 9 years ago

Paths are out. We're going with a header. Which one?: CSP on the SW script or Service-Worker-Allowed: true?

dominiccooney commented 9 years ago

Implementer feedback: Blink is speculatively implementing Service-Worker-Allowed: true and will watch this bug for further developments.

jakearchibald commented 9 years ago

@slightlyoff I don't understand why paths are out? They either don't affect already deployed examples, or are easy to fix. They don't block SW on every friendly host. They follow a change already in the html spec. It'll hurt adoption way less than creating a new header.

You seem to be the only one against paths, what's your argument?

annevk commented 9 years ago

@inexorabletash also made an argument that was somewhat compelling, to me.

Did we stop considering MIME types altogether? I'd vote for text/sw+javascript. That way this has more of a chance of being supported on GitHub.

jakearchibald commented 9 years ago

Content type would be my second choice (but suffers from the default setup issue you mentioned).

I understand that paths feel icky in light of http://www.w3.org/2001/tag/issues.html#siteData-36, but this whole thread is about sites that think path-based ownership is a thing the platform should cater to.

annevk commented 9 years ago

There is a difference between allowing path-based ownership and forcing that model on everyone.

PaulKinlan commented 9 years ago

Just as an aside, we have in the past had massive issues getting small providers and large companies to be able to set their headers. Small sites have trouble because they don't control the infrastructure; large sites because they have huge dev-ops teams (or CDNs) and so much overhead in getting changes deployed that it effectively means they won't deploy SW.

I have seen shared hosts (like universities) let users control their own .htaccess file, which pretty much means that I as a developer can set Service-Worker-Allowed: true for the domain.

What was the reason why in the register call, the UA doesn't reject scopes that are not for the current path or further down? The location of the SW shouldn't matter in the long run right...

jakearchibald commented 9 years ago

I understand that http://www.w3.org/2001/tag/issues.html#siteData-36 has come from on high, but how does this particular use of paths break the web?

I get how /robots.txt and /favicon.ico are bad, because a developer without control over those responses may want to control the behaviours and features they offer. In the ServiceWorker case, we're only breaking cases we specifically want to break, a developer with control over /~username/ getting control over /.

jakearchibald commented 9 years ago

I have seen shared hosts (like universities) let users control their own .htaccess file, which pretty much means that I as a developer can set Service-Worker-Allowed: true for the domain.

This is a strong point. Not just .htaccess, but PHP etc would let me serve a ServiceWorker with the appropriate headers.

annevk commented 9 years ago

It does not break the web and neither does /robots.txt, but you do put constraints on the URL space, which might affect the physical space for which servers might have policies in place (such as scripts needing to be in a vetted directory).

johnmellor commented 9 years ago

What was the reason why in the register call, the UA doesn't reject scopes that are not for the current path or further down? The location of the SW shouldn't matter in the long run right...

Two reasons for any path restrictions being based on the URL of the SW not the page:

The URL of the page can be trivially modified to be any same-origin URL. See https://github.com/slightlyoff/ServiceWorker/issues/253#issuecomment-43001664 and https://github.com/slightlyoff/ServiceWorker/issues/253#issuecomment-43016195. The only reliable approach would be for path restrictions to be based on the URL of the SW.
It's often valid and useful for pages on a subpath to register Service Workers on parent paths or even the root. If path restrictions are only based on the URL of the SW script, then /foo/bar/ can register /root.sw.js, and this use case seamlessly works.

jakearchibald commented 9 years ago

Two reasons for any path restrictions being based on the URL of the SW not the page:

And another: I could just add an iframe for "/" and inject whatever script I wanted to run from that document.

jakearchibald commented 9 years ago

Btw, I wrote http://jakearchibald.com/2014/launching-sw-without-breaking-the-web/ to collect developer feedback. Not a lot of feedback yet, but it's all in support of path-based security.

PaulKinlan commented 9 years ago

@johnmellor thanks for the info. At first I found using the location of the SW file an odd choice, but thinking about it, it feels pretty good. It would be yet another well known file. I was just thinking about how other services prove ownership of a piece of content (you might have done this already):

I looked at Webmaster tools and a number of others, and they all fall into the following:

File in the path. You prove that you can write to that directory (we do that for developers.google.com/web/fundamentals - we own anything under this folder, above that it is the admins of the site)
Meta entry on the page in the path that you own - also very common. It's kind of like a shared secret and it is in your interest as a developer not to copy someone else's secret as they could then "own" your registration.
DNS txt entry. Proves that you have ownership over the entire domain, can't prove ownership over directories
Google Analytics: Looks for GA script tag only in the head of the page and hook it up with the backend and you can prove ownership of directories.

Looking at Bing, Yahoo, Google, Pinterest etc., they all treat path-based proof of owning content as central to proving that you own a portion of the site. Using the SW location seems to fit this same model pretty well, and a lot of developers are already used to it.

jakearchibald commented 9 years ago

Of the 15 votes collected at http://jakearchibald.com/2014/launching-sw-without-breaking-the-web/:

10 votes for path-based
2 votes for opt-in header
1 vote for both path-based and opt-in header (requiring both)
2 votes for simple JS content type (as we had before the changes discussed in this thread)

joelweinberger commented 9 years ago

As a side note, someone incorrectly stated that CSP is always "opt-out". This is incorrect. Off the top of my head, the sandbox directive is opt-in, and there are other directives being considered that would be opt-in. Thus, I think that certainly should not be a blocking factor here.

jakearchibald commented 9 years ago

@metromoxie my wording isn't quite right. I mean CSP applies restrictions, rather than making things less restricted than the default. Or is that likely to change too?

annevk commented 9 years ago

@jakearchibald I think that might not be the case for the referrer directive.

jakearchibald commented 9 years ago

Hmm, I think you're right.

jakearchibald commented 9 years ago

Been chatting it through with @slightlyoff and have a proposal/compromise:

SW must have JS content type (as currently specced)
By default, your maximum scope is determined by the location of your service worker script, max scope is new URL('./', scriptURL) (this is what we've been calling the path method in this thread)
The script may be served with the header Service-Worker-Allowed: /scope/goes/here/ to explicitly set the maximum scope the worker may be registered for, allowing you more freedom in where your script is hosted (a script in /static/my/scripts/sw.js may be served with Service-Worker-Allowed: / to allow full-origin control)

The header means /~username/ hosts that allow setting headers on resources (via htaccess, PHP, whatever) allow a user to take over more URL space with ServiceWorker than they can without it (eg, the whole origin), but hosts that let people do this aren't secure to begin with.
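The proposal above can be sketched as follows. The default uses `new URL('./', scriptURL)` as stated in the proposal; resolving the header value against the script URL and comparing by path prefix are assumptions made for this illustration, and `maxScopeFor` and `canRegister` are hypothetical names.

```javascript
// Sketch only: maximum scope under the compromise proposal.
// With no header, the max scope is the script's directory; with
// Service-Worker-Allowed, the header value replaces that default.
function maxScopeFor(scriptURL, serviceWorkerAllowed) {
  const script = new URL(scriptURL);
  if (serviceWorkerAllowed != null) {
    // Assumption: the header value is resolved relative to the script URL.
    return new URL(serviceWorkerAllowed, script);
  }
  return new URL('./', script);
}

function canRegister(scriptURL, scope, serviceWorkerAllowed) {
  const max = maxScopeFor(scriptURL, serviceWorkerAllowed);
  const requested = new URL(scope, scriptURL);
  return requested.origin === max.origin &&
         requested.pathname.startsWith(max.pathname);
}

const sw = 'https://example.com/static/my/scripts/sw.js';
console.log(canRegister(sw, '/'));               // false
console.log(canRegister(sw, '/', '/'));          // true
console.log(canRegister(sw, '/shop/', '/app/')); // false
```

The header only widens (or moves) the ceiling; the page still chooses the actual scope at registration time, and it has to fall under that ceiling.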

slightlyoff commented 9 years ago

+1. Let's get this done.

dominiccooney commented 9 years ago

Implementer feedback: Blink can implement this. We will do the path restriction first and then add the header processing to relax the restriction.

mnot commented 9 years ago

My .02 -

Requiring JS content type is good. A special type doesn't do it; some multi-author hosts allow .htaccess.

The path method seems reasonable; AIUI a script at /foo/bar.js won't be able to intercept requests to /bar.

The Service-Worker-Allowed header means that anyone that has CGI or .htaccess (i.e., the ability to set a header) will be able to take over the entire origin. Some people are still going to find this surprising; despite the capabilities that being able to set headers already gives you, this is a lot more powerful.

I question whether the granularity of Service-Worker-Allowed is necessary. I think the same use cases (sites that don't want to be constrained by the path method) could be met without the risk by having a well-known URL that indicates the site opts out of the path restriction; e.g., /.well-known/serviceworker-open-scope or some such. Multi-author sites have a much better chance of controlling one URL path that isn't already delegated to an author than they do of controlling all response headers.
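One way to picture the well-known URL alternative is the sketch below. The path `/.well-known/serviceworker-open-scope` is the example from the comment above; treating any 2xx response as "the site opts out of the path restriction" is an assumption, as are the helper names. The network request itself is left out so the logic stays a pure function.

```javascript
// Sketch only: site-wide opt-out via a single well-known URL.
const OPEN_SCOPE_PATH = '/.well-known/serviceworker-open-scope';

// URL a browser would probe for the origin that serves the script.
function openScopeProbeURL(scriptURL) {
  return new URL(OPEN_SCOPE_PATH, scriptURL).href;
}

// Assumption: a 2xx status on the probe means the whole origin opts out,
// so the maximum scope becomes "/"; otherwise the path rule applies.
function maxScopeWithSiteOptOut(scriptURL, probeStatus) {
  const optedOut = probeStatus >= 200 && probeStatus < 300;
  return new URL(optedOut ? '/' : './', scriptURL).href;
}

const sw = 'https://example.edu/~alice/sw.js';
console.log(openScopeProbeURL(sw));
console.log(maxScopeWithSiteOptOut(sw, 404));
console.log(maxScopeWithSiteOptOut(sw, 200));
```

The design difference from a per-response header is that the decision lives at one URL, which a multi-author host can keep out of individual authors' hands.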

mfalken commented 9 years ago

Should the path restriction also apply to unregister() and getRegistrations()?

jakearchibald commented 9 years ago

Nah, no restrictions on getting registrations, registering for things like push messages on those registrations, unregistering them.

So yeah, I at /~jakearchibald/ could get hold of and unregister the SW for /~timothysprocket/. I could even poison their cache and effectively "take over" their site if they're built cache-first, but they need to be using a ServiceWorker first. I can also already drop/poison their databases so meh.

Security between parts of an origin isn't a thing. We've gone (rightly) above and beyond to cut some slack to legacy sites that think it is a thing, but sooner or later they'll get caught out by storage poisoning, replaceState url spoofing, XSS (well SS, because it isn't X) etc.

annevk commented 9 years ago

In #445 we are discussing obsoleting the concept of scopes in favor of just origins (and suborigins in the future). At that point header opt-in and/or CSP for opt-out/in seems like our best bet.

w3c / ServiceWorker

Multi-author sites #468