I hope this is a reasonable place to comment. (If not please tell me where to go.)
I've been working on content addressing systems for several years. I understand that content addresses, which are "locationless," are inherently in conflict with the same-origin policy, which is location-based.
An additional/alternate solution is for a list of acceptable hashes to be published by the server at a well-known location.
For example, the user agent could request https://example.com/.well-known/sri-list, which would return a plain text file with a list of acceptable hashes, one per line. Hashes on this list would be treated as if they were hosted by the server itself, and thus could be fetched from a shared cache while being treated, for all intents and purposes, as if they had been fetched from the server in question.
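For concreteness, a hypothetical sri-list served from https://example.com/.well-known/sri-list might look like the following. The one-hash-per-line format is just this proposal's suggestion, and the hash values are placeholders:

    sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
    sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC
    sha512-z4PhNX7vuL3xVChQ1m2AB9Yg5AULVxXcg/SpIdNs6c5H0NE8XYXysP+DGNKHfuwvY7kxvUdBeoGlODJ6+SfaPg==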
This does add some complexity both for user agents and for site admins. On the other hand, the security implications are well understood, and wouldn't require new permission logic.
Thanks for your work on SRI.
An interesting idea (although I know many folks who are vehemently against well-known location solutions, but I won't pretend to fully grasp why). If implemented, though, it would still require a round trip to get .well-known/sri-list, right? Which seems to lose a lot of the benefit of these acting as libraries.
Another suggestion, that I think I heard somewhere, is, if the page includes a CSP, only use an x-origin cache for an integrity attribute resource if the CSP includes the integrity value in the script-hash whitelist. I think this would address @mozfreddyb's concerns listed in Synzvato/decentraleyes#26, but I haven't thought too hard about it. On the other hand, it also starts to look really weird and complicated :-/
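For illustration, the gating might look something like this (a sketch of that suggestion, not specified behaviour anywhere): the user agent would only satisfy the request from a cross-origin shared cache if the page's CSP lists the same hash that appears in the integrity attribute.

    Content-Security-Policy: script-src 'self' 'sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC'

    <script src="https://cdn.example.com/jquery.min.js"
            integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"
            crossorigin="anonymous"></script>

If the integrity value were absent from the CSP hash list, the resource would simply be fetched from the network as it is today.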
Also, these solutions don't address timing attacks with x-origin caches. Although, as a side note, someone recently pointed out to me that history timing attacks in this case are probably not too concerning from a security perspective since it's a "one-shot" timing attack. That is, the resource is definitively loaded after the attack happens, so you can't attempt the timing again, and that makes the timing attack much more difficult to pull off, since timing attacks usually rely on repeated measurement.
Using a script-hash whitelist in the HTTP headers (as part of CSP or separately) is better for a small number of hashes, since it doesn't require an extra round trip. Using a well-known list is better for a large number of hashes, since it can be cached for a long time.
I agree that well-known locations are ugly. Although it works for /robots.txt and /favicon.ico, there is a high cost for introducing new ones.
The privacy problem is worse than timing attacks: if you control the server, you can tell that no request is ever made. This seems insurmountable for cross-origin caching.
Perhaps the gulf between hashes and locations is too large to span. For true content-addressing systems (like what I'm working on), my preference is to treat all hashes as a single origin (so they can't reference or be referenced by location-based resources).
Thanks for your quick reply!
I'd be slightly more interested in blessing the hashes for cross-origin caches by mentioning them in the CSP. .well-known would add another round trip, and I'm not sure whether that would hamper the performance benefit we wanted in the first place.
The idea to separate hashed resources into their own origin is interesting, but I don't feel comfortable drilling holes that deep into the existing weirdness of origins.
To be clear, giving hashes their own origin only makes sense if you are loading top-level resources by hash. In that case, you can give access to all other hashes, but prohibit access to ordinary URLs. But that is a long way off for any web browsers and far from the scope of SRI.
For the record, @hillbrad wrote a great document outlining the privacy and security risks of shared caching: https://hillbrad.github.io/sri-addressable-caching/sri-addressable-caching.html
That document doesn't appear to consider an opt-in approach. While this would reduce the number of people who do it, it could be quite useful.
<script src=jquery.js integrity="..." public/>
This tag should only be put on scripts for which timing is not an issue. Of course, deciding what is public is now the responsibility of the website. However, since the benefit would be negligible for anything that is site-specific, this might be pretty clear. For example, a script specific to my site has a single URL anyway, so I may as well not mark it public; otherwise malicious sites could figure out who has been to my site recently even though I get no benefit from the content-addressed cache. If I am including jQuery, on the other hand, there will be a benefit, because there are many different copies on the internet, and at the same time knowing whether a user has jQuery in their cache is much less identifying.
That being said, if FF had a way to turn this on now I would enable it; I don't see the privacy hit as being large, and the performance would be nice to have.
If I want to use the presence of my script in a shared cache to track you illicitly, I will deliberately set the public flag, even if the content isn't actually public.
On 21/12/16 01:07, Brad Hill wrote:
If I want to use the presence of my script in a shared cache to track you illicitly, I will deliberately set the public flag, even if the content isn't actually public.
If you want to track me and you control both origins you want to track me from you can just use the same URL and you get a cookie which is better tracking and works today.
This is about preventing a third-party site from having a script with the same hash as, for example, a script on Facebook, and then being able to tell if you have been to Facebook "recently". However, since Facebook hosts the script, they won't set it as "public", and so it won't be a problem.
I don't understand what threat you are trying to protect against.
A "public" flag seems like a good solution to me. It seems to encapsulate both the benefits and the drawbacks of shared caching. It says, "yes, you can share files publicly, but that means anyone can see them."
That said, if it's opt-in, there's the question of how many sites would actually use it, and whether it's worth the trouble. Especially if it has to be set in HTML, rather than say by CDNs automatically. Maybe it would work better as an HTTP header?
Setting it in the HTML doesn't seem to be a big problem. If large CDN providers include this in their example script/style tags, then sites will copy and paste support for it. A similar approach is currently being used for SRI itself, and although adoption isn't as fast as I'd like, usage will slowly grow. Sites that are also looking for those extra performance boosts would be keen to implement it.
The idea of a public header (or even another key in Cache-Control) sounds quite interesting and elegant. However, I think it would make this more difficult to use, since one significant use case is to let each site point to its own copy of a script rather than a centrally hosted one. That means each site would have to add headers to some of its scripts, rather than just make a modification in HTML. Neither is a huge barrier, but static site hosting often makes it difficult to set headers, especially for a subset of paths.
At the end of the day I have no major objections to either option, though.
@kevincox Yes, I was suspecting that Cache-Control: public might be appropriate. It seems like the HTTP concept of a "shared cache" is fundamentally equivalent to SRI shared caching. See here for the definitions of public and private: https://tools.ietf.org/html/rfc7234#section-5.2.2.5
The Cache-Control security concerns (cache poisoning, accidentally caching sensitive information) are prevented by hashing. The only remaining security consideration is information leaks, which Cache-Control: public seems to address.
I'm not opposed to using an HTML attribute instead, but I think it's good to reuse existing mechanisms when they fit. Caching has traditionally been controlled via HTTP, not HTML.
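As a sketch, an origin opting in over HTTP might serve something like the following. The idea that Cache-Control: public would also opt the response into an SRI shared cache is this thread's speculation, not current HTTP semantics:

    HTTP/1.1 200 OK
    Content-Type: application/javascript
    Cache-Control: public, max-age=31536000, immutable

The user agent would still only reuse the cached bytes cross-origin when the embedding page supplies a matching integrity attribute.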
There are a few other ways to break this down. For example, should this also apply to non-HTTP (file:, data:, ftp:, etc.) resources? (There's an argument for shared caching across protocols, which an HTTP header wouldn't really help with; on the other hand, caching doesn't make much sense for some protocols.)
I think that thinking about it in terms of "which method is easier for non-expert webmasters to deploy?" is likely to lead to a suboptimal solution. Yes, some people don't know how to set HTTP headers, and some hosts don't let users set them, but in that case they are already stuck with limited caching options. Unless we're going to expose all of Cache-Control via HTML.
@btrask A website highly concerned about privacy and loading <script src='/uncommon-datepicker.jquery.js' integrity="sha....." /> will want to make sure that uncommon-datepicker.jquery.js is never loaded from the shared cache. Whether the shared cache should be used or not is to be controlled by the website using the resource and not by the server who first delivered the resource.
@brillout: Yes, good point. Using a mechanism not in the page source defeats the purpose, when the page source is the only trusted information. Thanks for the tip!
@metromoxie @mozfreddyb @kevincox @ScottHelme
Are we missing any pieces?
The two concerns are privacy and CSP.
Solution to privacy: we can make the shared cache opt-in via an HTML attribute. I'd say that would be enough. (If we want more protection, browsers could add a resource to the shared cache only when many domains use that resource, as described in https://hillbrad.github.io/sri-addressable-caching/sri-addressable-caching.html#solution and https://github.com/w3c/webappsec/issues/504#issuecomment-261755369.)
Solution to CSP: the UA should treat scripts loaded from the shared cache as inline scripts. (As described here: https://github.com/w3c/webappsec/issues/504#issuecomment-166458562.)
It would be super exciting to be able to use a bunch of web components using different frontend frameworks behind the web component curtain: a date picker using Angular, an infinite scroll using React, and a video player using Vue. This is currently prohibitive KB-wise, but a shared cache would allow it.
And with WebAssembly, library sizes will get bigger, increasing the need for such a shared cache.
@nomeata Funny to see you on this thread, the world is small
An opt-in privacy leak isn't a great feature to have.
How about opt-in + a resource is added to the shared cache only after the resource has been loaded by several domains?
I don't think that really helps as the attacker can purchase two domains quite easily.
Yes, it can't be n domains where n is predefined. But making n probabilistic makes it considerably more difficult for an attack to be successful. (E.g. the last comment at https://github.com/w3c/webappsec/issues/504#issuecomment-166458562.)
CSP has (is getting?) a nonce-based approach. IIUC the concern with CSP is that an attacker would be able to inject a script that loaded an outdated/insecure library through the cache, thus bypassing controls based on origin. However requiring nonces for SRI-based caching seems to solve this issue as the attacker wouldn't know the nonce; it also creates a performance incentive for websites to move to nonces, which are more secure than domain whitelists for the same reason[1].
I think it's possible that we could solve the privacy problem by requiring a certain number of domains to reference the script... it'd be really useful to have some metrics from browser telemetry here. For example if we determined that enough users encountered e.g. a reference to jQuery in >100 domains for that to be the minimum, it might be that we could load things from an SRI cache if they had been encountered in 100+ distinct top-level document domains (i.e. domains the user explicitly browsed to, not that were loaded in a frame or something). The idea being that because of the top-level document requirement, the attacker would have to socially engineer the user into visiting 100 domains, which would be very, very difficult. However if telemetry told us that 100 is too high a number and it's actually more like 20 for a particular jQuery version, that'd be a different story.
[1]: consider e.g. being able to load an insecure Angular version from the Google CDN because the site loaded jQuery from the Google CDN
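A rough sketch of the browser-side bookkeeping that threshold idea implies; the threshold value, data structures, and function names below are all hypothetical, not part of any spec:

    // Hypothetical sketch: only serve a hash from the shared cache once it has
    // been seen referenced from "enough" distinct top-level sites the user
    // actually navigated to.
    const MIN_TOP_LEVEL_SITES = 100; // telemetry would be needed to pick a real value

    // integrity hash -> set of top-level sites that referenced it
    const seenOn = new Map<string, Set<string>>();

    function recordReference(integrity: string, topLevelSite: string): void {
      let sites = seenOn.get(integrity);
      if (!sites) {
        sites = new Set<string>();
        seenOn.set(integrity, sites);
      }
      sites.add(topLevelSite); // only counts top-level documents, not framed loads
    }

    function mayUseSharedCache(integrity: string): boolean {
      const sites = seenOn.get(integrity);
      return sites !== undefined && sites.size >= MIN_TOP_LEVEL_SITES;
    }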
For example, the user agent could request https://example.com/.well-known/sri-list, which would return a plain text file with a list of acceptable hashes, one per line.
For some domains that file could be too large and change too often. Consider Tumblr's image hosting (##.media.tumblr.com) where each of the domain names host billions of files and the list changes every second.
How about something similar to HTTP ETag but with a client-specified hash algorithm. If the hash is correct you only get a response affirming as much instead of the entire file, which the browser can cache. It doesn't save you the round trip but it saves you the data.
RFC 3230: Instance Digests in HTTP defines a Digest header and a Want-Digest header that work exactly this way... or were meant to.
This would get the 304 Not Modified style of responses, but it's still limited to a single URL check.
Maybe it (or something like it), coupled with the Immutable header, could be used to populate some amount of caching or "permanence," but the model is still about the "given name" of the object (its URL) and not about its intrinsic identification (its content hash).
Caching is one use case for these things, but the Web could also benefit from some "object permanence" where possible and appropriate.
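Roughly, an RFC 3230 exchange looks like this (values are placeholders); the 304-style "headers only if the digest matches what you already have" behaviour discussed above would be an extension on top of it, not something the RFC itself provides:

    GET /jquery.min.js HTTP/1.1
    Host: cdn.example.com
    Want-Digest: sha-256

    HTTP/1.1 200 OK
    Content-Type: application/javascript
    Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=
    Cache-Control: max-age=31536000

    ...response body...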
I don't see the benefit from Want-Digest. If the client has a whitelisted digest and the content backing it, why bother the server? There are three possible responses, and this would mean waiting around for a response that can only make the situation worse.
However if telemetry told us that 100 is too high a number and it's actually more like 20 for a particular jQuery version, that'd be a different story.
Even if 100 is too high a number today, the load time advantages of using a popular version of the library could quickly push the usage of a specific version over that limit. Browser telemetry today might not be representative of the situation after shared caching has been rolled out.
The discussion so far seems to assume JS libraries e.g. jQuery as the canonical use case.
I'd like to add web fonts as another use case of widely shared large subresources which could benefit from cross-domain cache.
I'd think the security risks for fonts are milder, though the privacy implications might be similar. I'm talking about font files themselves, not CSS — font CSS is small, and malicious CSS is dangerous.
Note that CSS does not yet support SRI at all on font file urls: https://github.com/w3c/webappsec-subresource-integrity/issues/40, https://github.com/w3c/webappsec/issues/306 Note also that in practice optimized font delivery varies by browser, for example Google Fonts doesn't want to support SRI: https://github.com/google/fonts/issues/473. (This is not a blocker for hashing & sharing, just a tradeoff...)
Couldn’t you just embed the font as data-uri in the CSS? With shared caching that would be efficient.
On 03/07/2019 04:23 PM, Arne Babenhauserheide wrote:
Couldn’t you just embed the font as data-uri in the CSS? With shared caching that would be efficient.
This doesn't help at all when the user first downloads your CSS, as they get the whole font. With shared caching they can avoid redownloading the font if they already have it, for example because that font was used on another site or because they visited your site with a previous version of the CSS.
It helps on the second access. By externalizing the font loading to a self-contained CSS file with SRI-secured shared caching, the download could then be cached across multiple sites.
Yes, it would not be as good as specifying the integrity tag directly on the font, but the same is true for images and other resources, so I don’t see this as a blocker.
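A minimal sketch of that approach, assuming a hypothetical self-contained fonts.css; URLs, hash, and font data are placeholders:

    <link rel="stylesheet" href="https://fonts.example.com/fonts.css"
          integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"
          crossorigin="anonymous">

    /* fonts.css: the font bytes are inlined, so the stylesheet's hash covers them */
    @font-face {
      font-family: "Example Sans";
      src: url("data:font/woff2;base64,d09GMgABAAAA...") format("woff2");
    }

Because the stylesheet is self-contained, a shared cache keyed on its hash would effectively share the font as well, without needing SRI support on font URLs inside CSS.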
I had been thinking about shared caches for several months, googling proposals and not finding anything before this. I am extremely excited about shared caches and the opportunities they enable; for example, they would help TC39 identify more in-demand libraries to include in the standard library.
I have an idea about whitelisting hashes: there are cryptographic accumulators (https://en.wikipedia.org/wiki/Accumulator_(cryptography)), so you could pass a single accumulator of all integrity hashes in the CSP header, saving bandwidth. I basically found it because I had an instinct that something like this should be possible, which led me to https://crypto.stackexchange.com/questions/22410/hash-which-can-be-used-to-verify-one-of-multiple-inputs, but I'm not sure it's 100% applicable.
If you want to track me and you control both origins you want to track me from you can just use the same URL and you get a cookie which is better tracking and works today.
This was reasonable in 2016, but it's different now: browsers are partitioning caches (Safari has launched this; Chrome and Firefox are in progress), browsers are reconsidering third-party cookies, and what we're talking about here could allow new ways of cross-site tracking.
I'm afraid Jeff is right. He even wrote a good summarizing blog post about the potential deprecation of shared caching.
While I don't know the timeline for this change, it seems rather unlikely that we can ever consider a shared cache :confused:
But what about caching artifacts that are explicitly marked as "public artifacts" by the website owner?
Website owners do not get to decide over end user privacy.
In my opinion, they already do. All the Googles and Facebooks already have all the user data and sell it to third parties. They also have all the engineering power and a monopoly on user attention, so a shared cache wouldn't really benefit them. But a shared cache would enable small websites, indie game devs, and others to leverage caching and build more ambitious websites, which should benefit the end user. To be fair, though, I have only a vague idea of what the cache-hit rate would be.
Would this also apply with a standard source of allowed artifacts shipped with the browser? The release cadence of Firefox is 4 weeks now, so the delay until a resource is in the allowed list would be short.
An obvious first candidate would be jQuery. Others with larger gains would be frameworks like vue.js.
Developers who use npm/babel already specify dependencies by remote repositories. By vetting these remote repositories, browsers could give web developers an incentive to use integrity hashes: if they add an integrity hash to a whitelisted library, the library does not have to be downloaded (and could even be stored as byte-code in the browser), so latency is reduced.
To remove the remaining privacy implications, the browsers could specify canonical URLs to retrieve the libraries (you already trust your browser not to track you, so this does not add another leak). Then those libraries would never be downloaded and you could not track users by what they already accessed.
Whether the shared cache should be used or not is to be controlled by the website using the resource and not by the server who first delivered the resource.
Reading this again after two years, I disagree. Whether a shared cache should be used should be controlled by the browser, not by the server that first delivered the resource and not by the website that requires it. The website could only make sure that the library is loaded from the site first, by providing its own version with its own integrity hash.
@ArneBab bundling-with-the-browser is the only privacy-preserving option I've seen, but it has its own share of issues, such as favoring incumbents, who gets to be bundled, etc.
@annevk while these issues exist, they are not much different from the question of which emerging standards a browser implements — but without the compatibility problems, because the cost of not being supported would just be somewhat longer load times on the first load.
However, you do not have to bundle: the requirement for privacy is either bundled-with-the-browser or downloaded-from-a-browser-defined-canonical-URL. To prevent tracking by correlating which libraries are requested from the canonical URLs or from the websites, you can, for each library, choose at random between the canonical URL and the website. That massively decreases the number of reliably detectable distinct combinations of libraries.
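A tiny sketch of that random source selection; the 50/50 split, the function names, and the existence of a browser-maintained canonical-URL list are all assumptions for illustration:

    // Hypothetical: the browser keeps a list of vetted libraries, keyed by
    // integrity hash, each with a browser-defined canonical URL.
    const canonicalUrls = new Map<string, string>();

    // For each library, randomly fetch from either the canonical URL or the
    // site-provided URL, so neither endpoint sees a stable combination of requests.
    function chooseFetchUrl(integrity: string, siteUrl: string): string {
      const canonical = canonicalUrls.get(integrity);
      if (canonical === undefined) return siteUrl;       // not vetted: normal fetch
      return Math.random() < 0.5 ? canonical : siteUrl;  // assumed 50/50 split
    }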
I don't think that works if a user of a given group visits only one (sensitive) site with a given library. I also suspect sites would not want their performance to be variable.
From a privacy perspective, could we make it so that the resource is loaded from each origin at least once (if for no other reason than to verify that the SRI hash is valid)? The browser could still then only cache one instance of it (and re-use whatever compilation cache etc. it deems relevant), but only store that information once (and, with various weightings etc., the file may persist in the cache for longer).
This removes some of the benefit that user agents could get from a "first load" perspective, but solves the privacy issue and keeps some of the other benefits.
As a side note, this could actually be implemented without the use of SRI hashes. If the browser links together identical files based on contents (eg stored against a hash), then it could perform this kind of optimisation irrespective of whether the website declares SRI hashes.
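A sketch of that dedup-after-per-origin-fetch idea; the structure is entirely hypothetical and real browser caches are far more involved:

    // Hypothetical: store the bytes (and any compiled artifacts) once per content
    // hash, but only let a partition (top-level site) reuse them after it has
    // fetched and verified the resource from its own origin at least once.
    const blobsByHash = new Map<string, Uint8Array>();
    const partitionsVerified = new Map<string, Set<string>>(); // hash -> partitions

    function onFetched(hash: string, partition: string, body: Uint8Array): void {
      if (!blobsByHash.has(hash)) blobsByHash.set(hash, body); // dedup storage
      let parts = partitionsVerified.get(hash);
      if (!parts) {
        parts = new Set<string>();
        partitionsVerified.set(hash, parts);
      }
      parts.add(partition);
    }

    function cachedFor(hash: string, partition: string): Uint8Array | undefined {
      // No cross-site "first load" win: each partition must have fetched it once.
      return partitionsVerified.get(hash)?.has(partition)
        ? blobsByHash.get(hash)
        : undefined;
    }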
@MatthewSteeples which benefits remain? If the browser only downloads but skips compilation, the privacy problems resurface via timing attacks.
@ArneBab while theoretically possible, we're talking about a one-shot attempt to time how long it took the browser to compile something. You couldn't do repeated measurements to benchmark the speed of the device, or know what else was happening at the same time, so I'm not sure how reliable the numbers would be unless you're targeting a significantly large JS file. Would the same be true for CSS files?
If it's still too much of a privacy risk, you could still have the battery benefit by just sleeping for as long as the compilation took last time.
@MatthewSteeples they could provide other files with intentional changes to benchmark the browser during the access, and sleeping can be detected, because it can speed up the other compiles.
So you don’t really win much in exchange for giving up the benefits of not accessing the site at all. For CSS files this is true, too. As an example you can take this page with minimal resources which shows significant parse-time in Firefox.
But it would be possible to provide real privacy with a browser-provided whitelist and canonical URLs. That keeps the benefit of already having the file locally most of the time.
So the core question is: if you download (and compile, because otherwise this is detectable), even though you have the file locally, which benefits remain? Are there benefits that remain?
A shared cache does definitely bring a lot of advantages (faster sites, less data usage for the user, less network usage for the ISPs, browsers could cache the compiled/interpreted files, etc).
From what I read in this thread, the main pushback is the privacy concern that a specific user could be tracked by checking whether they have a specific file cached or not, meaning that one can tell whether the user previously visited a site (or the same site) that included the same file.
The solutions I see for the privacy concerns:
I think that the shared cache is a lot better from a privacy point of view than including the resources from a 3rd party domain. So, although it allows some sort of tracking, it is still a step forward from just having all the websites linking to the same file on a CDN.
You're misreading. The main pushback is the security concern.
The privacy concern already exists for CDNs, and browsers are fighting it. Safari is doing it and Firefox will: resources (like CDN assets) will land in a per-top-level-website ("first party") cache, which will make the bandwidth and speed wins from a CDN void.
Safari calls it a "partitioned cache"; Firefox calls it First Party Isolation. https://github.com/whatwg/fetch/issues/904 has some standards-specific context.
I'm afraid this will never be.
@hillbrad wrote a great document outlining the privacy and security risks of shared caching: https://hillbrad.github.io/sri-addressable-caching/sri-addressable-caching.html
(@annevk asked me to unlock the conversation. I'm not too hopeful about seeing new information in this 5 year old thread.)
I don't think there is a fundamental problem here that makes this impossible?
For example, if browsers were to cache popular libraries such as React and Vue, then this wouldn't pose any problems, correct?
If we can find a technique ensuring that only popular library code is cached (instead of unique app code), then we solve the problem, right? (I'm assuming that Subresource Integrity Addressable Caching covers all known issues).
Could we maybe reopen this ticket? I'd argue that as long as we don't find a fundamental blocker, then having a cross-origin shared cache is still open for consideration.
The benefits would be huge... it seems very well worth it to further dig.
(I'm assuming that Subresource Integrity Addressable Caching covers all known issues).
The other comment just before my last one has a newish - imho fundamental - blocker. Browsers are already partitioning their cache per top-level site (eventually more granular, maybe per origin or per frame tree).
This issue just turned 7 years old. I'll leave this issue closed because nobody has managed to come up with an idea since.
New issues are cheap. I'm still happy to discuss new and specific proposals - I just currently do not believe those to exist.
We've had a lot of discussions about using SRI for shared caching (see https://lists.w3.org/Archives/Public/public-webappsec/2015May/0095.html for example). An explicit issue was filed at w3c/webappsec#504 suggesting a sharedcache attribute to imply that shared caching is OK. We should consider leveraging SRI for more aggressive caching.