whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/

Cache-Friendly Access-Control-Allow-Origin #890

Open nigoroll opened 5 years ago

nigoroll commented 5 years ago

The CORS response header Access-Control-Allow-Origin currently allows only two possible values: the wildcard * or the value of the Origin request header. If more than one origin is allowed, a Vary: Origin response header is required for interoperability with downstream caching intermediaries and the client cache. Consequently, caches need to maintain one copy of the respective object for each possible Origin value. Note that this also multiplies the body, even if the only thing varying across objects is Access-Control-Allow-Origin. Cache multiplication implies storage overhead and lower hit rates and, consequently, worse latencies. The only option known to me for avoiding the cache multiplication is custom logic at the edge.
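To illustrate the multiplication described above, here is a minimal sketch, assuming a server that echoes allowed origins and a deliberately simplified cache that keys on the URL plus the request headers named in Vary. The names `cors_headers`, `cache_key`, `ALLOWED`, and the example origins and URL are hypothetical and only serve the illustration.

```python
def cors_headers(request_origin, allowed_origins):
    """Return the CORS-related response headers for one request (sketch)."""
    # Vary: Origin is sent on every response, because the response headers
    # depend on the Origin request header.
    headers = {"Vary": "Origin"}
    if request_origin in allowed_origins:
        # Echo the request's Origin; a static list is not possible today.
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers


def cache_key(url, request_headers, vary):
    """Simplified cache key: URL plus the request header values named in Vary."""
    names = [name.strip().lower() for name in vary.split(",")]
    return (url,) + tuple(request_headers.get(name, "") for name in names)


# Hypothetical allowlist of two origins requesting the same resource.
ALLOWED = {"https://a.example", "https://b.example"}

keys = set()
for origin in sorted(ALLOWED):
    response_headers = cors_headers(origin, ALLOWED)
    keys.add(cache_key("/font.woff2", {"origin": origin}, response_headers["Vary"]))

# One cache entry per allowed origin, although the body is identical.
print(len(keys))  # 2
```

Real caches are more involved than this, but the effect is the same: each distinct Origin value produces its own stored copy.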

To solve this issue, the value of the Access-Control-Allow-Origin response header would need to support multiple values, possibly allowing some pattern or other means to span multiple authorities.

Ideas:

Whatever extensions are added to individual components of Access-Control-Allow-Origin, a list of allowed origins or patterns would be required.

Example:

Access-Control-Allow-Origin: samesite:example.com, https://*.sub.bar.com:8443, http://bar.com

I do not cover the question of the upgrade path here, but something along the lines of the one laid out in origin-policy or this old post should work.

Thank you to @annevk and @yoavweiss for the discussion during the HTTP Workshop 2019 and for the samesite idea.

Old reference: https://lists.w3.org/Archives/Public/public-webappsec/2014Apr/0060.html

annevk commented 5 years ago

I'm not a big fan of allowing * as it requires defining a detailed parsing and processing model. And where you allow "any" label has some security implications too.

Supporting same-site makes sense to me. And perhaps allowing the listing of a registrable domain makes sense too. That would still allow a fairly simple processing model.

Allowing multiple values also makes sense to me and would help various scenarios.
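As a rough illustration of why multiple values keep the processing model simple, here is a sketch of how a comma-separated list of exact origins might be matched. This is an assumption for discussion only: no such syntax is specified, the function name `matches_origin_list` is hypothetical, and same-site matching is left out because it would need registrable-domain data.

```python
def matches_origin_list(request_origin, acao_value):
    """Hypothetical check for a comma-separated Access-Control-Allow-Origin.

    Only exact serialized origins and the lone wildcard are handled;
    there is no pattern or same-site support in this sketch.
    """
    if acao_value.strip() == "*":
        return True
    # Split on commas, as is conventional for multi-value HTTP headers.
    allowed = [item.strip() for item in acao_value.split(",")]
    return request_origin in allowed


value = "https://a.example, https://b.example:8443"
print(matches_origin_list("https://b.example:8443", value))  # True
print(matches_origin_list("https://c.example", value))       # False
```

The matching step stays a plain string comparison per list item, which is the property that a wildcard label syntax would lose.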

annevk commented 5 years ago

cc @whatwg/security

Jxck commented 5 years ago

@annevk What is the risk of allowing "*.example.com" in Access-Control-Allow-Origin? (Or why doesn't the current spec allow it?)

annevk commented 5 years ago

@Jxck the main reason is that it's much more complex than the alternative (same-site + multiple values). If you're curious as to why it's more complex, I'd encourage you to try to write a detailed processing model for it. I.e., write the parser that turns the strings you want developers to be able to write into a data model and then write the matching algorithm for that data model.

Jxck commented 5 years ago

@annevk OK. By "risk" I meant the risk of allowing subdomain origins (subdomains of https://example.com) in Access-Control-Allow-Origin rather than specifying a single origin (just https://example.com). You mean that allowing them is reasonable, but a notation such as https://*.example.com is hard to standardize (parsing and matching model, etc.). Since samesite has already done that work, it could be used instead of the https://*.example.com notation. Is that right?

annevk commented 5 years ago

Yeah, I'd prefer syntax that avoids new parsing and model requirements. I have these extensions in mind, but they require some implementer interest (which I think we have) and people to do the work (of writing the text and tests):

Jxck commented 5 years ago

@annevk A few more questions. The old CORS spec seems to allow 'origin-list-or-null' for A-C-A-O, which is a space-separated list of serialized origins: https://www.w3.org/TR/cors/#access-control-allow-origin-response-header

  1. Why wasn't this imported into the Fetch spec?
  2. Current browsers return an error for multiple origins, so it seems they can parse them and then deny the request. That made me think there is a security consideration behind rejecting multiple origins rather than a parsing-model error. If the parsing model were standardized, would this no longer be considered an error?
  3. Why do you prefer comma-separated over the space-separated style of the old CORS spec in your third choice?

annevk commented 5 years ago

That syntax was not for what you think it was for. If a user agent made a request and that request went via multiple cross-origin redirects, the user agent could list all the origins in the Origin header with that syntax. And the response to that request would have to echo the exact value. However, all user agents decided to treat that as null instead as "stack" inspection was deemed insecure. There never was a feature to allow requests from multiple origins with a single static value (other than *).
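The behavior described here can be sketched roughly as follows. This is a simplified illustration, not the Fetch Standard's algorithm text: it assumes a request without credentials, and the function name `cors_check` is hypothetical.

```python
def cors_check(request_origin, acao_value):
    """Simplified CORS check for a request made without credentials.

    request_origin: the serialized origin sent in the Origin request header.
    acao_value: the Access-Control-Allow-Origin response header value,
                or None if the header is absent.
    """
    if acao_value is None:
        return False
    if acao_value == "*":
        return True
    # Anything else must equal the request's Origin exactly, so a list of
    # origins (space- or comma-separated) never matches.
    return acao_value == request_origin


print(cors_check("https://a.example", "https://a.example"))                    # True
print(cors_check("https://a.example", "https://a.example https://b.example"))  # False
```

Under this model a multi-origin value is not rejected by a dedicated rule; it simply fails the equality comparison.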

So, to answer your questions:

  1. It was never implemented.
  2. They return an error because it doesn't match the Origin header of the request.
  3. Because space-separated meant something else and more importantly comma-separated is the standard syntax for multi-value HTTP headers.

Jxck commented 5 years ago

@annevk Thanks a lot! Now I understand the current spec correctly.

Then I think multiple allowed origins, comma-separated, together with samesite, seem good to me.