Closed xtuc closed 1 year ago
Set-Cookie
is the only such header, see also HTTP and #506.
It might make sense to expose it properly for code that wants to use this class generically, but I think we ought to allow for implementations of this class to eagerly combine headers generally and store Set-Cookie
separately as a special case. That would suggest something like h.getSetCookie()
to me.
cc @youennf @ddragana @yutakahirano
It's probably also worth investigating if https://www.npmjs.com/package/set-cookie-parser#splitcookiestringcombinedsetcookieheader is a reasonable alternative to expose. Though if that actually works HTTP would not have to consider cookies a special case...
Maybe to clarify the environment a bit, Cloudflare Workers boils down to a proxy written in JavaScript and Node.js a HTTP client.
Set-Cookie is the only such header
For such use-cases, it make sense to not filter headers in the response and let the user define its own headers. Any custom headers may contain ,
.
I disagree, HTTP doesn't guarantee intermediaries won't combine such headers so we shouldn't either. I'd want HTTP to call out more than Set-Cookie
as requiring a special data model before going there.
I'm a bit confused why we'd revisit this. Non-browser environments can consider adding nonstandard extensions for whatever purpose they wish, but in browser environments there's no need for this functionality, and our reasoning from the previous discussions seems to hold. (And the spec is written for browser environments.)
We don't forbid appending Set-Cookie
unconditionally. When it's appended, there's no way to get it back out per HTTP semantics. That mismatch irks me a bit.
We are considering adding the getAll method for Cloudflare Workers, I don't know Node.js' plans.
With a basic filtered response you can access almost all the headers. I think this behavior can still happen in a browser.
To be clear, getAll()
with the semantics I think you are suggesting would be incompatible with what we'd add here, if anything, as HTTP doesn't require preserving the distinction between
h.append(name, value)
h.append(name, value2)
and
h.append(name, `${value}, ${value2}`)
for any name other than Set-Cookie
and this API shouldn't either. See also the discussion in #189.
Non-browser environments can consider adding nonstandard extensions for whatever purpose they wish, but in browser environments there's no need for this functionality, and our reasoning from the previous discussions seems to hold. (And the spec is written for browser environments.)
In Cloudflare Workers, we'd really like to stick to standard APIs, to promote code portability across a wide range of JavaScript environments. We think it would be a great thing for the JavaScript ecosystem as a whole if common functionality like making HTTP requests actually worked the same everywhere, so it would be nice to see the fetch()
standard as applying to more than just browsers.
We have an immediate need to support manipulation of Set-Cookie in Cloudflare Workers. Perhaps ironically, this is actually driven by a request from @rowan-m on the Chrome team related to the default SameSite=Lax change. A lot of people are going to need to work around this soon and doing it in CF Workers will likely be easier for many people than updating their backend code.
We are proceeding with implementing getAll()
because although it is deprecated, the past semantics cover our needs, and we're more comfortable implementing something that has been specified in the past than something that has never been specified. The thing we really want to avoid is specifying a new API that turns out incompatible with implementations from other vendors, fragmenting the ecosystem. So while getSetCookie()
might be better, getAll()
has the advantage that we're pretty confident no one is going to implement it with different semantics. (Yes, I do understand that various platforms may make different decisions about when to automatically combine headers other than Set-Cookie, but I'm comfortable with that.)
I think that's a really unfortunate decision. I think you'd be better served by clearly signposting nonstandard APIs (e.g. using a Cloudflare global), instead of assuming that things which existed briefly in the spec but were never implemented anywhere have some sort of semi-standard semantics.
I think you'd be better served by clearly signposting nonstandard APIs (e.g. using a Cloudflare global)
I thought vendor prefixes were no longer in favor?
existed briefly in the spec but were never implemented anywhere
Eh? Seems like it was implemented by almost every major browser at one point?
getAll()
is not deprecated, it's removed, because its design is broken as explained here and in linked issues.
@annevk Yes, I understand the issue -- for headers other than Set-Cookie, the header values may be comma-concatenated anyway by any intermediate proxy, so getAll()
can't keep its promise.
Do you have a recommendation for how we should proceed, in order to both solve the immediate problem and stay standards-compatible long-term?
Again, not just by any intermediate proxy, by an implementation of the Headers
class as well, as at least one implementer strongly argued for.
getSetCookie()
still seems like a reasonable solution. @mnot thoughts?
getSetCookie()
sounds good to me now. Could we consider adding it to the browser fetch spec?
What about getAll()
, but it is defined to throw an exception if passed any other header name than "Set-Cookie"?
This has the advantage that if ever there comes along another such header (probably, a non-standard header used by some badly-designed HTTP API that one of our customers really really needs to support), we can easily accommodate...
@youennf @yutakahirano @ddragana @whatwg/http thoughts?
I looks like we cannot read Set-Cookie if we once have set it, therefore I am for adding some function for this. getSetCookie sound good. I do not like getAll, because it sound like you can use it for every header field, but you cannot. If we want to make it generic getUnmergableHeader or something, but that is only for one header and it is unnecessary, so getSetCookie its fine.
cross-posting a suggestion from @nfriedly on #506 - use https://www.npmjs.com/package/set-cookie-parser#usage-in-react-native to parse the mangled cookie header
The fact that this inconsistency is still around is mind boggling to me
For future reference, this is how we solve this in Deno's fetch implementation:
Set-Cookie
headers does not concatenate headers, it will keep them seperate (both in the internal data structure, and on the wire)h.get("Set-Cookie")
returns a concatentated list of Set-Cookie
headers.h
(or using h.entries
, h.values
, or h.keys
) will result concatentated headers, as specified except for set-cookie
headers. These are returned as is, without concatenation. That means it is possible that h.entries
returns multiple items where the header name is set-cookie
.This was the solution of us trying to come up with a solution for multiple years. We deem this breaks the least existing code, while also being relatively easy to understand and use.
For the record, Cloudflare Workers went ahead with what I suggested in my last comment -- getAll()
, but it is only allowed for the specific header name Set-Cookie
.
I think what we should define here is a combination of https://github.com/whatwg/fetch/issues/973#issuecomment-561052450 and https://github.com/whatwg/fetch/issues/973#issuecomment-902578584. I.e., getSetCookie()
which returns a sequence and the iterator which is special cased for Set-Cookie
headers. The constructor already works. The internal data structure also already works.
Firefox is still supportive of adding this. Primarily to ensure that this data structure is actually suitable for all possible use cases.
@youennf @domenic thoughts?
I don't have any real thoughts here. The special iterator behavior seems a bit weird but I'm happy to believe https://github.com/whatwg/fetch/issues/973#issuecomment-902578584 when they say that they tried many things and found that to be the best.
I suspect @jasnell would also be interested in this issue.
Thanks @domenic ... definitely interested. I'll need to review a bit of the history before I can weigh in tho. I appreciate the heads up.
I am not against a dedicated getter within the Headers class. That said, I'd like to mention the potential complexity for implementors: Cache API as well as exposing a request to service worker fetch event handler might need specific handling to preserve Set-Cookie headers, depending on the actual underlying implementation. It is unclear to me how valuable an addition this is to the web platform (are we talking about setting Set-Cookie to request headers?).
That's a very good point. One potential solution might be to add Set-Cookie
to the forbidden header names, but this would have to be tested to see if it's web compatible. I agree that it's not desirable to have Request
or Response
instances that can contain Set-Cookie
headers in their Header
objects.
I agree that it's not desirable to have Request or Response instances that can contain Set-Cookie headers in their Header objects.
But only on the web! In server side environments like Deno and Cloudflare Workers, Response
objects are used to represent outgoing HTTP responses from the server to the client. These must be able to contain set-cookie
headers. On the web Response
objects containing a set-cookie
headers can already not be created, because Response
's headers use a "response"
header guard, which forbids "set-cookie"
headers. Both Deno and CFW work around this by not considering set-cookie
a forbidden response-header name.
For Request
objects I have no interest or opinion on set-cookie
headers being allowed. They are basically useless, and I don't think anyone would be hurt if the header were to be blocked for requests.
I would be strictly opposed to any changes to the header representation in the spec (even just in Response
objects) that would that would cause set-cookie
headers to be unsupportable by otherwise conformant implementations.
It is unclear to me how valuable an addition this is to the web platform.
There is no inherent value to the web browsers specifically: set-cookie
headers on responses are never exposed in the web platform, so users can not manually use these. The value lies in broader ecosystem compatibility and uniformity. Web browsers are just one of the implementors of the fetch spec: other JS runtimes use this API for HTTP fetches too (e.g. Deno, Cloudflare Workers, Node.js when using node-fetch
). If the API that all of these runtimes use is standardised and uniform, it becomes much easier to write code that is portable across runtimes. Reading and setting set-cookie
headers is a critical feature for server side HTTP implementations.
@lucacasonato sure, different flavor Request
/Response
objects could have a different Headers
guard. My comment was strictly about what is currently defined in Fetch.
If there's no value in this for the web, I suggest we not merge it into the spec or implement it in browsers.
@domenic I did not phrase that very elegantly. What I meant was that there is no value for the web if this change is looked at in total isolation, pretending that the Headers
class is implemented and used only in the browser (ignoring the existence of server side JS runtimes that have the Headers
class). This is obviously not the case though. Headers
are used in browsers, Deno, Cloudflare Workers, Node.js and various other server side runtimes.
It is correct that the primary users of this specific feature are going to be users of server side JS runtimes. Developers developing exclusively for the web are likely not going to use this. It is critically important however that the API behaves the same in server side runtimes, and browsers. Otherwise packages written for both will need to branch behaviour on the runtime, which is not great.
[...new Headers([["set-cookie": "foo"], ["set-cookie": "bar"]])]
should return the same result across browsers and server side runtimes. Not doing so is confusing, and will fragment the ecosystem.
That is the real value for browsers: keeping the API consistent between runtimes, and to avoid user confusion.
Thanks for the explanation. However, I still think then that the feature does not pull its weight.
@domenic What is your opinion on how server-side implementations should engage with web standards? Should we assume that whatwg is flatly uninterested in any feature that doesn't benefit browsers? I guess we should form our own working group that specifies a diff against the web standards to be applied on the server side?
@domenic Again, I agree with you if you were to look at the usage of Headers
in browsers in isolation. Consider this scenario in a "service worker @ edge" environment like Deno Deploy or Cloudflare Workers, that allow you to rewrite requests/responses in a server environment before they get forwarded onto the client. This example adds a Content-Security-Policy
header to all proxied requests:
addEventListener("fetch", (e) => {
e.respondWith(handler(e.req));
});
async function handler(req) {
const resp = await fetch(req); // forward the original request to the origin server
const headers = new Headers(resp.headers); // clone the headers object to strip off the `immutable` header guard
headers.set("Content-Security-Policy", "default-src 'none'; style-src 'self'; script-src 'self'");
return new Response(resp.body, { status: resp.status, headers });
}
On the surface it looks like this should just work for all cases. It is only 4 lines after all. The problem is that it doesn't work. If the origin server responds with some set-cookie
headers (which is pretty likely, as cookies are widely used), then the header cloning will mangle these set-cookie
headers by concatenating them. Instead of responding to the client with multiple set-cookie
headers, the client will now receive a single malformed cookie header.
With the current API as specified, there is no way to solve this. You can not just headers.get("set-cookie").split(", ")
, because ,
are valid characters in set-cookie headers.
As you can see, with this proposed change of altering the iterator, this code now works as expected. It solves a use-case which was previously not possible to solve with just features from the standard. Because of this we already have two incompatible extensions from Deno and Cloudflare that allow users to do this anyway. This is not sustainable - we need WHATWG to engage with other implementers that are not browser vendors - otherwise the ecosystem will diverge. This is not something that is in anyone's best interest IMO.
The WHATWG is working on standards for web browsers. To the extent other software wants to be compatible with web browsers, it's great to have their engagement. But if something does not sufficiently benefit web browser users then working on it elsewhere makes sense to me. It's totally up to them whether they want to be 100% compatible with a spec designed for web browsers, or add their own modifications and APIs. E.g. from what I understand most non-browser fetch implementations already ignore the CORS parts of the fetch spec and API. And they have non-WHATWG APIs for things like TCP/UDP. I don't know if those non-WHATWG APIs are standardized anywhere but I agree doing so would have sustainability benefits for server-side ecosystems avoiding divergence, if you are up for the extra investment.
Note that other browser engineers assessment of whether a proposal sufficiently benefits web browser users may be different than mine for any given proposal.
A key challenge is that end users don't really care whether the code is running in the browser, on the edge, or in a server, they would like the APIs to be consistent and for us not to have different APIs for the same things in different environments. So, while I definitely understand the position and WHATWG's mission, where there is known existing overlap in functionality/API between these different environments, it would be great to avoid diverging APIs. If necessary to make progress, we can get Node.js, Deno, Workers, and other backend/edge environments together to collaborate on these things but it would also be great to recognize that "the web" is not limited to just web browsers.
Yeah, I think this is worth doing, especially for iteration as it seems rather bad if that behaves differently depending on where you run the code. I also saw an argument by @domenic that Headers
is a generic data structure and therefore should have [Exposed=*]
which I think also argues for handling Set-Cookie
correctly.
Having said that, we need a better solution when Headers
's guard is "request
" as requiring a more complicated data structure in that case does not carry its weight I think, especially given that Set-Cookie
is a response header. The simplest solution would be making Set-Cookie
a forbidden header name, but that probably requires experimenting whether that breaks any unusual code out there.
that probably requires experimenting whether that breaks any unusual code out there.
It would make sense to do that experiment before proceeding with fetch spec changes.
Or getSetCookie API could be defined for non-browser-env only (say partial interface with [Exposed=non-browser-env]) in an extension spec, until we know it is safe to make Set-Cookie a forbidden header name on web environments. https://github.com/whatwg/fetch/pull/1346 could be split in two: infra part remaining in fetch spec, API going in an extension spec.
What is the conclusion here? I'd be happy to put in the work to get this landed, but I am not sure exactly what to do. My current understanding:
set-cookie
in Headers
objects.set-cookie
headers in the Cache API as the internal data structure might have to change.set-cookie
headers to be present in Request
(like is already the case in Response
) in browsers, because then this header would never be able to make it into a Request
.That suggests that the next step is to measure usage of set-cookie
headers in Headers
objects with a "request"
header guard. I expect this would mean adding a use counter in one of Chromium or Gecko. I have no idea how to do this: I tried doing it for Gecko just now, but I can not figure it out. I will also try for Chromium now. I'd appreciate any help on this front.
I have figured out how to add a UseCounter to Chromium for this. I'll be submitting a patch for it soon.
Added a UseCounter called "FetchSetCookieInRequestGuardedHeaders" to Chromium in https://chromium-review.googlesource.com/c/chromium/src/+/3423793. Data should be available here somewhat soon: https://chromestatus.com/metrics/feature/popularity#FetchSetCookieInRequestGuardedHeaders
Thanks @lucacasonato! The Cache API is a concern, but also transferring a request Headers
object to a service worker. I think your use counter addresses both of these.
(https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/collection/use-counters.html has documentation for Firefox by the way. #dom:mozilla.org on Matrix should be able to help if you have any trouble.)
Use counter results from Chromium are in: https://chromestatus.com/metrics/feature/timeline/popularity/4152
The % of pages that set a "set-cookie" header on an outbound fetch
request hovers around 0.0003%. The data indicates that two popularish sites that set a "set-cookie" header on request headers. These domains are:
For both of these sites the "set-cookie"
header is set on an outbound API request that uses fetch
. On the osgohome.com site, it is a syntactically invalid set-cookie
header.
The proposed changes in #1346 would make "set-cookie"
a forbidden header name, thus causing the cookie to be silently ignored (no explicit error is raised).
I have tested both sites using a puppeteer script using request interception that removes all outbound "set-cookie" headers (essentially what #1346 would do). Both sites continue to work just fine, and the API endpoints continue to return 200 status codes, identical data, and identical response headers.
As such I think we can safely make this change.
Am I correct that step one of what @lucacasonato proposed in https://github.com/whatwg/fetch/issues/973#issuecomment-1025227167 has now been completed and the impact of special handling set-cookie
is now low enough for this to be considered?
Was the conclusion to adopt an implementation like Deno's or to add a getSetCookie()
method to the Headers
object for this special case?
The latter, see #1346. It's currently stuck on getting implementer interest from two browsers.
FWIW, using getAll
would maintain consistency with other Web APIs such as URLSearchParams.getAll(). It's also what is being used by Cloudflare Workers as @kentonv mentioned above.
I think getSetCookie
is just fine - this comment is more to call out API consistency.
FWIW, using
getAll
would maintain consistency with other Web APIs such as URLSearchParams.getAll(). It's also what is being used by Cloudflare Workers as @kentonv mentioned above.I think
getSetCookie
is just fine - this comment is more to call out API consistency.
getAll
was removed from the specification because Webkit objected to it, since it would not be feasible for them to implement. As I see it, the only options for bringing getAll
back without them objecting would be:
Set-Cookie
. This would introduce implementation-defined behavior, which isn't a good practice for things like this. It would also be impossible to effectively communicate to developers that this difference between browsers is intentional.Set-Cookie
. This would mean the semantics of this method would be different from the previously defined getAll
method. Furthermore, if server-side implementations like Cloudflare Workers switched to it, the change might possibly lead to code breaking in ways much harder to debug than if the method was removed.In my opinion, both of these options are bad, and getSetCookie
is much better than either.
FWIW Cloudflare Workers' implementation of getAll()
throws an exception if the header name is anything other than Set-Cookie
.
However, the ship has clearly sailed here, getSetCookie
is the standard that has been written and agreed upon.
what about X-Robots-Tag for example where rules specified without a user agent are valid for all
console.log(
new Headers([
['x-robots-tag', 'noindex'],
['x-robots-tag', 'unavailable_after: 25 Jun 2099 15:00:00'],
['x-robots-tag', 'somebot: nofollow'],
['x-robots-tag', 'otherbot: noindex, nofollow'],
['x-robots-tag', 'noarchive'],
]).get('x-robots-tag')
==
'noindex, unavailable_after: 25 Jun 2099 15:00:00, somebot: nofollow, otherbot: noindex, nofollow, noarchive'
)
now is the noarchive rule for all or just for otherbot ?!
If you cannot split on comma, then X-Robots-Tag
violates HTTP.
If you cannot split on comma, then
X-Robots-Tag
violates HTTP.
where in the HTTP spec does it say that ?
Since https://github.com/whatwg/fetch/commit/42464c8c3d2fd3437a19fc6afd2438a0fd42dde8
Headers.prototype.getAll
has been deprecated/removed from the spec and implementations.I understand that in browsers (including service workers) filtered responses don't include headers that could benefit from the getAll method. However, some JavaScript environment doesn't need to filter response/request headers, like serverless plateforms (for instance Cloudflare Workers) or maybe Nodejs at some point.
For example editing Cookie headers:
Spliting the combined header value by
,
will give an invalid result. InsteadgetAll
could be used to retrieve the individual headers.