Open njsmith opened 4 years ago
@njsmith Out of curiosity, what exactly is the bug in the RFC 7230 definition, and why is the definition obviously wrong?
The spec accidentally disallows any header value that contains a single-character word inside it. For example, `this is not a valid header` would be an illegal header value, because the word `a` is only one character long.
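To make the problem concrete, here is a rough sketch that transcribes the RFC 7230 `field-content` rule literally into a regular expression. The names `VCHAR`, `RFC7230_AS_WRITTEN`, and `matches_as_written` are made up for this illustration, `obs-fold` is ignored, and this is not how h11 validates header values.

```python
import re

# Characters allowed as field-vchar: VCHAR (0x21-0x7E) plus obs-text (0x80-0xFF).
VCHAR = r"[\x21-\x7e\x80-\xff]"

# Literal transcription of the grammar as written (obs-fold left out):
#   field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
#   field-value   = *( field-content )
RFC7230_AS_WRITTEN = re.compile(rf"(?:{VCHAR}(?:[ \t]+{VCHAR})?)*")


def matches_as_written(value: str) -> bool:
    """Return True if the whole value matches the literal grammar."""
    return RFC7230_AS_WRITTEN.fullmatch(value) is not None


# Each field-content can absorb at most one run of whitespace, and it must
# both start and end with a visible character. A one-character word has
# whitespace on both sides, so it would need to end one field-content and
# start the next at the same time, which the grammar does not permit.
print(matches_as_written("this is not valid"))
print(matches_as_written("this is not a valid header"))
```

Under this transcription the first value matches and the second does not, purely because of the lone `a`.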
Header values are a mess. Supposedly they're defined by RFC 7230, but in fact it has a bug and its definition is obviously wrong. And, in practice, implementations are substantially more lax than RFC 7230, even after you fix the obvious bug.
In #57/#68, we adjusted our validation rule to allow more characters, based on some intuition and a small amount of new data (e.g. we allow `\x01`, which is used by google analytics cookies, but still disallow `\x00`).
But, it turns out that the WHAT-WG fetch spec has an actual precise definition for header values: https://fetch.spec.whatwg.org/#concept-header-value
Weird that it's here instead of in some HTTP spec, but I'll take it.
I think there are two differences between what h11 does currently and the WHAT-WG spec:
... (`\v`) and form-feed (`\f`), which are obscure line-breaking whitespace characters. They only disallow `\r` and `\n`.
We should probably switch to matching the WHAT-WG behavior exactly.
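For reference, a minimal sketch of what matching the WHAT-WG definition could look like, assuming my reading of the linked definition is right: a header value is a byte sequence with no leading or trailing tab or space, and no `0x00`, `\r`, or `\n` anywhere. The function name `is_whatwg_header_value` is hypothetical, and this is not h11's actual validation code.

```python
def is_whatwg_header_value(value: bytes) -> bool:
    """Check a byte string against the Fetch header-value conditions (as read above)."""
    # No leading or trailing HTTP tab or space bytes (0x09, 0x20).
    if value[:1] in (b" ", b"\t") or value[-1:] in (b" ", b"\t"):
        return False
    # No NUL and no HTTP newline bytes (0x0A, 0x0D) anywhere in the value.
    return not any(byte in value for byte in (0x00, 0x0A, 0x0D))


print(is_whatwg_header_value(b"a\x01b"))      # \x01 is permitted
print(is_whatwg_header_value(b"a\x0bb\x0cc")) # \v and \f are permitted
print(is_whatwg_header_value(b"a\x00b"))      # NUL is rejected
print(is_whatwg_header_value(b"a\r\nb"))      # CR and LF are rejected
```

Note that this check is a simple deny-list, so everything else, including `\v`, `\f`, and bytes above `0x7f`, passes.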