Open annevk opened 3 years ago
Ideally the way this would be specified is by sharing logic with document.cookie
, FWIW. (Something I also raised before.)
Thanks and +1. @bsittler did substantial investigation into cross-browser and API/header behavior with encoding, but I'm not sure that research was retained and was definitely not reflected into the spec.
We should audit test coverage and get coverage/spec updates.
(Sharing w/ HTML also SGTM. PRs/concrete suggestions welcome!)
FYI, here's the current behavior in Chrome:
cookie_test(async t => {
const value = 'value\u00C6'; // Æ
const utf8_value = 'value\xC3\x86'; // UTF-8 encoding
const non_utf8_value = 'value\xC6'; // ISO-8859-1 encoding
await setCookieBinaryHttp(`utf8=${utf8_value}; path=/`);
assert_equals(await getCookieBinaryHttp(), `utf8=${utf8_value}`);
// Set a non-UTF-8 cookie on the server, verify the server thinks it's there.
await setCookieBinaryHttp(`nonutf8=${non_utf8_value}; path=/`);
assert_equals(await getCookieBinaryHttp(), `utf8=${utf8_value}; nonutf8=${non_utf8_value}`);
// document.cookie vs. mixed cookie encodings
assert_equals(await getCookieStringDocument(), undefined,
'document.cookie yields nothing if non-UTF-8 cookies served');
// Cookie Store API vs. mixed cookie encodings
const cookies = await cookieStore.getAll();
assert_equals(cookies.length, 2, 'Both cookies are present by name');
assert_not_equals(
cookies.find(c => c.name === 'utf8'), undefined, 'utf8 cookie present');
assert_equals(
cookies.find(c => c.name === 'utf8').value, value,
'utf8 cookie value should be preserved');
assert_not_equals(
cookies.find(c => c.name === 'nonutf8'), undefined,
'nonutf8 cookie present');
assert_equals(
cookies.find(c => c.name === 'nonutf8').value, '',
'nonutf8 cookie value is empty');
}, 'Non-UTF-8 cookies dropped by browser and not reflected by API');
Appears to be:
document.cookie
yields nothing if any of the cookies have non-UTF-8 value (!)I should also try non-UTF-8 cookie names.
Names:
cookie_test(async t => {
const name = 'name\u00C6'; // Æ
const utf8_name = 'name\xC3\x86'; // UTF-8 encoding
const non_utf8_name = 'name\xC6'; // ISO-8859-1 encoding
await setCookieBinaryHttp(`${utf8_name}=value1; path=/`);
assert_equals(await getCookieBinaryHttp(), `${utf8_name}=value1`);
// Set a non-UTF-8 cookie on the server, verify the server thinks it's there.
await setCookieBinaryHttp(`${non_utf8_name}=value2; path=/`);
assert_equals(
await getCookieBinaryHttp(),
`${utf8_name}=value1; ${non_utf8_name}=value2`);
// document.cookie vs. mixed cookie encodings
assert_equals(await getCookieStringDocument(), undefined,
'document.cookie yields nothing if non-UTF-8 cookies served');
// Cookie Store API vs. mixed cookie encodings
const cookies = await cookieStore.getAll();
assert_equals(cookies.length, 2, 'Both cookies are present');
assert_not_equals(
cookies.find(c => c.name === name), undefined, 'utf8 cookie present');
assert_equals(
cookies.find(c => c.name === name).value, 'value1',
'utf8 cookie value should be preserved');
assert_not_equals(
cookies.find(c => c.name === ''), undefined, 'nonutf8 cookie name empty');
assert_equals(
cookies.find(c => c.name === '').value, 'value2',
'nonutf8 cookie value should be preserved');
}, 'Non-UTF-8 cookie names dropped by browser and not reflected by API');
ISTM this behavior would be described (as @annevk mentions) by changing the decode definition to be:
To decode a value, run UTF-8 decode without BOM or fail on value, returning the result if not failure, or the empty string otherwise.
Thanks for playing with this! I think consistently excluding is the way to go, but willing to be convinced of better alternatives.
Seems like 3 options:
I think 1 is weird and not intended behavior. I'm optimistic that it's web-compatible to move away from that. Between 2 or 3, any preference?
(Integration between this spec and the RFC is not incredibly rigorous at the moment; I could imagine one approach that takes the cookie-string output of the RFC and re-parses it, which could involve attempting to decode the whole string and therefore option 3 "falls out"? That also seems a bit weird, though.)
Oh, I didn't realize document.cookie
excluded all. That does seem weird, yeah.
I think I raised this before, but I don't really see text about it in the specification. If you limit yourself to UTF-8, you need to deal with cookies that do not decode as UTF-8, presumably by dropping them as Chrome appears to do for
document.cookie
. That should be specified somehow.I think that means cookies need to use "UTF-8 decode without BOM or fail" from Encoding.