Open moonsikpark opened 6 days ago
After further investigation, the module appears to behave inconsistently when it encounters unexpected input. This is likely due to its regex-based parsing, which assumes very clean input.
If the maintainers are in agreement, I’d like to propose a patch to improve cookie parsing by following these steps:
Additionally, I’d like to gather opinions on whether we should allow invalid—but commonly occurring—characters in cookie values, such as those used in JSON. While the RFC advises against accepting these characters, they are widely accepted by major browsers and have become a common practice among web developers.
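To make the question above concrete, here is a minimal sketch that compares the RFC 6265 `cookie-octet` set with a relaxed set that also admits double quotes, commas, spaces, and backslashes. The names `STRICT_VALUE`, `RELAXED_VALUE`, and `classify_value` are hypothetical and not part of `http.cookies`. The relaxed set is only an assumption about what "commonly occurring" could mean, not a description of what any browser actually implements.

```python
import re

# cookie-octet from RFC 6265, Section 4.1.1: printable US-ASCII excluding
# whitespace, DQUOTE, comma, semicolon, and backslash. A value may
# optionally be wrapped in a pair of double quotes.
_OCTET = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]"
STRICT_VALUE = re.compile(rf'(?:{_OCTET}*|"{_OCTET}*")')

# Hypothetical relaxed set (an assumption, not browser behavior): any
# printable US-ASCII character except the semicolon, which still has to
# delimit cookie pairs.
RELAXED_VALUE = re.compile(r"[\x20-\x3A\x3C-\x7E]*")


def classify_value(value: str) -> str:
    """Return 'strict', 'relaxed', or 'invalid' for a raw cookie value."""
    if STRICT_VALUE.fullmatch(value):
        return "strict"
    if RELAXED_VALUE.fullmatch(value):
        return "relaxed"
    return "invalid"


for sample in ["b", '{"d":"e"}', "d\x09d"]:
    print(repr(sample), "->", classify_value(sample))
```

Under these assumptions a JSON value such as `{"d":"e"}` is classified as `relaxed`, while a value containing `\x09` stays `invalid` in both sets.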
@picnixz, it seems you triaged my issue. If possible, could you notify the appropriate core maintainers?
cc @serhiy-storchaka
Bug report
Bug description:
There are several issues with `http.cookies.SimpleCookie.load()` that deviate from current browser behavior:
Consider the cookie `a=b;c=d\x09d;e=f`. The `c` value contains `\x09`, which is not allowed per RFC 6265, Section 4.1.1.
When this is sent to a browser (Chrome 130), the browser processes all valid cookies and filters out invalid ones:
Resulting behavior:
However, `http.cookies.SimpleCookie.load()` ignores the entire cookie string:
Consider the cookie `a=b;c={"d":"e"};f=g`. The `c` value is invalid per RFC 6265, Section 4.1.1.
Browsers process this cookie without an issue:
Resulting behavior:
However, `http.cookies.SimpleCookie.load()` processes only the valid portion before the malformed cookie and stops entirely:
It seems we should ensure consistent handling by either (a) processing all valid cookies and discarding only invalid ones, or (b) rejecting the entire cookie string if any invalid cookie is present.
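The two options can be sketched as a single pair-by-pair parser with a switch. This is only an illustration under stated assumptions, not the proposed patch: `parse_cookie_header` and its `strict` flag are hypothetical names, values are validated against the RFC 6265 `cookie-value` grammar, and names against the HTTP token characters. Splitting on `;` is safe here because the RFC does not allow a semicolon inside a value, even a quoted one.

```python
import re

# RFC 6265 cookie-value: zero or more cookie-octets (printable US-ASCII
# without whitespace, DQUOTE, comma, semicolon, or backslash), optionally
# wrapped in a pair of double quotes.
_VALUE = re.compile(
    r'(?:[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*'
    r'|"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*")'
)
# HTTP token characters for the cookie name.
_NAME = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_cookie_header(raw: str, strict: bool = False) -> dict[str, str]:
    """Parse a Cookie header value one pair at a time (hypothetical helper).

    strict=False is option (a): invalid pairs are skipped.
    strict=True is option (b): any invalid pair rejects the whole string.
    """
    cookies: dict[str, str] = {}
    for pair in raw.split(";"):
        pair = pair.strip(" ")  # strip spaces only, so a tab still counts as invalid
        if not pair:
            continue  # tolerate empty segments such as a trailing ";"
        name, sep, value = pair.partition("=")
        if sep and _NAME.fullmatch(name) and _VALUE.fullmatch(value):
            cookies[name] = value  # surrounding quotes, if any, are kept as-is
        elif strict:
            return {}
    return cookies


print(parse_cookie_header("a=b;c=d\x09d;e=f"))               # {'a': 'b', 'e': 'f'}
print(parse_cookie_header("a=b;c=d\x09d;e=f", strict=True))  # {}
print(parse_cookie_header('a=b;c={"d":"e"};f=g'))            # {'a': 'b', 'f': 'g'}
```

Because this sketch follows the RFC grammar, it drops the JSON-valued cookie in the second example, whereas the report above says browsers accept it. Matching the browsers would additionally require relaxing the value pattern.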
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response