Closed coletdjnz closed 3 months ago
Thanks for the detailed investigation. If I understand correctly, are these the changes we need?
- subdomains=cookie.domain_initial_dot,
+ subdomains=cookie.domain_specified,
- domain_specified=True,
+ domain_specified=self.subdomains,
I believe so
Fixed in 0.7.1.
Describe the bug
The subdomains attribute of a CurlMorsel is currently being set to the value of Cookie.domain_initial_dot, however I don't believe this is correct. Instead, this should be set from the value of Cookie.domain_specified.
https://github.com/yifeikong/curl_cffi/blob/630a4dcfc24a73ace0e71d364aae1457cebc3fc0/curl_cffi/requests/cookies.py#L85-L95
From my understanding of the standards (RFC6265), if the Domain attribute is not present in the Set-Cookie header, then the cookie domain is the request domain and must only be applied to that domain, and not subdomains (subdomains=False).
If the Domain attribute is present, then the cookie matches that domain plus any subdomains (subdomains=True).
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie#domaindomain-value
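To make the rule above concrete, here is a minimal sketch of the RFC 6265 domain-matching check. StoredCookie and domain_matches are hypothetical names used only for illustration (they are not part of curl_cffi or http.cookiejar), and the sketch ignores the IP-address and public-suffix cases that the RFC also covers.

```python
from dataclasses import dataclass


@dataclass
class StoredCookie:
    # Hypothetical stand-in for a stored cookie, for illustration only.
    domain: str
    host_only: bool  # True when Set-Cookie carried no Domain attribute


def domain_matches(request_host: str, cookie: StoredCookie) -> bool:
    """Simplified RFC 6265 check: should the cookie be sent to request_host?"""
    host = request_host.lower()
    # A leading dot in the Domain attribute is ignored by RFC 6265.
    domain = cookie.domain.lower().lstrip(".")
    if host == domain:
        return True
    if cookie.host_only:
        # No Domain attribute: exact host only, never subdomains.
        return False
    # Domain attribute present: subdomains match as well.
    return host.endswith("." + domain)


# Domain=example.com applies to sibling subdomains.
print(domain_matches("subdomain2.example.com", StoredCookie("example.com", False)))
# A host-only cookie does not apply to other hosts.
print(domain_matches("subdomain2.example.com", StoredCookie("subdomain.example.com", True)))
```

Under these assumptions, host_only=False corresponds to subdomains=True in the terms used in this report.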
Cookie.domain_specified is used to distinguish this behaviour.
https://docs.python.org/3/library/http.cookiejar.html#http.cookiejar.Cookie.domain_specified
Additionally, the conversion of a CurlMorsel back to a Python Cookie suffers from the same issue in reverse: domain_specified should be set from subdomains.
https://github.com/yifeikong/curl_cffi/blob/630a4dcfc24a73ace0e71d364aae1457cebc3fc0/curl_cffi/requests/cookies.py#L97-L120
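As a rough illustration of the mapping proposed here, the sketch below converts in both directions. MorselSketch is a hypothetical, simplified stand-in and not curl_cffi's actual CurlMorsel; its field names are assumptions chosen to mirror the attributes discussed in this report. Only the http.cookiejar.Cookie constructor is real API.

```python
from dataclasses import dataclass
from http.cookiejar import Cookie


@dataclass
class MorselSketch:
    # Hypothetical stand-in for CurlMorsel; field names are assumptions.
    name: str
    value: str
    hostname: str
    subdomains: bool = False
    path: str = "/"
    secure: bool = False
    expires: int = 0

    @classmethod
    def from_cookiejar_cookie(cls, cookie: Cookie) -> "MorselSketch":
        return cls(
            name=cookie.name,
            value=cookie.value or "",
            hostname=cookie.domain,
            # Proposed: use domain_specified, not domain_initial_dot.
            subdomains=cookie.domain_specified,
            path=cookie.path,
            secure=cookie.secure,
            expires=cookie.expires or 0,
        )

    def to_cookiejar_cookie(self) -> Cookie:
        return Cookie(
            version=0,
            name=self.name,
            value=self.value,
            port=None,
            port_specified=False,
            domain=self.hostname,
            # Proposed: derive from subdomains instead of hard-coding True.
            domain_specified=self.subdomains,
            domain_initial_dot=self.hostname.startswith("."),
            path=self.path,
            path_specified=bool(self.path),
            secure=self.secure,
            expires=self.expires or None,
            discard=not self.expires,
            comment=None,
            comment_url=None,
            rest={},
        )
```

With this mapping, a cookie that Python parsed from Domain=example.com (domain_specified=True, domain_initial_dot=False) keeps subdomains=True through a round trip, and a host-only cookie keeps subdomains=False.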
This is how python sets it from a response: https://github.com/python/cpython/blob/d6555abfa7384b5a40435a11bdd2aa6bbf8f5cfc/Lib/http/cookiejar.py#L1526
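The following sketch shows what http.cookiejar records for a few Set-Cookie headers, using only the standard library. FakeResponse and parse_set_cookie are hypothetical helpers written for this illustration; CookieJar.make_cookies only needs an object whose info() returns the response headers.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical minimal response: make_cookies only calls info()."""

    def __init__(self, set_cookie: str):
        self._headers = email.message.Message()
        self._headers["Set-Cookie"] = set_cookie

    def info(self):
        return self._headers


def parse_set_cookie(set_cookie: str, url: str):
    """Parse one Set-Cookie header as if it came from a response to url."""
    jar = CookieJar()
    request = urllib.request.Request(url)
    return jar.make_cookies(FakeResponse(set_cookie), request)[0]


url = "https://subdomain.example.com/"
for header in ("a=1; Domain=example.com", "b=2; Domain=.example.com", "c=3"):
    c = parse_set_cookie(header, url)
    print(header, "->", c.domain, c.domain_specified, c.domain_initial_dot)
```

For the first header, the Domain attribute is present without a leading dot, so domain_specified is True while domain_initial_dot is False; only the third, host-only cookie has domain_specified set to False.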
Whether the Domain contains an initial dot is irrelevant (at least in RFC6265), so domain_initial_dot should be ignored.
For more confidence, curl also maps the second column in a cookie file to subdomains, which Python's implementation maps to domain_specified.
https://curl.se/docs/http-cookies.html#cookie-file-format
https://github.com/python/cpython/blob/d6555abfa7384b5a40435a11bdd2aa6bbf8f5cfc/Lib/http/cookiejar.py#L2044
To Reproduce
This is causing an issue on our end where a request made with urllib to subdomain.example.com gets a response with Set-Cookie: Domain=example.com (domain_initial_dot=False, domain_specified=True). The cookiejar is shared with curl_cffi, so when a request is subsequently made to subdomain2.example.com with curl_cffi, the cookie is not added because curl_cffi has read the cookie incorrectly and set subdomains=False.
Versions
Additional context
https://github.com/yt-dlp/yt-dlp/issues/10438
https://datatracker.ietf.org/doc/html/rfc6265#section-4.1.2.3
https://datatracker.ietf.org/doc/html/rfc6265#section-5.3