aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org
Other
15.1k stars 2.02k forks source link

[Client] Double Set-Cookie header management #4486

Open kyriog opened 4 years ago

kyriog commented 4 years ago

Long story short

In my use case, I have a webserver (on which I have no control) which sends me, in a single request, two Set-Cookie headers with the same cookie name. The first one contains an empty value and a expires header in the past, to remove the cookie, and a second one which define a value, but without header, to define it as a session cookie.

The problem is that in the response, the two cookies are "mixed" in a single one with the value from the second Set-Cookie and the expires from the first one. Which makes the cookie deleted by the cookie jar in the session, as Expires is in the past.

Expected behaviour

Response and session cookie jar should contain the second cookie (at least).

Actual behaviour

Response contains a cookie with the value of the second cookie, but the expires from the first one. In the session, the cookie is removed.

Steps to reproduce

  1. Find a server which send a cookie twice, or use simple gist example.
  2. Use aiohttp to send a request to that URL
  3. Observe wrong cookie content

Your environment

aiohttp==3.6.2

Workaround

From what I found, this seems to be caused by how SimpleCookie loads cookies. If you load the same cookie name twice, the two cookies are merged in a single one. As I don't know if this is an expected behaviour from http.cookies.BaseCookie, I tried to simply remove the cookie before adding the new one:

        # cookies
        for hdr in self.headers.getall(hdrs.SET_COOKIE, ()):
            try:
                cookie_name = hdr.split('=', 1)[0]
                if cookie_name in self.cookies:
                    dict.__delitem__(self.cookies, cookie_name)
                self.cookies.load(hdr)
            except CookieError as exc:
                client_logger.warning(
                    'Can not load response cookies: %s', exc)

This is not a full fix of this problem, as the problem may occur with something different than the Expires header, but it should prevent wrong "mixed" cookies.

hh-h commented 4 years ago

Servers SHOULD NOT include more than one Set-Cookie header field in the same response with the same cookie-name.

https://tools.ietf.org/html/rfc6265#section-4.1.1

alandtse commented 4 years ago

Servers SHOULD NOT include more than one Set-Cookie header field in the same response with the same cookie-name.

https://tools.ietf.org/html/rfc6265#section-4.1.1

I have run into this issue where Amazon international servers (.co.uk, .es, .it) are not complying with the referenced RFC .

A get request to:

https://alexa.amazon.es

which redirects to

https://www.amazon.es/ap/signin?showRmrMe=1&openid.return_to=https://alexa.amazon.es/&openid.identity=http://specs.openid.net/auth/2.0/identifier_select&openid.assoc_handle=amzn_dp_project_dee_es&o enid.mode=checkid_setup&openid.claimed_id=http://specs.openid.net/auth/2.0/identifier_select&openid.ns=http://specs.openid.net/auth/2.0&

Results in a headers that sets the session-id and session-id-time for different domains (one that is expired):

<CIMultiDictProxy('Content-Type': 'text/html;charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Server': 'Server', 'Date': 'Sat, 13 Jun 2020 09:47:52 GMT', 'X-XSS-Protection': '1', 'X-Content-Type-Options': 'nosniff', 'x-ua-compatible': 'IE=edge', 'Pragma': 'No-cache', 'Cache-Control': 'max-age=0, no-cache, no-store, must-revalidate', 'Expires': 'Thu, 01 Jan 1970 00:00:00 GMT', 
'Set-Cookie': 'ap-fid=""; Domain=.amazon.es; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/ap/; Secure', 
'Set-Cookie': 'session-id=260-4126947-2577557; Domain=.amazon.es; Expires=Sun, 13-Jun-2021 09:47:52 GMT; Path=/; Secure', 
'Set-Cookie': 'session-id-time=2222761672l; Domain=.amazon.es; Expires=Sun, 13-Jun-2021 09:47:52 GMT; Path=/; Secure', 
'Set-Cookie': 'session-id=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'session-id-time=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'session-token=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'ubid-acbes=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 'Set-Cookie': 'at-acbes=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'lc-acbes=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'x-acbes=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'x-wl-uid=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'sess-at-acbes=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 
'Set-Cookie': 'UserPref=-; path=/; domain=.www.amazon.es; expires=Fri, 13-Jun-2008 09:47:52 GMT', 'Strict-Transport-Security': 'max-age=47474747; includeSubDomains; preload', 'Vary': 'Content-Type,Accept-Encoding,X-Amzn-CDN-Cache,X-Amzn-AX-Treatment,User-Agent', 'x-amz-rid': 'DVNFAQW8A07JA0NC5EGS', 'X-Frame-Options': 'SAMEORIGIN', 'X-Cache': 'Miss from cloudfront', 'Via': '1.1 96abbf138436a1c4a82006a53fa43b20.cloudfront.net (CloudFront)', 'X-Amz-Cf-Pop': 'LAX3-C3', 'X-Amz-Cf-Id': 'ms8YQr 19yJNwOTwqZ4tpBs0tjpSQu2rBbApau-NTTBBhCi7SmrJbw==')>

Given the RFC is a recommendation should not and not a requirement must not, it seems like this behavior should be addressed.

I'll see about a local fix for my use case and will see about a PR. However, I may not prioritize upstraming as the component I'm working is a plug-in for HomeAssistant where 3.6.2 and above causes breaking changes #4258.

alandtse commented 4 years ago

After looking at this closer, I think the issue is ClientResponse.cookies should not be a SimpleCookie since it will clobber any duplicate names, which I believe is incorrect behavior. It should faithfully pass the full list on so that the CookieJar or other functions can decide how to handle the multiple cookies with the same name. If that means passing the set-cookie string unchanged, that may be the most accurate way.

The top answer in StackOverflow for this question points to this website which states clients should only delete cookies that match the domain, path, and name.

A cookie can only be overwritten (or deleted) by a subsequent cookie exactly matching the name, path and domain of the original cookie. Even though a cookie with domain “.example.org” set by www.example.org is perfectly valid, it will not overwrite a previous cookie of the same name which was set against “www.example.org”. Instead, both cookies will be stored, and on subsequent requests only one will be sent.

This matches behavior I am seeing on Chrome and Firefox which will process cookies with the same name separately and in this case will keep the session-id and session-id-time that isn't set to expire.

alandtse commented 4 years ago

I have a potential fix but only for my client-side use case. It essentially changes response.cookies from SimpleCookie to [(name, morsel)]. This automatically is handled correctly by cookie_jar.update_cookies(). While I have fixed all tests that assumed it was a SimpleCookie, I have not added a test for the specific issue of receiving multiple cookies and properly updating response.cookies so it's probably not ripe for PR.