aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org
Other
15.11k stars 2.02k forks source link

Allow RFC 2822-compliant dates for the `expires` cookie directive #4493

Open Afoucaul opened 4 years ago

Afoucaul commented 4 years ago

Long story short

I was using a ClientSession with an API that replied with Set-Cookie, but the date format use for expires was not RFC 2616-compliant, but RFC 2822-compliant: the timezone was not GMT, but -0000. More concretely, the following cookie fails:

hello=world; expires=Wed, 15 Jan 2020 09:45:07 -0000

While this one works:

hello=world; expires=Wed, 15 Jan 2020 09:45:07 GMT

Expected behaviour

I expect that a cookie set via Set-Cookie and with an expires field compliant with RFC 2822 is correctly set in the session's cookies.

Actual behaviour

The cookie is passed to http.cookies.SimpleCookie, and at one point goes through the _CookiePattern regex that explicitly requires date fields to end with GMT.

http.cookies:434:

_CookiePattern = re.compile(r"""
    \s*                            # Optional whitespace at start of cookie
    (?P<key>                       # Start of group 'key'
    [""" + _LegalKeyChars + r"""]+?   # Any word of at least one letter
    )                              # End of group 'key'
    (                              # Optional group: there may not be a value.
    \s*=\s*                          # Equal Sign
    (?P<val>                         # Start of group 'val'
    "(?:[^\\"]|\\.)*"                  # Any doublequoted string
    |                                  # or
    \w{3},\s[\w\d\s-]{9,11}\s[\d:]{8}\sGMT  # Special case for "expires" attr
    |                                  # or
    [""" + _LegalValueChars + r"""]*      # Any word or empty string
    )                                # End of group 'val'
    )?                             # End of optional value group
    \s*                            # Any number of spaces.
    (\s+|;|$)                      # Ending either at space, semicolon, or EOS.
    """, re.ASCII | re.VERBOSE)    # re.ASCII may be removed if safe.

Line 444:

\w{3},\s[\w\d\s-]{9,11}\s[\d:]{8}\sGMT  # Special case for "expires" attr

Steps to reproduce

Please see this StackOverflow post where I include an MWE showing the behaviour, as well as an MWE with requests which shows the expected behaviour.

Your environment

$ python --version                                      
Python 3.7.4
$ python -c "import aiohttp; print(aiohttp.__version__)"
3.6.2
hh-h commented 4 years ago

Hi!

I expect that a cookie set via Set-Cookie and with an expires field compliant with RFC 2822 is correctly set in the session's cookies.

Why do you expect that? For what I've googled, cookies should follow RFC 6265 and there's no information that the Expires should follow RFC 2822 date.

plotski commented 4 years ago

I think I'm running into the same issue. Here's some demo code:

import asyncio

async def main():

    import requests
    session = requests.Session()
    response = session.get('https://imgbox.com/')
    print('requests headers:', response.headers)
    print('requests cookies:')
    for k, v in response.cookies.items():
        print(k, v)

    import aiohttp
    async with aiohttp.ClientSession() as session:
        async with session.get('https://imgbox.com/') as response:
            print('aiohttp headers:', response.headers)
        print('aiohttp cookies:')
        for c in session.cookie_jar:
            print(c)

asyncio.run(main())

The headers contain -0000 as the time zone. requests sets the cookie as expected while aiohttp doesn't.

Here's a quote from https://tools.ietf.org/html/rfc2616#section-3.3.1 (via https://tools.ietf.org/html/rfc6265#section-4.1.1):

Recipients of date values are encouraged to be robust in
accepting date values that may have been sent by non-HTTP
applications, as is sometimes the case when retrieving or posting
messages via proxies/gateways to SMTP or NNTP.

It took me many hours to figure out why the cookie jar was empty and was so frustrated that I was close to sticking with requests.

Would it be acceptable to parse non-compliant dates?

vr-hunter commented 2 years ago

I ran into this problem and fixing it gave me quite a headache.

Since I was working with a 3rd party API, I had to circumvent the problem. I solved it like this:

For the record (and in response to @plotski's last question): I think that such dates should be parsed because that's what all the browsers and most other libraries do.