httpie / cli

🥧 HTTPie CLI — modern, user-friendly command-line HTTP client for the API era. JSON support, colors, sessions, downloads, plugins & more.
https://httpie.io
BSD 3-Clause "New" or "Revised" License

Multiple test failures in tox py34 #278

Closed msabramo closed 9 years ago

msabramo commented 9 years ago

Multiple test failures in test_sessions.py.

When I run:

tox -e py34 -- tests/test_sessions.py

I get 1, 2, or 3 test failures out of the 7 total tests in that module.

tests/test_sessions.py::TestSessionFlow::test_session_created_and_reused PASSED
tests/test_sessions.py::TestSessionFlow::test_session_update FAILED
tests/test_sessions.py::TestSessionFlow::test_session_read_only FAILED
tests/test_sessions.py::TestSession::test_session_ignored_header_prefixes PASSED
tests/test_sessions.py::TestSession::test_session_by_path PASSED
tests/test_sessions.py::TestSession::test_session_unicode FAILED
tests/test_sessions.py::TestSession::test_session_default_header_value_overwritten PASSED

More info:

TestSessionFlow.test_session_update fails with:

...
requests.exceptions.ConnectionError: (
'Connection aborted.', ConnectionResetError(54, 'Connection reset by peer'))

pdb for above error shows:

> /Users/marca/dev/git-repos/httpie/.tox/py34/lib/python3.4/site-packages/requests/adapters.py(407)send()
-> raise ConnectionError(err, request=request)
(Pdb) up
> /Users/marca/dev/git-repos/httpie/.tox/py34/lib/python3.4/site-packages/requests/sessions.py(569)send()
-> r = adapter.send(request, **kwargs)
(Pdb) request.method
'GET'
(Pdb) request.url
'http://127.0.0.1:56308/cookies'
(Pdb) request.headers
{'Accept-Encoding': 'gzip, deflate', 'Hello': b'World2', 'Cookie': 'hello=world; hello=world2', 'Connection': 'keep-alive', 'Authorization': b'Basic dXNlcm5hbWU6cGFzc3dvcmQy', 'Accept': '*/*', 'User-Agent': b'HTTPie/0.9.0-dev'}
(Pdb) kwargs
{'timeout': 30, 'cert': None, 'proxies': {}, 'stream': True, 'verify': True}

TestSessionFlow.test_session_read_only fails with

...
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))

pdb for above error shows:

> /Users/marca/dev/git-repos/httpie/.tox/py34/lib/python3.4/site-packages/requests/adapters.py(407)send()
-> raise ConnectionError(err, request=request)
(Pdb) up
> /Users/marca/dev/git-repos/httpie/.tox/py34/lib/python3.4/site-packages/requests/sessions.py(569)send()
-> r = adapter.send(request, **kwargs)
(Pdb) request.method
'GET'
(Pdb) request.url
'http://127.0.0.1:56276/cookies'
(Pdb) request.headers
{'User-Agent': b'HTTPie/0.9.0-dev', 'Hello': b'World2', 'Accept-Encoding': 'gzip, deflate', 'Cookie': 'hello=world; hello=world2', 'Connection': 'keep-alive', 'Authorization': b'Basic dXNlcm5hbWU6cGFzc3dvcmQy', 'Accept': '*/*'}
(Pdb) kwargs
{'cert': None, 'proxies': {}, 'verify': True, 'timeout': 30, 'stream': True}

TestSession.test_session_unicode fails with:

...
Traceback (most recent call last):
  File ".../httpie/tests/test_sessions.py", line 148, in test_session_unicode
    assert (r2.json['headers']['Authorization']
KeyError: 'Authorization'

Pertaining to this last error, there is a comment in the test saying:

147             # FIXME: Authorization *sometimes* is not present on Python3
(Pdb) pprint.pprint(r2.json)
{'args': {},
 'headers': {'Content-Length': '',
             'Host': '127.0.0.1:56230',
             'Test': '[one line of UTF8-encoded unicode text] Ï\x87Ï\x81Ï'},
 'origin': '127.0.0.1',
 'url': 'http://127.0.0.1:56230/get'}

In py33 I also see 1 to 2 test failures -- I have not yet observed TestSession.test_session_unicode failing on py33.

Most of the time, all these tests pass on py27, though I am seeing test_session_read_only fail occasionally with:

____________________________________________________________________ TestSessionFlow.test_session_read_only ____________________________________________________________________
Traceback (most recent call last):
  File "/Users/marca/dev/git-repos/httpie/tests/test_sessions.py", line 82, in test_session_read_only
    self.start_session(httpbin)
  File "/Users/marca/dev/git-repos/httpie/tests/test_sessions.py", line 48, in start_session
    env=self.env())
  File "/Users/marca/dev/git-repos/httpie/tests/utils.py", line 136, in http
    exit_status = main(args=args, **kwargs)
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/httpie/core.py", line 112, in main
    response = get_response(args, config_dir=env.config.directory)
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/httpie/client.py", line 31, in get_response
    read_only=bool(args.session_read_only),
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/httpie/sessions.py", line 65, in get_response
    response = requests_session.request(**requests_kwargs)
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/requests/sessions.py", line 457, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/requests/sessions.py", line 595, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/requests/sessions.py", line 189, in resolve_redirects
    allow_redirects=False,
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/requests/sessions.py", line 569, in send
    r = adapter.send(request, **kwargs)
  File "/Users/marca/dev/git-repos/httpie/.tox/py27/lib/python2.7/site-packages/requests/adapters.py", line 407, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', error(54, 'Connection reset by peer'))

msabramo commented 9 years ago

I can reproduce the test_session_unicode failure consistently by explicitly passing a --hashseed to tox:

❯ tox -e py34 --hashseed=1811760512 -- tests/test_sessions.py -k test_session_unicode
GLOB sdist-make: /Users/marca/dev/git-repos/httpie/setup.py
py34 inst-nodeps: /Users/marca/dev/git-repos/httpie/.tox/dist/httpie-0.9.0-dev.zip
py34 runtests: PYTHONHASHSEED='1811760512'
py34 runtests: commands[0] | py.test --verbose --doctest-modules --basetemp=/Users/marca/dev/git-repos/httpie/.tox/py34/tmp tests/test_sessions.py -k test_session_unicode
============================================================================= test session starts ==============================================================================
platform darwin -- Python 3.4.0 -- py-1.4.26 -- pytest-2.6.4 -- /Users/marca/dev/git-repos/httpie/.tox/py34/bin/python3.4
plugins: httpbin
collected 6 items

tests/test_sessions.py::TestSession::test_session_unicode FAILED

=================================================================================== FAILURES ===================================================================================
_______________________________________________________________________ TestSession.test_session_unicode _______________________________________________________________________
Traceback (most recent call last):
  File "/Users/marca/dev/git-repos/httpie/tests/test_sessions.py", line 151, in test_session_unicode
    assert (r2.json['headers']['Authorization']
KeyError: 'Authorization'
----------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------
127.0.0.1 - - [29/Nov/2014 11:40:28] "GET /get HTTP/1.1" 200 301
127.0.0.1 - - [29/Nov/2014 11:40:28] "GET /get HTTP/1.1" 200 301
================================================================ 5 tests deselected by '-ktest_session_unicode' ================================================================
============================================================== 1 failed, 5 deselected, 1 warnings in 0.67 seconds ==============================================================
ERROR: InvocationError: '/Users/marca/dev/git-repos/httpie/.tox/py34/bin/py.test --verbose --doctest-modules --basetemp=/Users/marca/dev/git-repos/httpie/.tox/py34/tmp tests/test_sessions.py -k test_session_unicode'
___________________________________________________________________________________ summary ____________________________________________________________________________________
ERROR:   py34: commands failed

msabramo commented 9 years ago

If I change to --hashseed=1811760511, then test_session_unicode passes every time.

msabramo commented 9 years ago

With pdb, I can see that HTTP_AUTHORIZATION isn't even present in the WSGI environment:

> /Users/marca/dev/git-repos/httpie/.tox/py34/lib/python3.4/site-packages/werkzeug/wrappers.py(528)headers()
-> return EnvironHeaders(self.environ)
(Pdb) self.environ['HTTP_AUTHORIZATION']
*** KeyError: 'HTTP_AUTHORIZATION'

jkbrzt commented 9 years ago

Good detective work :) It's a bit of a mysterious bug. Any idea what the root cause could be?

msabramo commented 9 years ago

Not sure but I think there's a bug in Python 3.4's HTTP header parsing. Check this out:

This is inside wsgiref when it has received a request with an Authorization header.

> /Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wsgiref/simple_server.py(104)get_environ()
-> for k, v in self.headers.items():
(Pdb) str(self.headers)
'Host: 127.0.0.1:61463\nUser-Agent: HTTPie/0.9.0-dev\nAccept: */*\nTest: =?utf-8?b?W29uZSBsaW5lIG9mIFVURjgtZW5jb2RlZCB1bmljb2RlIHRleHRdIMOPwofDj8KBw48=?=\n\nÏ\x83αÏ\x86ὶ 太é\x99½ à¹\x80ลิศ â\x99\x9câ\x99\x9eâ\x99\x9dâ\x99\x9bâ\x99\x9aâ\x99\x9dâ\x99\x9eâ\x99\x9c оживлÑ\x91ннÑ\x8bм तानà¥\x8dयहानि æ\x9c\x89æ\x9c\x8b ஸà¯\x8dà®±à¯\x80னிவாஸ Ù±Ù\x84رÙ\x8eÙ\x91Ø\xadÙ\x92Ù\x85\nÙ\x80Ù\x8eبÙ\x86Ù\x90\nAccept-Encoding: gzip, deflate\nConnection: keep-alive\nAuthorization: Basic dGVzdDpbb25lIGxpbmUgb2YgVVRGOC1lbmNvZGVkIHVuaWNvZGUgdGV4dF0gz4fPgc+Fz4POsc+G4b22IOWkqumZvSDguYDguKXguLTguKgg4pmc4pme4pmd4pmb4pma4pmd4pme4pmcINC+0LbQuNCy0LvRkdC90L3Ri9C8IOCkpOCkvuCkqOCljeCkr+CkueCkvuCkqOCkvyDmnInmnIsg4K644K+N4K6x4K+A4K6p4K6/4K614K6+4K64INmx2YTYsdmO2ZHYrdmS2YXZgNmO2KjZhtmQ\n\n'
(Pdb) self.headers.items()
[('Host', '127.0.0.1:61463'), ('User-Agent', 'HTTPie/0.9.0-dev'), ('Accept', '*/*'), ('Test', '[one line of UTF8-encoded unicode text] Ï\x87Ï\x81Ï\x85')]

Note that you can see the Authorization header in the output of str(self.headers), but it's not showing up in self.headers.items().

msabramo commented 9 years ago

I am suspicious of the Test header:

('Test', '[one line of UTF8-encoded unicode text] Ï\x87Ï\x81Ï\x85')]

That Test header is the last one that shows up in self.headers.items(); no header that occurs after it appears -- e.g.: Accept-Encoding, Connection, Authorization

Also, the value is very short, so I suspect that parsing is failing midway through and messing up the processing of all subsequent headers.
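One detail that may be worth checking here: the truncated Test value ends right after the character \x85. The sketch below only demonstrates a documented property of Python's str.splitlines(), namely that it treats U+0085 (NEL) as a line boundary. Whether the stdlib header parser actually splits lines this way in a given Python version is an assumption, not something confirmed in this thread. The sample text is the first four Greek letters of the Test header value.

```python
# First four letters of the Greek word at the start of the Test header value.
original = "\u03c7\u03c1\u03c5\u03c3"  # χρυσ

# What a reader sees when the UTF-8 bytes are decoded as ISO-8859-1 (latin-1).
# The third letter, U+03C5, is CF 85 in UTF-8, so a \x85 character appears.
as_latin1 = original.encode("utf-8").decode("latin-1")

# str.splitlines() treats \x85 (NEL) as a line boundary, in addition to \r and \n.
# keepends=True keeps the boundary character attached to the preceding piece.
pieces = as_latin1.splitlines(True)

print(repr(as_latin1))
print(pieces)
```

If the parser does split on that character, the remainder of the value would look like a new header line without a colon, which would fit the short value seen above.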

msabramo commented 9 years ago

I'm pretty sure that it's choking somewhere in email/feedparser.py in the stdlib. Some may say "email? WTF", but the stdlib's http.client.HTTPMessage class (used for HTTP headers) is subclassed from email.message.Message (!).

There's even a "defect" recorded. The email parser mentions in its comments that it doesn't throw exceptions, it records defects instead.

> /Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wsgiref/simple_server.py(104)get_environ()
-> for k, v in self.headers.items():
(Pdb) self.headers
<http.client.HTTPMessage object at 0x106612668>
(Pdb) self.headers.defects
[MissingHeaderBodySeparatorDefect()]

It's doubtful whether that Test header needs to be there at all, though.
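The "records defects instead of raising" behaviour can be seen in isolation with a much smaller input. This is a minimal sketch, not the failing request from the test: the header block is made up for illustration, and its second line deliberately has no colon.

```python
import email.parser
import http.client

# Hypothetical header block; the second line is not a valid "Name: value" header.
raw = (
    "Host: 127.0.0.1\r\n"
    "this line has no colon\r\n"
    "Authorization: Basic abc\r\n"
    "\r\n"
)

# Same construction that http.client uses: an email parser producing an HTTPMessage.
msg = email.parser.Parser(_class=http.client.HTTPMessage).parsestr(raw)

# No exception is raised; the problem is recorded on the message instead.
defect_names = [type(d).__name__ for d in msg.defects]

print(msg.keys())              # headers parsed before the bad line
print(defect_names)            # recorded defects
print("Authorization" in msg)  # headers after the bad line are not parsed
```

Header parsing stops at the first line that does not look like a header, so everything after it, including Authorization, is treated as body rather than as headers.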

msabramo commented 9 years ago

Oh I guess the test is putting the Test header in there to test the handling of Unicode and maybe it's not working :smile:

jkbrzt commented 9 years ago

Yea, that's the reason behind the header. Interestingly it only fails within a session, but not when tested outside one like here: https://github.com/jakubroztocil/httpie/blob/04819577154fc1b11fc20ae7ac584d67614eca25/tests/test_unicode.py#L12

msabramo commented 9 years ago

I am finding a lot of info online that says that you can only use ISO-8859-1 in HTTP headers. So UTF-8 could very well be breaking things.

msabramo commented 9 years ago

Maybe test_session_unicode should simply be removed?

jkbrzt commented 9 years ago

So based on your findings, the root cause seems to be that the code in email/feedparser.py chokes on the unicode headers. And the reason it happens only sometimes is that a Python dict, where the request headers are stored, is unordered. So if the Authorization header comes after Test when the request is serialized (such as when you pass --hashseed=1811760512), it doesn't get parsed correctly on the server side and is therefore missing from httpbin's response.

test_session_unicode is quite useful so I would like to keep it, but it should be modified so that we don't run into this issue anymore.
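The link between the hash seed and run-to-run differences can be sketched as follows. The helper name str_hash_under_seed is hypothetical; it starts a fresh interpreter with PYTHONHASHSEED set and reports hash() of a string. Note that on current Python versions dicts preserve insertion order, so this only illustrates the seed dependence of string hashes, not the exact header ordering seen on Python 3.4.

```python
import os
import subprocess
import sys


def str_hash_under_seed(text, seed):
    """Return hash(text) as computed by a fresh interpreter with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


# The two seeds mentioned in this thread: one made the test pass, one made it fail.
for seed in (1811760511, 1811760512):
    print(seed, str_hash_under_seed("Authorization", seed))
```

A fixed seed gives the same hash on every run, which is why passing --hashseed to tox makes the failure reproducible.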

sigmavirus24 commented 9 years ago

I suspect one of the bytes in your unicode data is just plain confusing to the parser. You should check the bytes after they've been encoded.

msabramo commented 9 years ago

From http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2

The TEXT rule is only used for descriptive field contents and values that are not intended to be interpreted by the message parser. Words of *TEXT MAY contain characters from character sets other than ISO- 8859-1 [22] only when encoded according to the rules of RFC 2047 [14].

msabramo commented 9 years ago

I wonder if it's been failing to parse that Unicode header in all Python versions, but you only see the test failure in Python 3 because hash randomization sometimes causes it to not get the Authorization header.

So perhaps if that test is kept, it needs to be beefed up to more stringently check the Test header?
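A stricter check could look roughly like the sketch below. The function compare_echoed_headers is hypothetical and not part of httpie's test utilities; it compares the headers echoed back by httpbin against the ones that were sent and reports both missing and altered values. The sample dicts are shortened stand-ins based on the output earlier in this thread, and the Authorization value is a placeholder.

```python
def compare_echoed_headers(echoed, expected):
    """Return (missing, mismatched) header names, each as a sorted list."""
    missing = sorted(name for name in expected if name not in echoed)
    mismatched = sorted(
        name
        for name in expected
        if name in echoed and echoed[name] != expected[name]
    )
    return missing, mismatched


# Headers the test intended to send (shortened; placeholder credentials).
expected = {
    "Test": "[one line of UTF8-encoded unicode text] \u03c7\u03c1\u03c5",
    "Authorization": "Basic placeholder",
}

# Headers as echoed in the failing run: Test is truncated, Authorization is absent.
echoed = {
    "Host": "127.0.0.1:56230",
    "Test": "[one line of UTF8-encoded unicode text] \xcf\x87\xcf\x81\xcf",
}

missing, mismatched = compare_echoed_headers(echoed, expected)
print("missing:", missing)
print("mismatched:", mismatched)
```

Checking the Test value itself, and not only the presence of Authorization, would flag the truncation regardless of header order.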

msabramo commented 9 years ago

Here is what gets received, just before parsing:

> /Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py(272)parse_headers()-><http.client....t 0x105e0f6a0>
-> return email.parser.Parser(_class=_class).parsestr(hstring)
(Pdb) hstring
'Host: 127.0.0.1:63531\r\nUser-Agent: HTTPie/0.9.0-dev\r\nAccept: */*\r\nTest: [one line of 
UTF8-encoded unicode text] Ï\x87Ï\x81Ï\x85Ï\x83αÏ\x86ὶ 太é\x99½ à¹\x80ลิศ â\x99\x9câ
\x99\x9eâ\x99\x9dâ\x99\x9bâ\x99\x9aâ\x99\x9dâ\x99\x9eâ\x99\x9c оживлÑ\x91ннÑ
\x8bм तानà¥\x8dयहानि æ\x9c\x89æ\x9c\x8b ஸà¯\x8dà®±à¯
\x80னிவாஸ Ù±Ù\x84رÙ\x8eÙ\x91Ø\xadÙ\x92Ù\x85Ù\x80Ù\x8eبÙ\x86Ù
\x90\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\nAuthorization: Basic
 dGVzdDpbb25lIGxpbmUgb2YgVVRGOC1lbmNvZGVkIHVuaWNvZGUgdGV4dF0gz4fPgc+Fz4POsc+
G4b22IOWkqumZvSDguYDguKXguLTguKgg4pmc4pme4pmd4pmb4pma4pmd4pme4pmcINC+0LbQu
NCy0LvRkdC90L3Ri9C8IOCkpOCkvuCkqOCljeCkr+CkueCkvuCkqOCkvyDmnInmnIsg4K644K+N4K6x
4K+A4K6p4K6/4K614K6+4K64INmx2YTYsdmO2ZHYrdmS2YXZgNmO2KjZhtmQ\r\n\r\n'
(Pdb) hstring.encode('unicode_escape')
b'Host: 127.0.0.1:63531\\r\\nUser-Agent: HTTPie/0.9.0-dev\\r\\nAccept: */*\\r\\nTest: [one line of 
UTF8-encoded unicode text] \\xcf\\x87\\xcf\\x81\\xcf\\x85\\xcf\\x83\\xce\\xb1\\xcf\\x86\\xe1\\xbd\\xb6
 \\xe5\\xa4\\xaa\\xe9\\x99\\xbd \\xe0\\xb9\\x80\\xe0\\xb8\\xa5\\xe0\\xb8\\xb4\\xe0\\xb8\\xa8 
\\xe2\\x99\\x9c\\xe2\\x99\\x9e\\xe2\\x99\\x9d\\xe2\\x99\\x9b\\xe2\\x99\\x9a\\xe2\\x99\\x9d\\xe2\\x99
\\x9e\\xe2\\x99\\x9c \\xd0\\xbe\\xd0\\xb6\\xd0\\xb8\\xd0\\xb2\\xd0\\xbb\\xd1\\x91\\xd0\\xbd\\xd0\\xbd\\xd1\\x8b\\xd0\\xbc \\xe0\\xa4\\xa4\\xe0\\xa4\\xbe\\xe0\\xa4\\xa8\\xe0\\xa5\\x8d\\xe0\\xa4\\xaf\\xe0\\xa4\\xb9\\xe0\\xa4\\xbe\\xe0\\xa4\\xa8\\xe0\\xa4\\xbf \\xe6\\x9c\\x89\\xe6\\x9c\\x8b \\xe0\\xae\\xb8\\xe0\\xaf\\x8d\\xe0\\xae\\xb1\\xe0\\xaf\\x80\\xe0\\xae\\xa9\\xe0\\xae\\xbf\\xe0\\xae\\xb5\\xe0\\xae\\xbe\\xe0\\xae\\xb8 \\xd9\\xb1\\xd9\\x84\\xd8\\xb1\\xd9\\x8e\\xd9\\x91\\xd8\\xad\\xd9\\x92\\xd9\\x85\\xd9\\x80\\xd9\\x8e\\xd8\\xa8\\xd9\\x86\\xd9\\x90\\r\\nAccept-Encoding: gzip, deflate\\r\\nConnection: keep-alive\\r\\nAuthorization: Basic dGVzdDpbb25lIGxpbmUgb2YgVVRGOC1lbmNvZGVkIHVuaWNvZGUgdGV4dF0gz4fPgc+Fz4POsc+G
4b22IOWkqumZvSDguYDguKXguLTguKgg4pmc4pme4pmd4pmb4pma4pmd4pme4pmcINC+0LbQuN
Cy0LvRkdC90L3Ri9C8IOCkpOCkvuCkqOCljeCkr+CkueCkvuCkqOCkvyDmnInmnIsg4K644K+N4K6x4
K+A4K6p4K6/4K614K6+4K64INmx2YTYsdmO2ZHYrdmS2YXZgNmO2KjZhtmQ\\r\\n\\r\\n'

From a glance it doesn't look like it's RFC 2047. It looks like it's straight UTF-8:

In [25]: b'Test: [one line of UTF8-encoded unicode text] \xcf\x87\xcf\x81\xcf\x85\xcf\x83\xce\xb1\xcf\x86\xe1\xbd\xb6 \xe5\xa4\xaa\xe9\x99\xbd \xe0\xb9\x80\xe0\xb8\xa5\xe0\xb8\xb4\xe0\xb8\xa8 \xe2\x99\x9c\xe2\x99\x9e\xe2\x99\x9d\xe2\x99\x9b\xe2\x99\x9a\xe2\x99\x9d\xe2\x99\x9e\xe2\x99\x9c \xd0\xbe\xd0\xb6\xd0\xb8\xd0\xb2\xd0\xbb\xd1\x91\xd0\xbd\xd0\xbd\xd1\x8b\xd0\xbc \xe0\xa4\xa4\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa5\x8d\xe0\xa4\xaf\xe0\xa4\xb9\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xbf \xe6\x9c\x89\xe6\x9c\x8b \xe0\xae\xb8\xe0\xaf\x8d\xe0\xae\xb1\xe0\xaf\x80\xe0\xae\xa9\xe0\xae\xbf\xe0\xae\xb5\xe0\xae\xbe\xe0\xae\xb8 \xd9\xb1\xd9\x84\xd8\xb1\xd9\x8e\xd9\x91\xd8\xad\xd9\x92\xd9\x85\xd9\x80\xd9\x8e\xd8\xa8\xd9\x86\xd9\x90'.decode('utf-8')
Out[25]: 'Test: [one line of UTF8-encoded unicode text] χρυσαφὶ 太陽 เลิศ ♜♞♝♛♚♝♞♜ оживлённым तान्यहानि 有朋 ஸ்றீனிவாஸ ٱلرَّحْمـَبنِ'

That seems incorrect.

msabramo commented 9 years ago

Reproducing the core problem very simply in an IPython session:

In [44]: import email.parser, http.client

In [45]: hstring = 'Host: 127.0.0.1:63531\r\nUser-Agent: HTTPie/0.9.0-dev\r\nAccept: */*\r\nTest: [one line of UTF8-encoded unicode text] Ï\x87Ï\x81Ï\x85Ï\x83αÏ\x86ὶ 太é\x99½ à¹\x80ลิศ â\x99\x9câ \x99\x9eâ\x99\x9dâ\x99\x9bâ\x99\x9aâ\x99\x9dâ\x99\x9eâ\x99\x9c оживлÑ\x91ннÑ\x8bм तानà¥\x8dयहानि æ\x9c\x89æ\x9c\x8b ஸà¯\x8dà®±à¯\x80னிவாஸ Ù±Ù\x84رÙ\x8eÙ\x91Ø\xadÙ\x92Ù\x85Ù\x80Ù\x8eبÙ\x86Ù\x90\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\nAuthorization: Basic dGVzdDpbb25lIGxpbmUgb2YgVVRGOC1lbmNvZGVkIHVuaWNvZGUgdGV4dF0gz4fPgc+Fz4POsc+G4b22IOWkqumZvSDguYDguKXguLTguKgg4pmc4pme4pmd4pmb4pma4pmd4pme4pmcINC+0LbQuNCy0LvRkdC90L3Ri9C8IOCkpOCkvuCkqOCljeCkr+CkueCkvuCkqOCkvyDmnInmnIsg4K644K+N4K6x4K+A4K6p4K6/4K614K6+4K64INmx2YTYsdmO2ZHYrdmS2YXZgNmO2KjZhtmQ\r\n\r\n'

In [46]: hm = email.parser.Parser(_class=http.client.HTTPMessage).parsestr(hstring)

In [47]: str(hm)
Out[47]: 'Host: 127.0.0.1:63531\nUser-Agent: HTTPie/0.9.0-dev\nAccept: */*\nTest: =?utf-8?b?W29uZSBsaW5lIG9mIFVURjgtZW5jb2RlZCB1bmljb2RlIHRleHRdIMOPwofDj8KBw48=?=\n\nÏ\x83αÏ\x86ὶ 太é\x99½ à¹\x80ลิศ â\x99\x9câ \x99\x9eâ\x99\x9dâ\x99\x9bâ\x99\x9aâ\x99\x9dâ\x99\x9eâ\x99\x9c оживлÑ\x91ннÑ\x8bм तानà¥\x8dयहानि æ\x9c\x89æ\x9c\x8b ஸà¯\x8dà®±à¯\x80னிவாஸ Ù±Ù\x84رÙ\x8eÙ\x91Ø\xadÙ\x92Ù\x85\nÙ\x80Ù\x8eبÙ\x86Ù\x90\nAccept-Encoding: gzip, deflate\nConnection: keep-alive\nAuthorization: Basic dGVzdDpbb25lIGxpbmUgb2YgVVRGOC1lbmNvZGVkIHVuaWNvZGUgdGV4dF0gz4fPgc+Fz4POsc+G4b22IOWkqumZvSDguYDguKXguLTguKgg4pmc4pme4pmd4pmb4pma4pmd4pme4pmcINC+0LbQuNCy0LvRkdC90L3Ri9C8IOCkpOCkvuCkqOCljeCkr+CkueCkvuCkqOCkvyDmnInmnIsg4K644K+N4K6x4K+A4K6p4K6/4K614K6+4K64INmx2YTYsdmO2ZHYrdmS2YXZgNmO2KjZhtmQ\n\n'

In [48]: hm.items()
Out[48]:
[('Host', '127.0.0.1:63531'),
 ('User-Agent', 'HTTPie/0.9.0-dev'),
 ('Accept', '*/*'),
 ('Test', '[one line of UTF8-encoded unicode text] Ï\x87Ï\x81Ï\x85')]

In [49]: hm.defects
Out[49]: [email.errors.MissingHeaderBodySeparatorDefect()]

Perhaps most interesting is that midway through the value of str(hm), in the middle of the value for the Test header, there is a double newline -- \n\n. I could imagine this could cause the parser to choke.

In [82]: str(hm)[146:151]
Out[82]: '=?=\n\n'

msabramo commented 9 years ago

Strangely, if I manually construct the header, things seem to work better:

In [63]: hm2 = http.client.HTTPMessage()

In [64]: hm2.add_header('Test', '[one line of UTF8-encoded unicode text] Ï\x87Ï\x81Ï\x85Ï\x83αÏ\x86ὶ 太é\x99½ à¹\x80ลิศ â\x99\x9câ \x99\x9eâ\x99\x9dâ\x99\x9bâ\x99\x9aâ\x99\x9dâ\x99\x9eâ\x99\x9c оживлÑ\x91ннÑ\x8bм तानà¥\x8dयहानि æ\x9c\x89æ\x9c\x8b ஸà¯\x8dà®±à¯\x80னிவாஸ Ù±Ù\x84رÙ\x8eÙ\x91Ø\xadÙ\x92Ù\x85Ù\x80Ù\x8eبÙ\x86Ù\x90')

In [65]: str(hm2)
Out[65]: 'Test: =?utf-8?b?W29uZSBsaW5lIG9mIFVURjgtZW5jb2RlZCB1bmljb2RlIHRleHRdIMOPwofDj8KBw48=?=\n =?utf-8?b?IMOPwoPDjsKxw4/ChsOhwr3CtiDDpcKkwqrDqcKZwr0gw6DCucKAw6DCuMKlw6DCuMK0w6DCuMKoIMOiwpnCnMOiIMKZwp7DosKZwp3DosKZwpvDosKZwprDosKZwp3DosKZwp7DosKZwpwgw5DCvsOQwrbDkMK4w5DCssOQwrvDkcKRw5DCvcOQwr3DkcKLw5DCvCDDoMKkwqTDoMKkwr7DoMKkwqjDoMKlwo3DoMKkwq/DoMKkwrnDoMKkwr7DoMKkwqjDoMKkwr8gw6bCnMKJw6bCnMKLIMOgwq7CuMOgwq/CjcOgwq7CscOgwq/CgMOgwq7CqcOgwq7Cv8Ogwq7CtcOgwq7CvsOgwq7CuCDDmcKxw5nChMOYwrHDmcKOw5nCkcOYwq3DmcKSw5k=?=\n =?utf-8?b?IMOZwoDDmcKOw5jCqMOZwobDmcKQ?=\n\n'

In [66]: hm2.items()
Out[66]:
[('Test',
  '[one line of UTF8-encoded unicode text] Ï\x87Ï\x81Ï\x85Ï\x83αÏ\x86ὶ 太é\x99½ à¹\x80ลิศ â\x99\x9câ \x99\x9eâ\x99\x9dâ\x99\x9bâ\x99\x9aâ\x99\x9dâ\x99\x9eâ\x99\x9c оживлÑ\x91ннÑ\x8bм तानà¥\x8dयहानि æ\x9c\x89æ\x9c\x8b ஸà¯\x8dà®±à¯\x80னிவாஸ Ù±Ù\x84رÙ\x8eÙ\x91Ø\xadÙ\x92Ù\x85Ù\x80Ù\x8eبÙ\x86Ù\x90')]

In [67]: hm2.defects
Out[67]: []

Note how in this case, str(hm2) ends up having two chunks of RFC 2047 text, denoted by =?utf-8?, whereas the previous example had only one (previous example seems to have \n\n in that place, which seems like it could totally confuse the parser...). End result is that hm2.items() returns a much longer value for the Test header.

It is curious that I was able to call add_header and have things work, but somehow this is not working in the original code path.

msabramo commented 9 years ago

The httpie.client.encode_headers function is currently encoding to utf-8. From my understanding of the RFC, this doesn't seem right? Perhaps we should be using the RFC 2047 style encoding that the email.header module implements?

See: https://github.com/jakubroztocil/httpie/pull/281 -- tests are failing though.

I cc'd flufl @warsaw, because he has his name on a lot of the stdlib code for email and HTTP header parsing.
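For reference, RFC 2047 style encoding with the email.header module looks roughly like this. It is a minimal sketch of what the module produces for a short non-ASCII value, not the change made in the pull request above, and whether HTTP servers decode such values is a separate question.

```python
from email.header import Header, decode_header, make_header

# Short sample: the Greek word at the start of the Test header value (χρυσαφὶ).
value = "\u03c7\u03c1\u03c5\u03c3\u03b1\u03c6\u1f76"

# Encode as an RFC 2047 "encoded word"; the result contains only ASCII characters.
encoded = Header(value, "utf-8").encode()

# Decode it back to text to confirm the round trip.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded == value)
```

Because the encoded form is pure ASCII, it contains no bytes that could be mistaken for a line break by a header parser.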

msabramo commented 9 years ago

So I've mostly focused on test_session_unicode but a few other tests are failing intermittently with a BadStatusLine exception. For those I am wondering if it could be a pytest-httpbin problem, perhaps when there are 2 requests in a row using the same requests.Session?

Cc: @kevin1024

msabramo commented 9 years ago

I reproduced test failures in https://github.com/kevin1024/pytest-httpbin/pull/13

msabramo commented 9 years ago

See: https://github.com/kevin1024/pytest-httpbin/pull/16

msabramo commented 9 years ago

Using a pytest-httpbin with the fix in kevin1024/pytest-httpbin#16 seems to help. Observe:

$ .tox/py34/bin/pip uninstall -y pytest-httpbin
Uninstalling pytest-httpbin:
  Successfully uninstalled pytest-httpbin

$ .tox/py34/bin/pip install https://github.com/msabramo/pytest-httpbin/archive/fix_lots_of_requests_in_single_session.zip
...
Successfully installed pytest-httpbin
Cleaning up...

$ .tox/py34/bin/py.test --tb=short tests/test_sessions.py
============================================================================= test session starts ==============================================================================
platform darwin -- Python 3.4.0 -- py-1.4.26 -- pytest-2.6.4
plugins: httpbin
collected 7 items

tests/test_sessions.py .....F.

=================================================================================== FAILURES ===================================================================================
_______________________________________________________________________ TestSession.test_session_unicode _______________________________________________________________________
tests/test_sessions.py:148: in test_session_unicode
    assert (r2.json['headers']['Authorization']
E   KeyError: 'Authorization'
----------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------
127.0.0.1 - - [01/Dec/2014 08:09:00] "GET /get HTTP/1.1" 200 333
127.0.0.1 - - [01/Dec/2014 08:09:00] "GET /get HTTP/1.1" 200 333
================================================================ 1 failed, 6 passed, 1 warnings in 0.82 seconds ================================================================

Wow, I should've split this into 2 issues:

  1. the KeyError: 'Authorization' error in TestSession.test_session_unicode
  2. the requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",)) errors in TestSessionFlow.test_session_update and TestSessionFlow.test_session_read_only

Maybe I'll go ahead and create 2 separate issues for these and keep this around as a master issue to track those two sub-issues.

msabramo commented 9 years ago

OK, created two separate issues:

  1. https://github.com/jakubroztocil/httpie/issues/283 -- BadStatusLine errors in TestSessionFlow.{test_session_update,test_session_read_only}
  2. https://github.com/jakubroztocil/httpie/issues/282 -- KeyError: 'Authorization' error in TestSession.test_session_unicode

msabramo commented 9 years ago

OK, https://github.com/jakubroztocil/httpie/issues/283 seems to be fixed by upgrading to pytest-httpbin 0.5.0.

So now it's just #282 that needs to be addressed. Personally, I'm pretty busy now and don't see myself having time to work on it in the short-term.

jkbrzt commented 9 years ago

It looks like this can be closed now. Huge thanks, @msabramo! :+1: