Open JerryKwan opened 2 years ago
Let's turn this into an enhancement
, as its a known issue and ConnectionPool
is a WIP, as advertised by its flag description Default: False. (WIP) Enable upstream connection pooling
. Other folks watching a bug
label may get alarmed :D
@JerryKwan Assigned it back to you, it mades more sense as you are working on it.
@JerryKwan This update is unrelated to your PR. Something that I ran into and I wanted to document here for posterity:
ConnectionPool
is now UpstreamConnectionPool
. Today, I tried it with following:
proxy.py
with connection poolpython -m proxy --log-level d --num-acceptors 1 --enable-conn-pool
requests
packageFirst one works fine. Second one runs into ssl.SSLError
, which happens due to lack of support for HTTPS connection re-use. See below for server side debug logs
>>> r = requests.get("https://httpbin.org/get", proxies={"https": "http://localhost:8899"})
>>> r = requests.get("https://httpbin.org/get", proxies={"https": "http://localhost:8899"})
Traceback (most recent call last):
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 696, in urlopen
self._prepare_proxy(conn)
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 964, in _prepare_proxy
conn.connect()
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/urllib3/connection.py", line 416, in connect
self.sock = ssl_wrap_socket(
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "/Users/abhinavsingh/.pyenv/versions/3.10.0/lib/python3.10/ssl.py", line 512, in wrap_socket
return self.sslsocket_class._create(
File "/Users/abhinavsingh/.pyenv/versions/3.10.0/lib/python3.10/ssl.py", line 1070, in _create
self.do_handshake()
File "/Users/abhinavsingh/.pyenv/versions/3.10.0/lib/python3.10/ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: INVALID_ALERT] invalid alert (_ssl.c:997)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/urllib3/util/retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /get (Caused by SSLError(SSLError(1, '[SSL: INVALID_ALERT] invalid alert (_ssl.c:997)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/Users/abhinavsingh/Dev/proxy.py/venv310/lib/python3.10/site-packages/requests/adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /get (Caused by SSLError(SSLError(1, '[SSL: INVALID_ALERT] invalid alert (_ssl.c:997)')))
>>>
2021-12-29 17:48:22,318 - pid:60879 [D] acceptor.accept:119 - Accepting new work#30
2021-12-29 17:48:22,319 - pid:60879 [D] tcp_server.__init__:53 - Work#30 accepted from 127.0.0.1:49386
2021-12-29 17:48:22,319 - pid:60879 [D] handler.initialize:106 - Handling connection 127.0.0.1:49386
2021-12-29 17:48:22,319 - pid:60879 [D] threadless._update_work_events:210 - fd#30 registered for mask#1 by work#30
2021-12-29 17:48:22,320 - pid:60879 [D] handler.handle_readables:279 - Client is read ready, receiving...
2021-12-29 17:48:22,320 - pid:60879 [D] connection.recv:64 - received 36 bytes from client
2021-12-29 17:48:22,320 - pid:60879 [D] server.connect_upstream:598 - Connecting to upstream httpbin.org:443
2021-12-29 17:48:22,615 - pid:60879 [D] pool.acquire:102 - Created new connection#4388713184 for upstream httpbin.org:443
2021-12-29 17:48:22,615 - pid:60879 [D] server.connect_upstream:633 - Connected to upstream httpbin.org:443
2021-12-29 17:48:22,615 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#3 by work#30
2021-12-29 17:48:22,615 - pid:60879 [D] threadless._update_work_events:210 - fd#34 registered for mask#1 by work#30
2021-12-29 17:48:22,616 - pid:60879 [D] handler.handle_writables:249 - Client is write ready, flushing...
2021-12-29 17:48:22,616 - pid:60879 [D] tcp_server.handle_writables:99 - Flushing buffer to client 127.0.0.1:49386
2021-12-29 17:48:22,616 - pid:60879 [D] connection.flush:96 - flushed 39 bytes to client
2021-12-29 17:48:22,616 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#1 by work#30
2021-12-29 17:48:22,625 - pid:60879 [D] handler.handle_readables:279 - Client is read ready, receiving...
2021-12-29 17:48:22,626 - pid:60879 [D] connection.recv:64 - received 517 bytes from client
2021-12-29 17:48:22,626 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#3 by work#30
2021-12-29 17:48:22,626 - pid:60879 [D] server.write_to_descriptors:220 - Server is write ready, flushing...
2021-12-29 17:48:22,626 - pid:60879 [D] connection.flush:96 - flushed 517 bytes to server
2021-12-29 17:48:22,626 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#1 by work#30
2021-12-29 17:48:22,997 - pid:60879 [D] server.read_from_descriptors:257 - Server is read ready, receiving...
2021-12-29 17:48:22,997 - pid:60879 [D] connection.recv:64 - received 5404 bytes from server
2021-12-29 17:48:22,997 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#3 by work#30
2021-12-29 17:48:22,998 - pid:60879 [D] handler.handle_writables:249 - Client is write ready, flushing...
2021-12-29 17:48:22,998 - pid:60879 [D] tcp_server.handle_writables:99 - Flushing buffer to client 127.0.0.1:49386
2021-12-29 17:48:22,998 - pid:60879 [D] connection.flush:96 - flushed 5404 bytes to client
2021-12-29 17:48:22,998 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#1 by work#30
2021-12-29 17:48:22,999 - pid:60879 [D] handler.handle_readables:279 - Client is read ready, receiving...
2021-12-29 17:48:23,000 - pid:60879 [D] connection.recv:64 - received 126 bytes from client
2021-12-29 17:48:23,000 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#3 by work#30
2021-12-29 17:48:23,000 - pid:60879 [D] server.write_to_descriptors:220 - Server is write ready, flushing...
2021-12-29 17:48:23,000 - pid:60879 [D] connection.flush:96 - flushed 126 bytes to server
2021-12-29 17:48:23,001 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#1 by work#30
2021-12-29 17:48:23,268 - pid:60879 [D] server.read_from_descriptors:257 - Server is read ready, receiving...
2021-12-29 17:48:23,269 - pid:60879 [D] connection.recv:64 - received 51 bytes from server
2021-12-29 17:48:23,269 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#3 by work#30
2021-12-29 17:48:23,269 - pid:60879 [D] handler.handle_writables:249 - Client is write ready, flushing...
2021-12-29 17:48:23,270 - pid:60879 [D] tcp_server.handle_writables:99 - Flushing buffer to client 127.0.0.1:49386
2021-12-29 17:48:23,270 - pid:60879 [D] connection.flush:96 - flushed 51 bytes to client
2021-12-29 17:48:23,270 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#1 by work#30
2021-12-29 17:48:23,270 - pid:60879 [D] handler.handle_readables:279 - Client is read ready, receiving...
2021-12-29 17:48:23,271 - pid:60879 [D] connection.recv:64 - received 174 bytes from client
2021-12-29 17:48:23,271 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#3 by work#30
2021-12-29 17:48:23,271 - pid:60879 [D] server.write_to_descriptors:220 - Server is write ready, flushing...
2021-12-29 17:48:23,272 - pid:60879 [D] connection.flush:96 - flushed 174 bytes to server
2021-12-29 17:48:23,272 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#1 by work#30
2021-12-29 17:48:23,610 - pid:60879 [D] server.read_from_descriptors:257 - Server is read ready, receiving...
2021-12-29 17:48:23,610 - pid:60879 [D] connection.recv:64 - received 567 bytes from server
2021-12-29 17:48:23,610 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#3 by work#30
2021-12-29 17:48:23,611 - pid:60879 [D] handler.handle_writables:249 - Client is write ready, flushing...
2021-12-29 17:48:23,611 - pid:60879 [D] tcp_server.handle_writables:99 - Flushing buffer to client 127.0.0.1:49386
2021-12-29 17:48:23,611 - pid:60879 [D] connection.flush:96 - flushed 567 bytes to client
2021-12-29 17:48:23,611 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#1 by work#30
2021-12-29 17:48:23,612 - pid:60879 [D] handler.handle_readables:279 - Client is read ready, receiving...
2021-12-29 17:48:23,612 - pid:60879 [D] tcp_server.handle_readables:114 - Connection closed by client 127.0.0.1:49386
2021-12-29 17:48:23,612 - pid:60879 [D] threadless._cleanup:317 - fd#30 unregistered by work#30
2021-12-29 17:48:23,612 - pid:60879 [D] threadless._cleanup:317 - fd#34 unregistered by work#30
2021-12-29 17:48:23,612 - pid:60879 [I] server.access_log:417 - 127.0.0.1:49386 - CONNECT httpbin.org:443 - 6022 bytes - 1293.04ms
2021-12-29 17:48:23,612 - pid:60879 [D] pool.release:124 - Retaining connection#4388713184 to upstream httpbin.org:443
2021-12-29 17:48:23,612 - pid:60879 [D] handler.shutdown:125 - Closing client connection 127.0.0.1:49386 has buffer False
2021-12-29 17:48:23,612 - pid:60879 [D] handler.shutdown:135 - Client connection shutdown successful
2021-12-29 17:48:23,613 - pid:60879 [D] handler.shutdown:149 - Client connection closed
2021-12-29 17:48:30,718 - pid:60879 [D] acceptor.accept:119 - Accepting new work#30
2021-12-29 17:48:30,733 - pid:60879 [D] tcp_server.__init__:53 - Work#30 accepted from 127.0.0.1:49394
2021-12-29 17:48:30,734 - pid:60879 [D] handler.initialize:106 - Handling connection 127.0.0.1:49394
2021-12-29 17:48:30,734 - pid:60879 [D] threadless._update_work_events:210 - fd#30 registered for mask#1 by work#30
2021-12-29 17:48:30,734 - pid:60879 [D] handler.handle_readables:279 - Client is read ready, receiving...
2021-12-29 17:48:30,734 - pid:60879 [D] connection.recv:64 - received 36 bytes from client
2021-12-29 17:48:30,735 - pid:60879 [D] server.connect_upstream:598 - Connecting to upstream httpbin.org:443
2021-12-29 17:48:30,735 - pid:60879 [D] pool.acquire:95 - Reusing connection#4388713184 for upstream httpbin.org:443
2021-12-29 17:48:30,735 - pid:60879 [I] server.connect_upstream:628 - Reusing connection to upstream httpbin.org:443
2021-12-29 17:48:30,735 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#3 by work#30
2021-12-29 17:48:30,735 - pid:60879 [D] threadless._update_work_events:194 - fd#34 borrowed with mask#1 by work#30
2021-12-29 17:48:30,736 - pid:60879 [D] handler.handle_writables:249 - Client is write ready, flushing...
2021-12-29 17:48:30,736 - pid:60879 [D] tcp_server.handle_writables:99 - Flushing buffer to client 127.0.0.1:49394
2021-12-29 17:48:30,736 - pid:60879 [D] connection.flush:96 - flushed 39 bytes to client
2021-12-29 17:48:30,736 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#1 by work#30
2021-12-29 17:48:30,745 - pid:60879 [D] handler.handle_readables:279 - Client is read ready, receiving...
2021-12-29 17:48:30,745 - pid:60879 [D] connection.recv:64 - received 517 bytes from client
2021-12-29 17:48:30,745 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#3 by work#30
2021-12-29 17:48:30,745 - pid:60879 [D] server.write_to_descriptors:220 - Server is write ready, flushing...
2021-12-29 17:48:30,745 - pid:60879 [D] connection.flush:96 - flushed 517 bytes to server
2021-12-29 17:48:30,746 - pid:60879 [D] threadless._update_work_events:180 - fd#34 modified for mask#1 by work#30
2021-12-29 17:48:31,083 - pid:60879 [D] server.read_from_descriptors:257 - Server is read ready, receiving...
2021-12-29 17:48:31,083 - pid:60879 [D] connection.recv:64 - received 31 bytes from server
2021-12-29 17:48:31,084 - pid:60879 [D] threadless._update_work_events:180 - fd#30 modified for mask#3 by work#30
2021-12-29 17:48:31,084 - pid:60879 [D] handler.handle_writables:249 - Client is write ready, flushing...
2021-12-29 17:48:31,084 - pid:60879 [D] tcp_server.handle_writables:99 - Flushing buffer to client 127.0.0.1:49394
2021-12-29 17:48:31,085 - pid:60879 [D] connection.flush:96 - flushed 31 bytes to client
2021-12-29 17:48:31,085 - pid:60879 [D] server.read_from_descriptors:257 - Server is read ready, receiving...
2021-12-29 17:48:31,085 - pid:60879 [D] server.read_from_descriptors:292 - Server closed connection, tearing down...
2021-12-29 17:48:31,085 - pid:60879 [D] pool.release:117 - Removing connection#4388713184 from pool from upstream httpbin.org:443
2021-12-29 17:48:31,085 - pid:60879 [D] pool._remove:165 - Removing conn#4388713184 from pool
2021-12-29 17:48:31,085 - pid:60879 [D] threadless._cleanup:317 - fd#30 unregistered by work#30
2021-12-29 17:48:31,086 - pid:60879 [D] threadless._cleanup:317 - fd#34 unregistered by work#30
2021-12-29 17:48:31,086 - pid:60879 [I] server.access_log:417 - 127.0.0.1:49394 - CONNECT None:None - 31 bytes - 352.14ms
2021-12-29 17:48:31,086 - pid:60879 [D] handler.shutdown:125 - Closing client connection 127.0.0.1:49394 has buffer False
2021-12-29 17:48:31,086 - pid:60879 [D] handler.shutdown:135 - Client connection shutdown successful
2021-12-29 17:48:31,086 - pid:60879 [D] handler.shutdown:149 - Client connection closed
Solution to the above problem is that, when requests
module reaches handshake step:
self.do_handshake()
^^^ See requests
exception log above to find this line
proxy.py
must dispatch previously caches TLS session ticket. This is necessary because, reused HTTPS connection from the pool has already been SSL wrapped. But client will try to TLS wrap again (a.k.a TLS handshake), leading into protocol level exceptions.
So big question is, can proxy.py
obtain the TLS ticket or handshake information without TLS interception?
@JerryKwan Can you try to parse initial few packets from client to server (and vice-versa) using your TlsProtocolParser
class. Imagine, how Wireshark
is able to help us visualize all the TLS
protocol information without TLS interception. Similar, we might be able to parse the protocol bits just enough to extract out the SSL ticket (probably). Can you verify? Thank you!!!
To summarize, what we need is something like this:
openssl s_client -connect google.com:443 -reconnect -no_ticket -sess_out /tmp/google.session
openssl s_client -connect google.com:443 -reconnect -no_ticket -sess_in /tmp/google.session
cat /tmp/google.session
We need to achieve the same within proxy.py
:)
Any plans to implement this for https?
@abhinavsingh Any plans for https?
@noorul @JerryKwan Thank you for bring my attention to this. However, I am too occupied with my current assignment and hardly finding any bandwidth for proxy.py
these days. I might not be able to return back to it for next month or so atleast.
To summarize, what we need is something like this:
Make a request to google.com, cache SSL session
openssl s_client -connect google.com:443 -reconnect -no_ticket -sess_out /tmp/google.session
Use SSL cache to make another request to google.com
openssl s_client -connect google.com:443 -reconnect -no_ticket -sess_in /tmp/google.session
View SSL Session parameters
cat /tmp/google.session
We need to achieve the same within
proxy.py
:)
Thanks to this summary, we have a clear path forward on how to go about it, but will require some efforts to make it production ready. If someone wants to take a stab at it, I will surely try to find some time for review/merge efforts.
Describe the bug ConnectionPool works well for http requests, but fails for https request.
To Reproduce Steps to reproduce the behavior:
proxy.py
as 'python -m proxy --num-workers 1 --log-level DEBUG --enable-conn-pool'Expected behavior ConnectionPool works well for http and https requests