aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org
Other
15.13k stars 2.02k forks source link

no_proxy setting is not honored in connector #8549

Open sta4152 opened 3 months ago

sta4152 commented 3 months ago

Describe the bug

We try to connect to a kubernetes cluster, residing behind a proxy that is needed for downloading external resources, but not for connecting to the cluster. In the proxy log we observe access to 10.152.183.1:443, which is the cluster's api server, and whose network is excluded using no_proxy/NO_PROXY environment. We narrowed it doen to (at least) https://github.com/aio-libs/aiohttp/blob/a561fa990427383358b19dc1eabc968e03a95413/aiohttp/connector.py#L928

Here it is asked only, if there is set a proxy. Not considering the target to establish the connection to. And thus not considering the exclusion for the k9s api server address.

To Reproduce

You should be able to test much easier, here is just what we used for testing:

  1. having a kubernetes cluster
  2. create within it a pod having set a http/https_proxy and no_proxy excluding the kubernetes internal networking, e.g. 10.152.183.0/24 (with default k9s networking)
  3. run
    
    api_client = kubernetes.client.api_client.ApiClient()
    custom_objects_api = kubernetes.client.CustomObjectsApi(api_client)
    await custom_objects_api.get_namespaced_custom_object(
                    group="",
                    version="v1",
                    plural="pods",
                    namespace="default",
                    name="test",
                )```

Expected behavior

The proxy must not be accessed for addresses mentioned in no_proxy/NO_PROXY

Logs/tracebacks

Here: run from within a notebook
Please note the line "941 if req.proxy" by which the branch is chosen to contact the proxy server:

---------------------------------------------------------------------------
ConnectionResetError                      Traceback (most recent call last)
File /opt/conda/lib/python3.11/site-packages/aiohttp/connector.py:1137, in TCPConnector._start_tls_connection(self, underlying_transport, req, timeout, client_error)
   1136 try:
-> 1137     tls_transport = await self._loop.start_tls(
   1138         underlying_transport,
   1139         tls_proto,
   1140         sslcontext,
   1141         server_hostname=req.server_hostname or req.host,
   1142         ssl_handshake_timeout=timeout.total,
   1143     )
   1144 except BaseException:
   1145     # We need to close the underlying transport since
   1146     # `start_tls()` probably failed before it had a
   1147     # chance to do this:

File /opt/conda/lib/python3.11/asyncio/base_events.py:1267, in BaseEventLoop.start_tls(self, transport, protocol, sslcontext, server_side, server_hostname, ssl_handshake_timeout, ssl_shutdown_timeout)
   1266 try:
-> 1267     await waiter
   1268 except BaseException:

File /opt/conda/lib/python3.11/asyncio/sslproto.py:575, in SSLProtocol._on_handshake_complete(self, handshake_exc)
    574 else:
--> 575     raise handshake_exc
    577 peercert = sslobj.getpeercert()

ConnectionResetError: 

The above exception was the direct cause of the following exception:

ClientConnectorError                      Traceback (most recent call last)
Cell In[41], line 15
     13 api_client = kubernetes.client.api_client.ApiClient()
     14 custom_objects_api = kubernetes.client.CustomObjectsApi(api_client)
---> 15 await custom_objects_api.get_namespaced_custom_object(
     16                     group="",
     17                     version="v1",
     18                     plural="pods",
     19                     namespace="default",
     20                     name="test",
     21                 )

File /opt/conda/lib/python3.11/site-packages/kubernetes_asyncio/client/api_client.py:185, in ApiClient.__call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_types_map, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout, _host, _request_auth)
    181     url = _host + resource_path
    183 try:
    184     # perform request and return response
--> 185     response_data = await self.request(
    186         method, url, query_params=query_params, headers=header_params,
    187         post_params=post_params, body=body,
    188         _preload_content=_preload_content,
    189         _request_timeout=_request_timeout)
    190 except ApiException as e:
    191     e.body = e.body.decode('utf-8') if six.PY3 else e.body

File /opt/conda/lib/python3.11/site-packages/kubernetes_asyncio/client/rest.py:198, in RESTClientObject.GET(self, url, headers, query_params, _preload_content, _request_timeout)
    196 async def GET(self, url, headers=None, query_params=None,
    197               _preload_content=True, _request_timeout=None):
--> 198     return (await self.request("GET", url,
    199                                headers=headers,
    200                                _preload_content=_preload_content,
    201                                _request_timeout=_request_timeout,
    202                                query_params=query_params))

File /opt/conda/lib/python3.11/site-packages/kubernetes_asyncio/client/rest.py:182, in RESTClientObject.request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
    177         msg = """Cannot prepare a request message for provided
    178                  arguments. Please check that your arguments match
    179                  declared content type."""
    180         raise ApiException(status=0, reason=msg)
--> 182 r = await self.pool_manager.request(**args)
    183 if _preload_content:
    185     data = await r.read()

File /opt/conda/lib/python3.11/site-packages/aiohttp/client.py:581, in ClientSession._request(self, method, str_or_url, params, data, json, cookies, headers, skip_auto_headers, auth, allow_redirects, max_redirects, compress, chunked, expect100, raise_for_status, read_until_eof, proxy, proxy_auth, timeout, verify_ssl, fingerprint, ssl_context, ssl, server_hostname, proxy_headers, trace_request_ctx, read_bufsize, auto_decompress, max_line_size, max_field_size)
    576     async with ceil_timeout(
    577         real_timeout.connect,
    578         ceil_threshold=real_timeout.ceil_threshold,
    579     ):
    580         assert self._connector is not None
--> 581         conn = await self._connector.connect(
    582             req, traces=traces, timeout=real_timeout
    583         )
    584 except asyncio.TimeoutError as exc:
    585     raise ServerTimeoutError(
    586         "Connection timeout " "to host {}".format(url)
    587     ) from exc

File /opt/conda/lib/python3.11/site-packages/aiohttp/connector.py:544, in BaseConnector.connect(self, req, traces, timeout)
    541         await trace.send_connection_create_start()
    543 try:
--> 544     proto = await self._create_connection(req, traces, timeout)
    545     if self._closed:
    546         proto.close()

File /opt/conda/lib/python3.11/site-packages/aiohttp/connector.py:942, in TCPConnector._create_connection(self, req, traces, timeout)
    937 """Create connection.
    938 
    939 Has same keyword arguments as BaseEventLoop.create_connection.
    940 """
    941 if req.proxy:
--> 942     _, proto = await self._create_proxy_connection(req, traces, timeout)
    943 else:
    944     _, proto = await self._create_direct_connection(req, traces, timeout)

File /opt/conda/lib/python3.11/site-packages/aiohttp/connector.py:1379, in TCPConnector._create_proxy_connection(self, req, traces, timeout)
   1369         sslcontext = self._get_ssl_context(req)
   1370         return await self._wrap_create_connection(
   1371             self._factory,
   1372             timeout=timeout,
   (...)
   1376             req=req,
   1377         )
-> 1379     return await self._start_tls_connection(
   1380         # Access the old transport for the last time before it's
   1381         # closed and forgotten forever:
   1382         transport,
   1383         req=req,
   1384         timeout=timeout,
   1385     )
   1386 finally:
   1387     proxy_resp.close()

File /opt/conda/lib/python3.11/site-packages/aiohttp/connector.py:1157, in TCPConnector._start_tls_connection(self, underlying_transport, req, timeout, client_error)
   1155     if exc.errno is None and isinstance(exc, asyncio.TimeoutError):
   1156         raise
-> 1157     raise client_error(req.connection_key, exc) from exc
   1158 except TypeError as type_err:
   1159     # Example cause looks like this:
   1160     # TypeError: transport <asyncio.sslproto._SSLProtocolTransport
   1161     # object at 0x7f760615e460> is not supported by start_tls()
   1163     raise ClientConnectionError(
   1164         "Cannot initialize a TLS-in-TLS connection to host "
   1165         f"{req.host!s}:{req.port:d} through an underlying connection "
   1166         f"to an HTTPS proxy {req.proxy!s} ssl:{req.ssl or 'default'} "
   1167         f"[{type_err!s}]"
   1168     ) from type_err

ClientConnectorError: Cannot connect to host 10.152.183.1:443 ssl:default [None]

Python Version

Python 3.11.6

aiohttp Version

aiohttp 3.9.5

multidict Version

multidict 6.0.5

yarl Version

yarl 1.9.4

OS

Linux testing-0 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Related component

Client

Additional context

No response

Code of Conduct

Dreamsorcerer commented 2 months ago

If you want us to look at this, you'll probably have to create a test (e.g. in test_proxy.py or test_proxy_functional.py) that demonstrates the issue.

sta4152 commented 1 month ago

Sorry, I am no python dev.

I understand the if statement from the report (and the code from the stacktrace), but I am not in the position to author a test for testing this within your framework.

Dreamsorcerer commented 1 month ago

Well step 1 of your reproducer pretty much rules out the chance of anybody looking at this..

sta4152 commented 1 month ago

Sorry again, I am just the indirect user of that library. And this is the only option for me to reproduce it.

But if you have a look to the we narrowed it down to, then it is obvious, that with this ifcondition from you libraries code, the proxy is used for all destinations, without considering the additionally configured no_proxy (AFAIR you even have a utilities method for checking, if for a url the proxy should be used and that considers the no_proxy settings, but it is "just" not used there).

It may happen, that with having fixed it there, this may not be sufficient from top-level point of view (kubernetes cluster), yet. But you shoud be in the position to crete a test for that mentioned piece of code and verify my report for that (that me, as said not being a python dev, is not able to create).

Dreamsorcerer commented 1 month ago

But you shoud be in the position to crete a test for that

To put that into context, this project is over a decade old and written by hundreds of contributors. Speaking for myself, I have never even used the proxy functionality in this project, nevermind written any of it. I have no idea how this NO_PROXY envvar is supposed to work (searching for it in the code shows a mention in the docs and a couple of tests in test_helpers.py, but no reference in the code itself), and don't know how to verify this behaviour.

I'm just telling you what needs to happen to significantly increase your chances of someone else looking at the issue. Unless an issue is directly affecting a maintainer (who is a volunteer with limited time), it's unlikely that they will spend hours trying to figure out your problem. If you can provide a test (or failing that, a copy/paste reproducer that can be immediately run locally), there is a good chance someone will look at fixing it. If not, chances are slim.