Open jonathanslenders opened 9 months ago
Thanks @jonathanslenders.
First places I'd start narrowing this down are...
httpx
using a sync client?httpx
async running under trio?requests
, aiohttp
, curl
.)hypercorn
and uvicorn
?Hi @jonathanslenders
it seems to me that the problem is hypercorn
specific
it just happened to me on a similar setup,
i think hypercorn is closing the connection in some unexpected way when using http2
Thank you @tomchristie for looking into it!
I've been diving deeper. Maybe I have a fix (see below).
I think the problem is due to the complexity of terminating TLS connections. It's hard to half-close a TLS connection, and by doing so, the underlying TCP stream is not necessarily closed, which makes it very hard for the other end to observe a half closed TLS connection.
From httpx's point of view, when performing the second request, the HTTPConnectionState
is IDLE
, and _network_stream.get_extra_info('is_readable')
is True
. The TCP connection is still full open in both directions, and the data that is readable at this point could very well be TLS unwrap data.
Our is_socket_readable
check works fine for non-TLS connections, but for a TLS connection, we have to decrypt any pending traffic to see whether the TLS stream was not terminated. Basically, that means passing any buffered data through the ssl.MemoryBio
and see whether an EOF/ssl.SSLZeroReturnError
comes out at the other end, indicating that the other party unwrapped the connection.
From Hypercorn's point of view, I can see that Hypercorn does call asyncio.StreamWriter.close()
after the idle timeout. However, the transport is of type asyncio.sslproto._SSLProtocolTransport
, which means there's no guarantee that the actual socket is closed. Looking at netstat, the TCP socket remains ESTABLISHED even after Hypercorn tries to close the writer. (@pgjones, you might want to look into this.)
Diving into httpcore, and looking at httpcore/_async/http2
, in handle_async_request
, I don't see is_socket_readable
being called at any point in time. Where do we check when the connection is still alive for http/2+TLS connections? I don't think we do that.
The following httpcore patch does detect whether the connection is alive. I don't know which exception to raise, but this could be a starting point:
diff --git a/httpcore/_async/http2.py b/httpcore/_async/http2.py
index 8dc776f..e08e018 100644
--- a/httpcore/_async/http2.py
+++ b/httpcore/_async/http2.py
@@ -136,6 +136,10 @@ class AsyncHTTP2Connection(AsyncConnectionInterface):
self._request_count -= 1
raise ConnectionNotAvailable()
+ alive = await self._network_stream.is_alive()
+ if not alive:
+ raise Exception('Not alive!')
+
try:
kwargs = {"request": request, "stream_id": stream_id}
async with Trace("send_request_headers", logger, request, kwargs):
diff --git a/httpcore/_backends/anyio.py b/httpcore/_backends/anyio.py
index 1ed5228..d18217f 100644
--- a/httpcore/_backends/anyio.py
+++ b/httpcore/_backends/anyio.py
@@ -1,5 +1,6 @@
import ssl
import typing
+from collections import deque
import anyio
@@ -19,10 +20,13 @@ from .base import SOCKET_OPTION, AsyncNetworkBackend, AsyncNetworkStream
class AnyIOStream(AsyncNetworkStream):
def __init__(self, stream: anyio.abc.ByteStream) -> None:
self._stream = stream
+ self._buffer = deque()
async def read(
self, max_bytes: int, timeout: typing.Optional[float] = None
) -> bytes:
+ if len(self._buffer) > 0:
+ return self._buffer.popleft()
exc_map = {
TimeoutError: ReadTimeout,
anyio.BrokenResourceError: ReadError,
@@ -92,6 +96,25 @@ class AnyIOStream(AsyncNetworkStream):
return is_socket_readable(sock)
return None
+ async def is_alive(self) -> bool:
+ """
+ Test whether the connection is still alive. If this is a TLS stream,
+ then that is different from testing whether the socket is still alive.
+ It's possible that the other end did a half-close of the TLS stream,
+ but not a half-close of the underlying TCP stream.
+
+ We test whether it's alive by trying to read data. When we get b"", we
+ know EOF was reached.
+ """
+ try:
+ data = await self.read(max_bytes=1, timeout=0.01)
+ except ReadTimeout:
+ return True
+ if data == b'':
+ return False
+ self._buffer.append(data)
+ return True
+
class AnyIOBackend(AsyncNetworkBackend):
async def connect_tcp(
Looks similar to the previously resolved https://github.com/encode/httpcore/pull/679
Thanks for all the detailed commentary!
When httpx is used to connect to an http/2 server, and the server disconnects, then httpx will attempt to reuse this connection for the following request. It happens when
keepalive_expiry
in the httpx limits is bigger than thekeep_alive_timeout
of the HTTP/2 web server.To reproduce, we can run a Hypercorn http/2 server using a dummy Starlette application like this (TLS is required because of http/2):
Then run this client code:
Important is to ensure
http2=True
is set andkeepalive_expiry
isNone
or a value bigger than the corresponding option of the Hypercorn server.For the second request, httpx tries to reuse the connection from the connection pool although it has been closed by the other side in the meantime and fails, immediately producing the following
Server disconnected
error:I started by commenting in this discussion: https://github.com/encode/httpx/discussions/2056 , but turned it in an issue after I had a clear reproducible case. Feel free to close if it's not appropriate.