Open AdrianoKF opened 1 year ago
I have experienced similar problems with gRPC over HTTP, which uses HTTP/2 as the transport protocol.
@AdrianoKF What were the symptoms? I'm also debugging a gRPC issue. It works fine on all machines I have access to, but one of our users keeps getting "connectex: No connection could be made because the target machine actively refused it." despite everything working fine when we test with, e.g., nginx.
In my case I received an application-level gRPC error message, which caused all gRPC calls to fail:
rpc error: code = Unavailable desc = connection closed before server preface received
I tracked down the origin by capturing traffic inside the container using Wireshark/tshark, where I noticed the error message you are also seeing as a response packet to the initial HTTP/2 magic request.
It's a bit tricky to debug depending on the tool used to investigate, since curl, e.g., will send an HTTP/1.1 request with an Upgrade header to establish an HTTP/2 connection, instead of sending out HTTP/2 traffic right away. Also, packet captures are quite different inside the container and on the host (as is expected from my understanding of the vpnkit architecture, which I only learned about after my initial root cause analysis).
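To make the difference between the two connection styles concrete, here is a minimal sketch in Python of the bytes each one puts on the wire first. The helper names (`h2c_upgrade_request`, `has_host_header`) and the empty `HTTP2-Settings` value are illustrative assumptions, not what curl or any gRPC client sends verbatim. The point is only that an Upgrade-style request carries a Host header, while the prior-knowledge preface does not, so a tool that uses the Upgrade path may not hit the same code path in a proxy.

```python
# Client connection preface defined in RFC 9113, section 3.4 (24 octets).
H2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


def h2c_upgrade_request(host: str, path: str = "/") -> bytes:
    """Build a simplified HTTP/1.1 request asking to upgrade to cleartext HTTP/2."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        "HTTP2-Settings: ",  # assumption: empty SETTINGS payload, for illustration
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def has_host_header(raw: bytes) -> bool:
    """Report whether the first header block of raw contains a Host header."""
    head = raw.split(b"\r\n\r\n", 1)[0]
    header_lines = head.split(b"\r\n")[1:]  # skip the request line
    return any(line.lower().startswith(b"host:") for line in header_lines)
```

Under these assumptions, `has_host_header(h2c_upgrade_request("example.com"))` is true, while `has_host_header(H2_PREFACE)` is false, because the preface consists of nothing but a request-line-like first line followed by `SM`.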
OK, this looks different: we can't even connect to the port, which is super strange since we don't seem to be able to reproduce it with, e.g., an nginx container. I've filed #13283 for that, but maybe it has a similar root cause in vpnkit.
There hasn't been any activity on this issue for a long time.
If the problem is still relevant, mark the issue as fresh with a /remove-lifecycle stale comment.
If not, this issue will be closed in 30 days.
Prevent issues from auto-closing with a /lifecycle frozen comment.
/lifecycle stale
/remove-lifecycle stale
3D9D6149-F0AF-45B6-872A-B9745FBF1289/20230222145441
Actual behavior
Note: See below for steps to reproduce; for simplicity, this runs a command inside an alpine:latest container.
Trying to make an HTTP/1.0 request without a Host header fails with an error message of unclear origin (see the Information section below for my best guess):
Expected behavior
Using curl to access httpbin should return the HTTP response from the server:
Information
This behavior is reproducible and happens for all outgoing HTTP traffic on port 80.
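A request of the kind described above (HTTP/1.0, no Host header) can be sketched in Python as follows. This is only an illustration of the request shape, not the original reproduction. The function names `build_http10_request` and `send_raw` are hypothetical, and the target host passed to `send_raw` is up to the reader.

```python
import socket


def build_http10_request(path: str = "/") -> bytes:
    """Return a minimal HTTP/1.0 GET request with no Host header."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def send_raw(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, send payload unchanged, and return the raw reply."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `send_raw(host, 80, build_http10_request("/get"))` from inside a container and again from a machine without the transparent proxy in the path is one way to compare the raw responses byte for byte.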
I have been able to reproduce it on Docker Desktop running on Windows 11 (running on bare metal) with the WSL2 backend. From what I can tell, the error is caused by the missing Host header, which seems to upset the transparent HTTP proxying happening inside vpnkit (that's as far as I managed to understand the root cause; happy to report my findings if it helps). The error message mentions <nil>:80, which seems to indicate that the proxy unsuccessfully tried to determine the target of the HTTP request and just fell back to an empty value instead.
This behavior breaks two valid use cases:
- HTTP/1.0 clients that send requests without a Host header (which is not required in RFC 1945)
- HTTP/2 clients that send the PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n string in their connection preface (see RFC 9113, section 3.4; this is how I encountered the bug: it broke a gRPC call to a server running unencrypted gRPC, which uses HTTP/2 as its transport, over port 80)
As a side note: the internal behavior of the transparent proxy also leads to some additional weird behavior in cases where the "original" TCP endpoint and the Host header disagree, causing the proxy to completely disregard the original destination (other than to send a SYN packet to see if the connection can be established):
The response is actually returned from google.com, despite the request clearly being intended for facebook.com. Regardless of the above issues, this is very surprising behavior in its own right.
Output of & "C:\Program Files\Docker\Docker\resources\com.docker.diagnose.exe" check
Steps to reproduce the behavior
Execute on Windows with the WSL2 backend; run docker build --no-cache --progress=plain . to see the relevant output: