Closed stevenctl closed 2 years ago
cc @lambdai
cc @alyssawilk
Thank @stevenctl for drafting this.
@alyssawilk
I would propose tcp_proxy tunnel doesn't set :protocol
by default because tcp_proxy unlikely to proxy websocket
Also, if any protocol override is needed we can relax headers_to_add
here.
This relax aligns with other requirements of setting host headers in tunnel. I have some code implementing the relaxation without impacting the HCM router
Additional issue:
For HTTP 2 CONNECT rfc7540 8.3:
The ":scheme" and ":path" pseudo-header fields MUST be omitted.
AFAICT I can't use xDS to make Envoy initiate a proper H2 CONNECT.
I don't believe that your assesment is correct. The :protocol
pseudo-header is what differentiates "extended" CONNECT
from the "classic" version. Envoy is using the "extended" CONNECT
method for proxying TCP over HTTP/2, so it's correctly using it in this context.
It looks that Go doesn't support the "extended" version, and since it's an extension of the HTTP/2 protocol, it should be negotiated using SETTINGS_ENABLE_CONNECT_PROTOCOL
between HTTP/2 endpoints (here: Envoy & Go), but it sounds that Envoy might be using it without proper negotiation.
However, you cannot really proxy TCP over HTTP/2 without it (or without reinventing a custom framing or another method for that purpose), so it's unclear what should be the fallback.
I believe there is an option to use POST ( for 'legacy' servers ) - and it may be safer to default to it, unless it is explicitly known that the other side supports extended connect. POST will work with almost all http/2 servers, and has equivalent behavior (connect_config: allow_post: true)
Right, bytes can be sent using POST
(or even GET
) method, but that requires prior knowledge.
The upstream headers are prepared after the connection is established. Theoretically we can relies on the negotiated settings.
I am wondering if peeking the negotiated result is overkill.
I don't know if 'POST requires prior knowledge' - it is the most boring POST request, all servers and clients are supposed to support this without any other special handshake. Using extended CONNECT does require knowing that the other side support the extension - POST doesn't since it's universally supported.
No doubt extended CONNECT is more fancy and likely better matching the use case - but why take the extra complexity when a much simpler option exists that works with all existing infrastructure ?
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
It's not resolved:
I think we have two approach to connect to a server that is not catching up with the extended CONNECT (https://datatracker.ietf.org/doc/html/rfc8441#section-3)
We did some investigation of the RFCs and I believe this is not the expected protocol to use in this case (or, at least how we want to use CONNECT).
There are a few relevant RFCs here:
:protocol=bytestream". This is not an approved RFC, it is a draft that expired in 2019 (https://datatracker.ietf.org/doc/draft-kinnear-httpbis-http2-transport/). The only public reference to this RFC or
:protocol=bytestream` I could find is in Envoy itself: https://cs.github.com/?scopeName=All+repos&scope=&q=%2F%3Aprotocol.%2B%3F%3D.%2B%3Fbytestream%2F. This RFC also has the same mention as 844 that this should not be used by proxy servers: https://datatracker.ietf.org/doc/html/draft-kinnear-httpbis-http2-transport-02#section-4.In both of these protocols, negotiation is supposed to be done via SETTINGS_ENABLE_CONNECT_PROTOCOL.
We are hoping to proxy application traffic over a CONNECT tunnel (eg app -> envoy proxy, connect encap -> envoy proxy, connect decap --> app). This currently works fine because we have Envoy on each end, but we have experimented with substituting either end with a non-Envoy proxy and see issues there.
cc @wrowe @DavidSchinazi for thoughts
Also was the first link supposed to be https://datatracker.ietf.org/doc/html/rfc8441 ?
Setting :protocol=bytestream
is incorrect here. If we proxy TCP, then we're using regular CONNECT and not Extended CONNECT. Extended CONNECT exists to support cases that are explicitly not TCP proxying such as WebSocket, see the full list of defined protocols here. The "bytestream" protocol was written up by Apple for an internal use-case they have that does not involve TCP proxying, and they've abandoned that draft since.
Setting
:protocol=bytestream
is incorrect here. If we proxy TCP, then we're using regular CONNECT and not Extended CONNECT.
The regular CONNECT method cannot be used for proxying TCP over HTTP/2, since it requires server to open a TCP connection to the destination, and not to layer it on top of HTTP/2.
Extended CONNECT exists to support cases that are explicitly not TCP proxying such as WebSocket, see the full list of defined protocols here. The "bytestream" protocol was written up by Apple for an internal use-case they have that does not involve TCP proxying, and they've abandoned that draft since.
This is missing quite a bit of context.
RFC8441 supports only WebSockets because that was the use case that needed immediate solution at the time. Before it was finalized, I approached Patrick about adding bytestream
to support TCP over HTTP/2, and he agreed that it was a good idea, but said that adding it to the draft would delay the standardization process for WebSockets, for which browsers needed solution ASAP. My plan was to write a separate RFC for it, but I stopped that work after Apple presented their draft, which also registered bytestream
protocol. Little did I know that they'd later pivot to WebTransport and abandon the original draft.
As such, I believe that using the extended CONNECT method with :protocol=bytestream
is the correct solution here, we "only" need to register the bytestream
.
The regular CONNECT method cannot be used for proxying TCP over HTTP/2, since it requires server to open a TCP connection to the destination, and not to layer it on top of HTTP/2.
I am not sure I understand this. Its in the HTTP/2 spec https://datatracker.ietf.org/doc/html/rfc7540#section-8.3. I have written multiple HTTP2 CONNECT servers/clients in multiple different languages - all seems fine and well supported. Go and rust's hyper library both have code explicitly to handle HTTP/2 CONNECT. AFAIK Chrome also supports this.
As such, I believe that using the extended CONNECT method with :protocol=bytestream is the correct solution here, we "only" need to register the bytestream.
As far as I know Envoy is the only software that uses this protocol, and there is no standard for this. Is CONNECT support in Envoy intended to be an Envoy specific protocol not interoperable with other softwares?
I am not sure I understand this. Its in the HTTP/2 spec https://datatracker.ietf.org/doc/html/rfc7540#section-8.3. I have written multiple HTTP2 CONNECT servers/clients in multiple different languages - all seems fine and well supported. Go and rust's hyper library both have code explicitly to handle HTTP/2 CONNECT. AFAIK Chrome also supports this.
Yes, regular CONNECT is well supported, but it requires server to establish a TCP connection to the destination, i.e.
[ client ] --- TCP over HTTP/2 --> [ proxy ] --- TCP --> [ backend ]
whereas we want to proxy TCP over HTTP/2 across multiple hops, i.e.
[ client ] --- TCP over HTTP/2 --> [ proxy ] --- TCP over HTTP/2 --> [ proxy ] --- TCP over HTTP/2 --> (...)
That is, regular CONNECT is terminating HTTP/2, whereas extended CONNECT is not.
As far as I know Envoy is the only software that uses this protocol, and there is no standard for this. Is CONNECT support in Envoy intended to be an Envoy specific protocol not interoperable with other softwares?
It is meant to be interoperable, but the bytestream
standardization got lost in the process and it should be revived (although, it might be harder to sell now, since they're working on the WebTransport).
@DavidSchinazi ^ Piotr has a valid point for a HTTP2 tunneling proxy. Regular connect implies termination so we can't chain them without violating RFC.
Thanks, I think I finally get why the bytestream
stuff was added, I didn't fully understand until now.
That being said, it still seems like regular CONNECT is useful.
First, isn't this case perfectly valid for Envoy: [ client ] --- TCP over HTTP/2 --> [ proxy ] --- TCP --> [ backend ]
Second, what we want is TCP over HTTP/2 across multiple hops, but not all hops. It seems like the concern is the RFC says
A proxy that supports CONNECT establishes a TCP connection [TCP] to the server identified in the ":authority" pseudo-header field.
And we are not establishing a raw TCP connection but rather another TCP-over-HTTP/2 connection?
But in practice, what we want is
[ client ] --- TCP --> [ proxy ] --- TCP over HTTP/2 --> [ proxy ] --- TCP over HTTP/2 --> [ proxy ] --- TCP --> [ server ]
Which you could interpret still as violating that part of the RFC. However, I think you could also view it as perfectly fine. The fact we have multiple hops is an implementation detail; the RFC doesn't say that a "proxy" must be a single process or cannot other than establish a raw TCP connection (most do a lot more than this, authz, etc).
So really our diagram could look like this:
[ client ] --- TCP --> [ proxy ] --- TCP over HTTP/2 --> [ "proxy" consisting of multiple Envoy hops of TCP over HTTP/2 ] --- TCP --> [ server ]
Which seems perfectly fine.
Even if this were to not be 110% RFC compliant (not convinced it isn't, but if), in practice this is supported by a ton of systems. The bytestream
is not a standard at all and is not supported by anyone. Using an Envoy specific protocol would not be viable for us.
What is important is that any non-istio proxy can understand the protocol in question. So regular connect is fine at the edges of the mesh for termination, but within the mesh, the expectation is to anticipate another hop. Any other proxy would behave correctly at the edge, but would not do the right thing within the mesh, like Piotr described.
within the mesh, the expectation is to anticipate another hop
At any hop we can make any decision, so I don't think we always know the next hop is HTTP/2 or raw TCP.
Also I don't even agree the spec says the next hop cannot be TCP over HTTP/2
:
The payload of any DATA frames sent by the client is transmitted by the proxy to the TCP server; data received from the TCP server is assembled into DATA frames by the proxy
All it says is the payload must be transmitted over a TCP connection, it doesn't say how. TCP over HTTP/2 is "transmitting the payload over a TCP connection".
Also, my understand of Envoy philosophy is that downstream and upstream decoupled. It seems the concern is about receiving CONNECT downstream and then sending CONNECT upstream as well -- isn't that not really Envoy's choice to make? If we chose to configure that and it violates some RFC, that doesn't seem like Envoy's concern.
All it says is the payload must be transmitted over a TCP connection, it doesn't say how. TCP over HTTP/2 is "transmitting the payload over a TCP connection".
It's pretty clear what the intent is, though.
If we chose to configure that and it violates some RFC, that doesn't seem like Envoy's concern.
So the plan is to move away from the extended CONNECT with :protocol=bytestream
(because it's not standardized and hence not interoperable) to a regular CONNECT for which we plan to violate RFC (so it won't be interoperable at all)?
I am not sure I am following how regular CONNECT can be not interoperable? I can definitely see arguments that it doesn't follow the RFC - but the issue here is we are doing HTTP2 on downstream and upstream, right?
From a downstream client perspective, they send the CONNECT and their TCP payload eventually makes it to where they requests; whether the intermediate steps are regular CONNECT or extended CONNECT or plain TCP or FooBarProtocol is invisible to them. From the upstream server perspective, they just receive a standard CONNECT request - they don't know/care that its the first hop or Nth hop this TCP payload has traversed.
whether the intermediate steps are regular CONNECT or extended CONNECT or plain TCP or FooBarProtocol is invisible to them.
So why exactly do you want to replace extended CONNECT with :protocol=bytestream
in those intermediate steps with something that's even less standardized (or purely wrong from RFC point of view)?
So why exactly do you want to replace extended CONNECT with
:protocol=bytestream
in those intermediate steps with something that's even less standardized (or purely wrong from RFC point of view)?
A few reasons...
Our requirements are to use CONNECT for each hop and to interoperate with non-Envoy clients/servers (Java, Go, etc). These other clients do support :protocol=bytestream
, because its not a standard, but do support regular CONNECT (because it is a standard, and even if our usage is 'wrong', it is invisible to the clients). Our options seem to be to use regular CONNECT or wait many years for a new protocol to be approved and supported added to clients we care about?
- I don't agree with your interpretation of the RFC; IMO its perfectly legal to do what we are expecting.
It's not. Can you point me to a single widely used software that does something else than raw TCP for regular CONNECT?
- If its not, it seems a much more straightforward path forward would be to amend the spec to change the wording to allow this, rather than invent a new protocol which would need support propagated everywhere. AFAIK there is no technical concerns with using standard CONNECT in this way other than the RFC language may say its wrong?
I can guarantee you that it would not be easier, and considering that the core HTTP RFCs just went through a multi-year rewrite, I don't see amending regular CONNECT as a valid option. Also, adding bytestream
to the extended CONNECT method would be pretty trivial as far as writting RFCs goes.
- We still need regular CONNECT for regular CONNECT use cases where we are not doing multi-hops.
Agreed. But adding regular CONNECT in addition to extended CONNECT is not what this bug or @alyssawilk's PR (https://github.com/envoyproxy/envoy/pull/21440) is doing. Both are about replacing extended CONNECT with regular CONNECT.
- If its not meeting the spec exactly, its a violation that is 100% invisible to both upstream and downstream (TcpProxy improperly sets
:protocol=bytestream
for HTTP2 #20378 (comment)).
If we don't care about true interoperability, and only about what's visible at the edges, then again, why is it a problem that Envoy uses extended CONNECT with :protocol=bytestream
that's not fully standardized inside the service mesh?
Our requirements are to use CONNECT for each hop and to interoperate with non-Envoy clients/servers (Java, Go, etc). These other clients do support
:protocol=bytestream
, because its not a standard, but do support regular CONNECT (because it is a standard, and even if our usage is 'wrong', it is invisible to the clients). Our options seem to be to use regular CONNECT or wait many years for a new protocol to be approved and supported added to clients we care about?
If we're only talking about client/servers, then why are you even using CONNECT and not raw TCP?
Agreed. But adding regular CONNECT in addition to extended CONNECT is not what this bug or @alyssawilk's PR (https://github.com/envoyproxy/envoy/pull/21440) is doing. Both are about replacing extended CONNECT with regular CONNECT.
I am perfectly fine with supporting both with some flag.
If we don't care about true interoperability, and only what's visible at the edges, then again, why is it a problem that Envoy uses extended CONNECT with :protocol=bytestream that's not fully standardized inside the service mesh?
Say we have C -> P1 -> P2 -> P3 -> P4 -> S. At each hop, proxy can chose to send CONNECT or raw TCP. Our Envoy hops would send CONNECT in most cases, but non-envoy hops are not required to if they refuse to send CONNECT requests in response to downstream CONNECT requests. Each proxy is 100% independent - they can recieve TCP or CONNECT and send TCP or CONNECT - they don't care what other hops do.
If we're only talking about client/servers, then why are you even using CONNECT and not raw TCP?
We have more than just Envoy's in our mesh, we want to support arbitrary non-Envoy components of the mesh which can use our protocol.
We have more than just Envoy's in our mesh, we want to support arbitrary non-Envoy components of the mesh which can use our protocol.
I think we're going in circles, but how exactly is invalid usage of regular CONNECT better than proper usage of the extended CONNECT with not-standardized :protocol=bytestream
for "your protocol" if you want to be interoperable?
I think we're going in circles
Yep, I think so. Probably best to let @DavidSchinazi weigh in - I think we have both expressed our opinions here.
Can you point me to a single widely used software that does something else than raw TCP for regular CONNECT?
global
log stdout format raw local0
log stdout format raw local1 notice
daemon
maxconn 256
defaults
mode http
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
frontend proxy_in
bind 127.0.0.1:8888
option http_proxy
use_backend proxies_out
backend proxies_out
mode http
option forwardfor
#option http_proxy
server ip-1 127.0.0.1:15008 proto h2
HaProxy config. I then have another proxy on 127.0.0.1:15008 (Go)
$ curl https://34.197.247.180:443/get --proxy localhost:8888 -k
{
"url": "https://34.197.247.180/get"
}
So HAProxy is accepting a CONNECT request and sending a CONNECT request in response; this is what we want Envoy to do.
any workaround to apply?
the websocket Request Envoy send to a golang upstream http server always be rejected now with below:
"2022/05/31 21:18:40 http2: invalid pseudo headers: invalid pseudo-header ":protocol""
Sorry for the delay, I was on vacation last week. My apologies, but @PiotrSikora your understanding of CONNECT is incorrect. Here is the relevant text from the latest spec:
CONNECT is intended for use in requests to a proxy. The recipient can establish a tunnel either by directly connecting to the server identified by the request target or, if configured to use another proxy, by forwarding the CONNECT request to the next inbound proxy. An origin server MAY accept a CONNECT request, but most origin servers do not implement CONNECT.
The only correct and standardized way to tunnel TCP in HTTP is regular CONNECT. The is no Extended CONNECT :protocol
that is defined for proxying TCP.
Or from a different perspective: "extended CONNECT" extends the normal CONNECT, which is intended for proxying HTTPS/TCP with extra functionality, like WS or UDP. It is not deprecating or replacing the original CONNECT, just extends it.
On Tue, May 31, 2022 at 8:21 AM David Schinazi @.***> wrote:
Sorry for the delay, I was on vacation last week. My apologies, but @PiotrSikora https://github.com/PiotrSikora your understanding of CONNECT is incorrect. Here is the relevant text from the latest spec https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-semantics-19#section-9.3.6 :
CONNECT is intended for use in requests to a proxy. The recipient can establish a tunnel either by directly connecting to the server identified by the request target or, if configured to use another proxy, by forwarding the CONNECT request to the next inbound proxy. An origin server MAY accept a CONNECT request, but most origin servers do not implement CONNECT.
The only correct and standardized way to tunnel TCP in HTTP is regular CONNECT. The is no Extended CONNECT :protocol that is defined for proxying TCP.
— Reply to this email directly, view it on GitHub https://github.com/envoyproxy/envoy/issues/20378#issuecomment-1142277475, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAUR2VOSQPLLUKH2MKAEODVMYU5ZANCNFSM5Q42HXRA . You are receiving this because you commented.Message ID: @.***>
any workaround to apply?
the websocket Request Envoy send to a golang upstream http server always be rejected now with below:
"2022/05/31 21:18:40 http2: invalid pseudo headers: invalid pseudo-header ":protocol""
This is specifically about non-websocket CONNECT usage. For websockets, Envoy is correct - :protocol
must be used for HTTP2 WS. Go doesn't support this, so you should use HTTP1.
any workaround to apply? the websocket Request Envoy send to a golang upstream http server always be rejected now with below: "2022/05/31 21:18:40 http2: invalid pseudo headers: invalid pseudo-header ":protocol""
This is specifically about non-websocket CONNECT usage. For websockets, Envoy is correct -
:protocol
must be used for HTTP2 WS. Go doesn't support this, so you should use HTTP1.
Thanks Howard! very clear now.
RFC 7540 obseleted by 9113. CONNECT now covered in Section 8.5 https://datatracker.ietf.org/doc/html/rfc9113#section-8.5
@moderation RFC9113 still refers to "TCP streams", so it doesn't clarify things. The draft RFC that David mentioned is an updated HTTP semantics which is decoupled from the underlying transport. I think the concern is that this new HTTP semantics may not be universally supported, but same can be said about Extended CONNECT.
Ah yes. Latest HTTP Semantics RFC 9110 - https://datatracker.ietf.org/doc/html/rfc9110
The latest documents (in particular RFC 9110 - HTTP Semantics and RFC 9113 - HTTP/2) do not actually change the meaning of what was there before, they mainly explain things better. In particular, the meaning of CONNECT
and :protocol
has not changed:
CONNECT
without :protocol
means TCP proxyCONNECT
with :protocol
means other protocols from the HTTP Upgrade Tokens list:protocol=bytestream
is still invalid and shouldn't be used.
TcpProxy improperly sets
:protocol=bytestream
for HTTP2: When usingTunnelingConfig
for an HTTP/2 tunnel, it send an out-of-spec psuedo header.Description:
https://github.com/envoyproxy/envoy/blob/main/source/common/tcp_proxy/upstream.cc#L282-L285
Envoy will always set
:protocol
, but it's not part of the HTTP/2 spec:An extended CONNECT spec does mention it however:
This becomes a problem for picky implementations, like Go/Golang's net/http:
Repro steps:
GODEBUG="http2debug=2"
TunnelingConfig
(UsePost=false)Http2ProtocolOptions
AllowConnect
and astatic
endpoint which is a Golang HTTP2 server.http2: invalid pseudo headers: invalid pseudo-header ":protocol"