envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.99k stars 4.81k forks source link

TcpProxy improperly sets `:protocol=bytestream` for HTTP2 #20378

Closed stevenctl closed 2 years ago

stevenctl commented 2 years ago

TcpProxy improperly sets :protocol=bytestream for HTTP2: When using TunnelingConfig for an HTTP/2 tunnel, it send an out-of-spec psuedo header.

Description:

https://github.com/envoyproxy/envoy/blob/main/source/common/tcp_proxy/upstream.cc#L282-L285

Envoy will always set :protocol, but it's not part of the HTTP/2 spec:

An extended CONNECT spec does mention it however:

This becomes a problem for picky implementations, like Go/Golang's net/http:

Repro steps:

  1. Run. a Go HTTP/2 server with GODEBUG="http2debug=2"
  2. Create a listener with TunnelingConfig (UsePost=false)
  3. Send traffic from the listener to a cluster with Http2ProtocolOptions AllowConnect and a static endpoint which is a Golang HTTP2 server.
  4. You will see http2: invalid pseudo headers: invalid pseudo-header ":protocol"
stevenctl commented 2 years ago

cc @lambdai

ggreenway commented 2 years ago

cc @alyssawilk

lambdai commented 2 years ago

Thank @stevenctl for drafting this.

@alyssawilk I would propose tcp_proxy tunnel doesn't set :protocol by default because tcp_proxy unlikely to proxy websocket

Also, if any protocol override is needed we can relax headers_to_add here.

This relax aligns with other requirements of setting host headers in tunnel. I have some code implementing the relaxation without impacting the HCM router

stevenctl commented 2 years ago

Additional issue:

For HTTP 2 CONNECT rfc7540 8.3:

The ":scheme" and ":path" pseudo-header fields MUST be omitted.

AFAICT I can't use xDS to make Envoy initiate a proper H2 CONNECT.

PiotrSikora commented 2 years ago

I don't believe that your assesment is correct. The :protocol pseudo-header is what differentiates "extended" CONNECT from the "classic" version. Envoy is using the "extended" CONNECT method for proxying TCP over HTTP/2, so it's correctly using it in this context.

It looks that Go doesn't support the "extended" version, and since it's an extension of the HTTP/2 protocol, it should be negotiated using SETTINGS_ENABLE_CONNECT_PROTOCOL between HTTP/2 endpoints (here: Envoy & Go), but it sounds that Envoy might be using it without proper negotiation.

However, you cannot really proxy TCP over HTTP/2 without it (or without reinventing a custom framing or another method for that purpose), so it's unclear what should be the fallback.

costinm commented 2 years ago

I believe there is an option to use POST ( for 'legacy' servers ) - and it may be safer to default to it, unless it is explicitly known that the other side supports extended connect. POST will work with almost all http/2 servers, and has equivalent behavior (connect_config: allow_post: true)

PiotrSikora commented 2 years ago

Right, bytes can be sent using POST (or even GET) method, but that requires prior knowledge.

lambdai commented 2 years ago

The upstream headers are prepared after the connection is established. Theoretically we can relies on the negotiated settings.

I am wondering if peeking the negotiated result is overkill.

costinm commented 2 years ago

I don't know if 'POST requires prior knowledge' - it is the most boring POST request, all servers and clients are supposed to support this without any other special handshake. Using extended CONNECT does require knowing that the other side support the extension - POST doesn't since it's universally supported.

No doubt extended CONNECT is more fancy and likely better matching the use case - but why take the extra complexity when a much simpler option exists that works with all existing infrastructure ?

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] commented 2 years ago

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

lambdai commented 2 years ago

It's not resolved:

I think we have two approach to connect to a server that is not catching up with the extended CONNECT (https://datatracker.ietf.org/doc/html/rfc8441#section-3)

  1. extend the connpool to support subset LB to consume the peer capability
  2. Assuming we pre-know the upstream http server doesn't support the extended CONNECT, add option to use rfc7540 CONNECT, as well as a better error message to the downstream client. For example. an error code other than 200
howardjohn commented 2 years ago

We did some investigation of the RFCs and I believe this is not the expected protocol to use in this case (or, at least how we want to use CONNECT).

There are a few relevant RFCs here:

In both of these protocols, negotiation is supposed to be done via SETTINGS_ENABLE_CONNECT_PROTOCOL.

We are hoping to proxy application traffic over a CONNECT tunnel (eg app -> envoy proxy, connect encap -> envoy proxy, connect decap --> app). This currently works fine because we have Envoy on each end, but we have experimented with substituting either end with a non-Envoy proxy and see issues there.

alyssawilk commented 2 years ago

cc @wrowe @DavidSchinazi for thoughts

alyssawilk commented 2 years ago

Also was the first link supposed to be https://datatracker.ietf.org/doc/html/rfc8441 ?

DavidSchinazi commented 2 years ago

Setting :protocol=bytestream is incorrect here. If we proxy TCP, then we're using regular CONNECT and not Extended CONNECT. Extended CONNECT exists to support cases that are explicitly not TCP proxying such as WebSocket, see the full list of defined protocols here. The "bytestream" protocol was written up by Apple for an internal use-case they have that does not involve TCP proxying, and they've abandoned that draft since.

PiotrSikora commented 2 years ago

Setting :protocol=bytestream is incorrect here. If we proxy TCP, then we're using regular CONNECT and not Extended CONNECT.

The regular CONNECT method cannot be used for proxying TCP over HTTP/2, since it requires server to open a TCP connection to the destination, and not to layer it on top of HTTP/2.

Extended CONNECT exists to support cases that are explicitly not TCP proxying such as WebSocket, see the full list of defined protocols here. The "bytestream" protocol was written up by Apple for an internal use-case they have that does not involve TCP proxying, and they've abandoned that draft since.

This is missing quite a bit of context.

RFC8441 supports only WebSockets because that was the use case that needed immediate solution at the time. Before it was finalized, I approached Patrick about adding bytestream to support TCP over HTTP/2, and he agreed that it was a good idea, but said that adding it to the draft would delay the standardization process for WebSockets, for which browsers needed solution ASAP. My plan was to write a separate RFC for it, but I stopped that work after Apple presented their draft, which also registered bytestream protocol. Little did I know that they'd later pivot to WebTransport and abandon the original draft.

As such, I believe that using the extended CONNECT method with :protocol=bytestream is the correct solution here, we "only" need to register the bytestream.

howardjohn commented 2 years ago

The regular CONNECT method cannot be used for proxying TCP over HTTP/2, since it requires server to open a TCP connection to the destination, and not to layer it on top of HTTP/2.

I am not sure I understand this. Its in the HTTP/2 spec https://datatracker.ietf.org/doc/html/rfc7540#section-8.3. I have written multiple HTTP2 CONNECT servers/clients in multiple different languages - all seems fine and well supported. Go and rust's hyper library both have code explicitly to handle HTTP/2 CONNECT. AFAIK Chrome also supports this.

As such, I believe that using the extended CONNECT method with :protocol=bytestream is the correct solution here, we "only" need to register the bytestream.

As far as I know Envoy is the only software that uses this protocol, and there is no standard for this. Is CONNECT support in Envoy intended to be an Envoy specific protocol not interoperable with other softwares?

PiotrSikora commented 2 years ago

I am not sure I understand this. Its in the HTTP/2 spec https://datatracker.ietf.org/doc/html/rfc7540#section-8.3. I have written multiple HTTP2 CONNECT servers/clients in multiple different languages - all seems fine and well supported. Go and rust's hyper library both have code explicitly to handle HTTP/2 CONNECT. AFAIK Chrome also supports this.

Yes, regular CONNECT is well supported, but it requires server to establish a TCP connection to the destination, i.e.

[ client ] --- TCP over HTTP/2 --> [ proxy ] --- TCP --> [ backend ]

whereas we want to proxy TCP over HTTP/2 across multiple hops, i.e.

[ client ] --- TCP over HTTP/2 --> [ proxy ] --- TCP over HTTP/2 --> [ proxy ] --- TCP over HTTP/2 --> (...)

That is, regular CONNECT is terminating HTTP/2, whereas extended CONNECT is not.

As far as I know Envoy is the only software that uses this protocol, and there is no standard for this. Is CONNECT support in Envoy intended to be an Envoy specific protocol not interoperable with other softwares?

It is meant to be interoperable, but the bytestream standardization got lost in the process and it should be revived (although, it might be harder to sell now, since they're working on the WebTransport).

kyessenov commented 2 years ago

@DavidSchinazi ^ Piotr has a valid point for a HTTP2 tunneling proxy. Regular connect implies termination so we can't chain them without violating RFC.

howardjohn commented 2 years ago

Thanks, I think I finally get why the bytestream stuff was added, I didn't fully understand until now.

That being said, it still seems like regular CONNECT is useful.

First, isn't this case perfectly valid for Envoy: [ client ] --- TCP over HTTP/2 --> [ proxy ] --- TCP --> [ backend ]

Second, what we want is TCP over HTTP/2 across multiple hops, but not all hops. It seems like the concern is the RFC says

A proxy that supports CONNECT establishes a TCP connection [TCP] to the server identified in the ":authority" pseudo-header field.

And we are not establishing a raw TCP connection but rather another TCP-over-HTTP/2 connection?

But in practice, what we want is

[ client ] --- TCP  --> [ proxy ] --- TCP over HTTP/2 --> [ proxy ] --- TCP over HTTP/2 --> [ proxy ]   --- TCP --> [ server ]

Which you could interpret still as violating that part of the RFC. However, I think you could also view it as perfectly fine. The fact we have multiple hops is an implementation detail; the RFC doesn't say that a "proxy" must be a single process or cannot other than establish a raw TCP connection (most do a lot more than this, authz, etc).

So really our diagram could look like this:

[ client ] --- TCP  --> [ proxy ] --- TCP over HTTP/2 --> [ "proxy" consisting of multiple Envoy hops of TCP over HTTP/2 ]   --- TCP --> [ server ]

Which seems perfectly fine.

Even if this were to not be 110% RFC compliant (not convinced it isn't, but if), in practice this is supported by a ton of systems. The bytestream is not a standard at all and is not supported by anyone. Using an Envoy specific protocol would not be viable for us.

kyessenov commented 2 years ago

What is important is that any non-istio proxy can understand the protocol in question. So regular connect is fine at the edges of the mesh for termination, but within the mesh, the expectation is to anticipate another hop. Any other proxy would behave correctly at the edge, but would not do the right thing within the mesh, like Piotr described.

howardjohn commented 2 years ago

within the mesh, the expectation is to anticipate another hop

At any hop we can make any decision, so I don't think we always know the next hop is HTTP/2 or raw TCP.

Also I don't even agree the spec says the next hop cannot be TCP over HTTP/2:

The payload of any DATA frames sent by the client is transmitted by the proxy to the TCP server; data received from the TCP server is assembled into DATA frames by the proxy

All it says is the payload must be transmitted over a TCP connection, it doesn't say how. TCP over HTTP/2 is "transmitting the payload over a TCP connection".

Also, my understand of Envoy philosophy is that downstream and upstream decoupled. It seems the concern is about receiving CONNECT downstream and then sending CONNECT upstream as well -- isn't that not really Envoy's choice to make? If we chose to configure that and it violates some RFC, that doesn't seem like Envoy's concern.

PiotrSikora commented 2 years ago

All it says is the payload must be transmitted over a TCP connection, it doesn't say how. TCP over HTTP/2 is "transmitting the payload over a TCP connection".

It's pretty clear what the intent is, though.

If we chose to configure that and it violates some RFC, that doesn't seem like Envoy's concern.

So the plan is to move away from the extended CONNECT with :protocol=bytestream (because it's not standardized and hence not interoperable) to a regular CONNECT for which we plan to violate RFC (so it won't be interoperable at all)?

howardjohn commented 2 years ago

I am not sure I am following how regular CONNECT can be not interoperable? I can definitely see arguments that it doesn't follow the RFC - but the issue here is we are doing HTTP2 on downstream and upstream, right?

From a downstream client perspective, they send the CONNECT and their TCP payload eventually makes it to where they requests; whether the intermediate steps are regular CONNECT or extended CONNECT or plain TCP or FooBarProtocol is invisible to them. From the upstream server perspective, they just receive a standard CONNECT request - they don't know/care that its the first hop or Nth hop this TCP payload has traversed.

PiotrSikora commented 2 years ago

whether the intermediate steps are regular CONNECT or extended CONNECT or plain TCP or FooBarProtocol is invisible to them.

So why exactly do you want to replace extended CONNECT with :protocol=bytestream in those intermediate steps with something that's even less standardized (or purely wrong from RFC point of view)?

howardjohn commented 2 years ago

So why exactly do you want to replace extended CONNECT with :protocol=bytestream in those intermediate steps with something that's even less standardized (or purely wrong from RFC point of view)?

A few reasons...

  1. I don't agree with your interpretation of the RFC; IMO its perfectly legal to do what we are expecting.
  2. If its not, it seems a much more straightforward path forward would be to amend the spec to change the wording to allow this, rather than invent a new protocol which would need support propagated everywhere. AFAIK there is no technical concerns with using standard CONNECT in this way other than the RFC language may say its wrong?
  3. We still need regular CONNECT for regular CONNECT use cases where we are not doing multi-hops.
  4. If its not meeting the spec exactly, its a violation that is 100% invisible to both upstream and downstream (https://github.com/envoyproxy/envoy/issues/20378#issuecomment-1139782620).

Our requirements are to use CONNECT for each hop and to interoperate with non-Envoy clients/servers (Java, Go, etc). These other clients do support :protocol=bytestream, because its not a standard, but do support regular CONNECT (because it is a standard, and even if our usage is 'wrong', it is invisible to the clients). Our options seem to be to use regular CONNECT or wait many years for a new protocol to be approved and supported added to clients we care about?

PiotrSikora commented 2 years ago
  1. I don't agree with your interpretation of the RFC; IMO its perfectly legal to do what we are expecting.

It's not. Can you point me to a single widely used software that does something else than raw TCP for regular CONNECT?

  1. If its not, it seems a much more straightforward path forward would be to amend the spec to change the wording to allow this, rather than invent a new protocol which would need support propagated everywhere. AFAIK there is no technical concerns with using standard CONNECT in this way other than the RFC language may say its wrong?

I can guarantee you that it would not be easier, and considering that the core HTTP RFCs just went through a multi-year rewrite, I don't see amending regular CONNECT as a valid option. Also, adding bytestream to the extended CONNECT method would be pretty trivial as far as writting RFCs goes.

  1. We still need regular CONNECT for regular CONNECT use cases where we are not doing multi-hops.

Agreed. But adding regular CONNECT in addition to extended CONNECT is not what this bug or @alyssawilk's PR (https://github.com/envoyproxy/envoy/pull/21440) is doing. Both are about replacing extended CONNECT with regular CONNECT.

  1. If its not meeting the spec exactly, its a violation that is 100% invisible to both upstream and downstream (TcpProxy improperly sets :protocol=bytestream for HTTP2 #20378 (comment)).

If we don't care about true interoperability, and only about what's visible at the edges, then again, why is it a problem that Envoy uses extended CONNECT with :protocol=bytestream that's not fully standardized inside the service mesh?

Our requirements are to use CONNECT for each hop and to interoperate with non-Envoy clients/servers (Java, Go, etc). These other clients do support :protocol=bytestream, because its not a standard, but do support regular CONNECT (because it is a standard, and even if our usage is 'wrong', it is invisible to the clients). Our options seem to be to use regular CONNECT or wait many years for a new protocol to be approved and supported added to clients we care about?

If we're only talking about client/servers, then why are you even using CONNECT and not raw TCP?

howardjohn commented 2 years ago

Agreed. But adding regular CONNECT in addition to extended CONNECT is not what this bug or @alyssawilk's PR (https://github.com/envoyproxy/envoy/pull/21440) is doing. Both are about replacing extended CONNECT with regular CONNECT.

I am perfectly fine with supporting both with some flag.

If we don't care about true interoperability, and only what's visible at the edges, then again, why is it a problem that Envoy uses extended CONNECT with :protocol=bytestream that's not fully standardized inside the service mesh?

Say we have C -> P1 -> P2 -> P3 -> P4 -> S. At each hop, proxy can chose to send CONNECT or raw TCP. Our Envoy hops would send CONNECT in most cases, but non-envoy hops are not required to if they refuse to send CONNECT requests in response to downstream CONNECT requests. Each proxy is 100% independent - they can recieve TCP or CONNECT and send TCP or CONNECT - they don't care what other hops do.

If we're only talking about client/servers, then why are you even using CONNECT and not raw TCP?

We have more than just Envoy's in our mesh, we want to support arbitrary non-Envoy components of the mesh which can use our protocol.

PiotrSikora commented 2 years ago

We have more than just Envoy's in our mesh, we want to support arbitrary non-Envoy components of the mesh which can use our protocol.

I think we're going in circles, but how exactly is invalid usage of regular CONNECT better than proper usage of the extended CONNECT with not-standardized :protocol=bytestream for "your protocol" if you want to be interoperable?

howardjohn commented 2 years ago

I think we're going in circles

Yep, I think so. Probably best to let @DavidSchinazi weigh in - I think we have both expressed our opinions here.

Can you point me to a single widely used software that does something else than raw TCP for regular CONNECT?

global 
 log stdout format raw local0
 log stdout format raw local1 notice
 daemon 
 maxconn 256 

defaults 
 mode http 
 timeout connect 5000ms 
 timeout client 50000ms 
 timeout server 50000ms 

frontend proxy_in 
 bind 127.0.0.1:8888 
 option http_proxy
 use_backend proxies_out

backend proxies_out 
 mode http
 option forwardfor
 #option http_proxy
 server ip-1 127.0.0.1:15008 proto h2

HaProxy config. I then have another proxy on 127.0.0.1:15008 (Go)

$ curl https://34.197.247.180:443/get --proxy localhost:8888 -k
{
  "url": "https://34.197.247.180/get"
}

So HAProxy is accepting a CONNECT request and sending a CONNECT request in response; this is what we want Envoy to do.

mwangtarget commented 2 years ago

any workaround to apply?

the websocket Request Envoy send to a golang upstream http server always be rejected now with below:

"2022/05/31 21:18:40 http2: invalid pseudo headers: invalid pseudo-header ":protocol""

DavidSchinazi commented 2 years ago

Sorry for the delay, I was on vacation last week. My apologies, but @PiotrSikora your understanding of CONNECT is incorrect. Here is the relevant text from the latest spec:

CONNECT is intended for use in requests to a proxy. The recipient can establish a tunnel either by directly connecting to the server identified by the request target or, if configured to use another proxy, by forwarding the CONNECT request to the next inbound proxy. An origin server MAY accept a CONNECT request, but most origin servers do not implement CONNECT.

The only correct and standardized way to tunnel TCP in HTTP is regular CONNECT. The is no Extended CONNECT :protocol that is defined for proxying TCP.

costinm commented 2 years ago

Or from a different perspective: "extended CONNECT" extends the normal CONNECT, which is intended for proxying HTTPS/TCP with extra functionality, like WS or UDP. It is not deprecating or replacing the original CONNECT, just extends it.

On Tue, May 31, 2022 at 8:21 AM David Schinazi @.***> wrote:

Sorry for the delay, I was on vacation last week. My apologies, but @PiotrSikora https://github.com/PiotrSikora your understanding of CONNECT is incorrect. Here is the relevant text from the latest spec https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-semantics-19#section-9.3.6 :

CONNECT is intended for use in requests to a proxy. The recipient can establish a tunnel either by directly connecting to the server identified by the request target or, if configured to use another proxy, by forwarding the CONNECT request to the next inbound proxy. An origin server MAY accept a CONNECT request, but most origin servers do not implement CONNECT.

The only correct and standardized way to tunnel TCP in HTTP is regular CONNECT. The is no Extended CONNECT :protocol that is defined for proxying TCP.

— Reply to this email directly, view it on GitHub https://github.com/envoyproxy/envoy/issues/20378#issuecomment-1142277475, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAUR2VOSQPLLUKH2MKAEODVMYU5ZANCNFSM5Q42HXRA . You are receiving this because you commented.Message ID: @.***>

howardjohn commented 2 years ago

any workaround to apply?

the websocket Request Envoy send to a golang upstream http server always be rejected now with below:

"2022/05/31 21:18:40 http2: invalid pseudo headers: invalid pseudo-header ":protocol""

This is specifically about non-websocket CONNECT usage. For websockets, Envoy is correct - :protocol must be used for HTTP2 WS. Go doesn't support this, so you should use HTTP1.

mwangtarget commented 2 years ago

any workaround to apply? the websocket Request Envoy send to a golang upstream http server always be rejected now with below: "2022/05/31 21:18:40 http2: invalid pseudo headers: invalid pseudo-header ":protocol""

This is specifically about non-websocket CONNECT usage. For websockets, Envoy is correct - :protocol must be used for HTTP2 WS. Go doesn't support this, so you should use HTTP1.

Thanks Howard! very clear now.

moderation commented 2 years ago

RFC 7540 obseleted by 9113. CONNECT now covered in Section 8.5 https://datatracker.ietf.org/doc/html/rfc9113#section-8.5

kyessenov commented 2 years ago

@moderation RFC9113 still refers to "TCP streams", so it doesn't clarify things. The draft RFC that David mentioned is an updated HTTP semantics which is decoupled from the underlying transport. I think the concern is that this new HTTP semantics may not be universally supported, but same can be said about Extended CONNECT.

moderation commented 2 years ago

Ah yes. Latest HTTP Semantics RFC 9110 - https://datatracker.ietf.org/doc/html/rfc9110

alyssawilk commented 2 years ago

https://github.com/envoyproxy/envoy/pull/21613 thanks!

DavidSchinazi commented 2 years ago

The latest documents (in particular RFC 9110 - HTTP Semantics and RFC 9113 - HTTP/2) do not actually change the meaning of what was there before, they mainly explain things better. In particular, the meaning of CONNECT and :protocol has not changed:

:protocol=bytestream is still invalid and shouldn't be used.