envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

Cluster: fallback to HTTP/1.1 Websocket in auto_config when RFC 8441 ws over HTTP/2 negotiation fails (or always permit websocket over h1) #37020

Open bmcalary-atlassian opened 2 weeks ago

bmcalary-atlassian commented 2 weeks ago

Description: We have the following cluster configuration to ensure that HTTP/2 is used when signaled via ALPN, while still retaining the ability to fall back to HTTP/1.1 if HTTP/2 is not in the ALPN. This simplifies our xDS in an environment with mixed backends.

      "typed_extension_protocol_options": {
        "envoy.extensions.upstreams.http.v3.HttpProtocolOptions": {
          "@type": "type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions",
          "common_http_protocol_options": {
            "idle_timeout": "55s"
          },
          "auto_config": {
            "http_protocol_options": {
              "header_key_format": {
                "proper_case_words": {
                }
              }
            },
            "http2_protocol_options": {
              "max_concurrent_streams": 128,
              "initial_stream_window_size": 1048576,
              "allow_connect": true
            }
          }
        }
      },

You'll see that allow_connect is present. It enables support for RFC 8441 (WebSocket over HTTP/2 via extended CONNECT), which expects the backend to send SETTINGS_ENABLE_CONNECT_PROTOCOL=1.
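For readers less familiar with RFC 8441, here is a minimal Python sketch of the two pieces involved: checking the peer's SETTINGS for SETTINGS_ENABLE_CONNECT_PROTOCOL (identifier 0x8 in the RFC), and mapping an HTTP/1.1 WebSocket upgrade request onto the extended CONNECT pseudo-headers. The function names and the dict/list representations are assumptions made for illustration; this is not Envoy's implementation.

```python
from typing import Dict, List, Tuple

# Setting identifier defined by RFC 8441, section 3.
SETTINGS_ENABLE_CONNECT_PROTOCOL = 0x8

# HTTP/1.1 headers that RFC 8441 does not carry over to HTTP/2
# (Host is replaced by the :authority pseudo-header).
_DROPPED = {"connection", "upgrade", "host", "sec-websocket-key"}


def peer_supports_extended_connect(settings: Dict[int, int]) -> bool:
    """True if the peer's SETTINGS frame advertised extended CONNECT.

    `settings` is a hypothetical mapping of setting identifier to value.
    """
    return settings.get(SETTINGS_ENABLE_CONNECT_PROTOCOL, 0) == 1


def to_extended_connect(
    authority: str,
    path: str,
    h1_headers: List[Tuple[str, str]],
    scheme: str = "https",
) -> List[Tuple[str, str]]:
    """Build HTTP/2 request headers for a WebSocket extended CONNECT."""
    out = [
        (":method", "CONNECT"),
        (":protocol", "websocket"),
        (":scheme", scheme),
        (":path", path),
        (":authority", authority),
    ]
    # Remaining headers are lower-cased, as HTTP/2 requires.
    out += [(k.lower(), v) for k, v in h1_headers if k.lower() not in _DROPPED]
    return out
```

A backend that never sends the setting (the ALB case below) would make `peer_supports_extended_connect` return False, so the request built by `to_extended_connect` could not be sent on that connection.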

We find that if the backend DOES support HTTP/2 but does NOT support RFC 8441 SETTINGS_ENABLE_CONNECT_PROTOCOL=1 (e.g. AWS ALB) Envoy does not gracefully fall back to using HTTP/1.1 for the websocket, and the connections fail. The only recourse is to disable HTTP/2 for the cluster entirely.

Even if allow_connect is false, Envoy fails to proxy the Websocket connection if HTTP/2 is supported in ALPN.

It appears Envoy lacks logic to infer that a Websocket connection would succeed over HTTP/1.1 in either of these cases: allow_connect: false with ALPN h2, http/1.1; or allow_connect: true with ALPN h2, http/1.1 and a backend that does not signal SETTINGS_ENABLE_CONNECT_PROTOCOL=1.
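The inference described above could be sketched roughly as follows. This is an illustrative Python model only: `choose_upstream_protocol` and its parameters are hypothetical names, not Envoy code, and `backend_protocols` is assumed to be the set of protocols the backend is known to accept. Since ALPN selects a single protocol per TLS connection, an actual fallback would presumably need a separate HTTP/1.1 connection.

```python
from typing import Optional, Sequence


def choose_upstream_protocol(
    backend_protocols: Sequence[str],
    is_websocket: bool,
    allow_connect: bool,
    peer_enable_connect: Optional[bool],
) -> str:
    """Pick "h2" or "http/1.1" for one request (hypothetical helper).

    peer_enable_connect is True/False once the backend's SETTINGS are
    known, or None if they have not been received yet.
    """
    has_h2 = "h2" in backend_protocols
    has_h1 = "http/1.1" in backend_protocols

    if not is_websocket:
        # Ordinary requests: prefer HTTP/2 when it is available.
        return "h2" if has_h2 else "http/1.1"

    # WebSocket over HTTP/2 needs both sides to opt in to RFC 8441.
    if has_h2 and allow_connect and peer_enable_connect is True:
        return "h2"
    # Otherwise fall back to the classic HTTP/1.1 Upgrade handshake.
    if has_h1:
        return "http/1.1"
    raise ValueError("no usable protocol for a WebSocket request")
```

With `["h2", "http/1.1"]` and a backend that does not send SETTINGS_ENABLE_CONNECT_PROTOCOL=1, a WebSocket request resolves to `"http/1.1"` while ordinary requests still resolve to `"h2"`.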

We suggest the following solutions:

Relevant links:
https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/protocol.proto#envoy-v3-api-msg-config-core-v3-http2protocoloptions
https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/http/upgrades.html

KBaichoo commented 2 weeks ago

cc @alyssawilk

alyssawilk commented 1 week ago

It's too bad the protocol negotiation doesn't include feature negotiation, or at least I don't know of any mechanism or plans to add one (cc @RyanTheOptimist). I think it's a reasonable feature, but I think you'd have to add support yourselves.

bmcalary-atlassian commented 1 week ago

@alyssawilk do you have an option preference, by the way?

I personally prefer option 2 (always_use_h1_for_websocket: true), since it seems more definitive and possibly easier to implement.

I worry that options 1 and 3 introduce a round-trip to learn about SETTINGS_ENABLE_CONNECT_PROTOCOL support and would be more complex to implement, even if they would ultimately be more intelligent.

alyssawilk commented 1 week ago

No preference. I'd take it up with the API shepherds, but I wonder if it'd make sense to make it more generic, since for H2 it's all CONNECT requests, not just websocket, right?