EDS deduping endpoints with same address but different metadata

stevenctl commented 2 years ago

Title: EDS deduping endpoints with same address but different metadata

Description:

When testing LB behavior and config_dump?include_eds for an EDS cluster with endpoints that have the same address, but different metadata I see that only one endpoint is actually preserved. This behavior makes sense to me, but I can't find documentation that says it should be happening. Assuming it's not correct, the goal is to send traffic to the same internal listener and then using a custom filter on that internal listener extract destination info from passthrough metadata.

Notes:

I have not tested this with socketaddress
STATIC clusters with similar sets of endpoints seem to be ok cc @kyessenov

Repro steps:

Using the static config below, and a modified example xds server that just returns endpoints for "outbound_service"

static config (admin on 15000, points to xds on 15012)

```yaml node: id: test-id cluster: outbound_service admin: address: socket_address: { address: 127.0.0.1, port_value: 15000 } bootstrap_extensions: - name: envoy.bootstrap.internal_listener typed_config: "@type": type.googleapis.com/envoy.extensions.bootstrap.internal_listener.v3.InternalListener layered_runtime: layers: - name: global config static_layer: envoy.reloadable_features.internal_address: 'true' static_resources: listeners: - name: outbound_capture use_original_dst: true address: socket_address: { address: 127.0.0.1, port_value: 15001 } filter_chains: - filters: - name: envoy.filters.network.tcp_proxy typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy stat_prefix: ingress_tcp cluster: outbound_service - name: tunnel_lis internal_listener: {} address: envoy_internal_address: server_listener_name: tunnel_lis filter_chains: - filters: - name: envoy.filters.network.tcp_proxy typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy stat_prefix: ingress_tcp cluster: outbound_tunnel clusters: - name: outbound_service type: EDS lb_policy: ROUND_ROBIN eds_cluster_config: eds_config: resource_api_version: V3 api_config_source: api_type: GRPC transport_api_version: V3 grpc_services: - envoy_grpc: cluster_name: xds_cluster - name: outbound_tunnel # using a custom filter to set original dst type: ORIGINAL_DST lb_policy: CLUSTER_PROVIDED - name: xds_cluster connect_timeout: 0.25s type: STATIC lb_policy: ROUND_ROBIN typed_extension_protocol_options: envoy.extensions.upstreams.http.v3.HttpProtocolOptions: "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions explicit_http_config: http2_protocol_options: connection_keepalive: interval: 30s timeout: 5s upstream_connection_options: # configure a TCP keep-alive to detect and reconnect to the admin # server in the event of a TCP socket half open connection tcp_keepalive: {} load_assignment: cluster_name: xds_cluster endpoints: - lb_endpoints: - endpoint: address: socket_address: address: 127.0.0.1 port_value: 15012 ```

Admin and Stats Output:

I should have two endpoints, with upstream 0 and 1 in metadata


"lb_endpoints": [
         {
          "endpoint": {
           "address": {
            "envoy_internal_address": {
             "server_listener_name": "tunnel_lis"
            }
           },
           "health_check_config": {}
          },
          "health_status": "HEALTHY",
          "metadata": {
           "filter_metadata": {
            "tunnel": {
             "upstream": 0
            }
           }
          },
          "load_balancing_weight": 1
         }
        ]

stevenctl commented 2 years ago

https://github.com/envoyproxy/envoy/blob/087422b811758f3acc8382ef92582d634aa71d9c/source/common/upstream/eds.cc#L136

Looks like this is intentional. Can we make an exception for internal address? Or will that cause other issues.

kyessenov commented 2 years ago

Yes, I think an exception makes sense because internal addresses are not unique between the endpoints because of the extra passthrough state from metadata. So maybe just put a guard there that skips internal addresses? WDYT @adisuissa

kyessenov commented 2 years ago

Fixed by #22274 .

lobkovilya commented 1 year ago

@kyessenov https://github.com/envoyproxy/envoy/pull/22274 allows duplication only for internal endpoints, while the issue states a wider problem with EDS deduping endpoints.

For us, it makes sense to have endpoints with the same address and port but different metadata. Using TransportSocketMatches we set different SNI based on the endpoint's metadata. But the addresses of endpoints can be similar (because they're pointing to the same ingress/egress envoy-based proxy).

Do you think we can reopen the issue to continue the discussion?

lobkovilya commented 1 year ago

Thank you! As I already mentioned it would make sense for us to have endpoints that differ only in metadata. Do you know if there are any fundamental limitations in Envoy that wouldn't allow us to get rid of the deduplication?

kyessenov commented 1 year ago

I see how it could be useful to balance between multiple transports to the same endpoint. The core limitation is in diffing algorithms within EDS/load balancers. When an update arrives, Envoy needs to take a diff with the current set, and it does it by address comparison (things like re-weighting, drainage).

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

lobkovilya commented 1 year ago

@kyessenov could you please mark this as help-wanted?

envoyproxy / envoy

EDS deduping endpoints with same address but different metadata #22417