Expose envoy TCP statistics for both upstreams and downstreams

johnharris85 commented 1 year ago

Description

We should be able to gather metrics on how the underlying network is performing, so that we can understand how it affects requests passing through our mesh gateway.

We would like the TCP statistics listed here: https://www.envoyproxy.io/docs/envoy/latest/configuration/upstream/cluster_manager/cluster_stats#tcp-statistics exposed for both upstreams and downstreams, by optionally wrapping the DownstreamTlsContext and UpstreamTlsContext definitions in the tcp_stats transport socket. These statistics are only available on Linux, so this would likely need to be configurable via a MeshGatewayInstance policy.

For example, for a downstream socket:

{
   "transport_socket":{
      "name":"envoy.transport_sockets.downstream",
      "typed_config":{
         "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tcp_stats.v3.Config",
         "update_period":"5s",
         "transport_socket":{
            "name":"envoy.transport_sockets.tls",
            "typed_config":{
               "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext",
               "common_tls_context":{
                  "tls_certificates":[
                     {
                        "certificate_chain":{
                           "inline_bytes":"..."
                        },
                        "private_key":{
                           "inline_bytes":"..."
                        }
                     }
                  ],
                  "validation_context":{
                     "trusted_ca":{
                        "inline_bytes":"..."
                     },
                     "match_subject_alt_names":[
                        {
                           "exact":"kuma-cp"
                        }
                     ]
                  }
               },
               "require_client_certificate":true
            }
         }
      }
   }
}
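
The upstream side would presumably mirror this, with the same tcp_stats wrapper around an UpstreamTlsContext on the cluster's transport socket. The fragment below is only a sketch under assumptions: the outer name uses the extension's registered name (envoy.transport_sockets.tcp_stats), the 5s update_period is copied from the downstream example, and the sni and certificate values are placeholders rather than what the control plane would actually generate.

```json
{
   "transport_socket":{
      "name":"envoy.transport_sockets.tcp_stats",
      "typed_config":{
         "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tcp_stats.v3.Config",
         "update_period":"5s",
         "transport_socket":{
            "name":"envoy.transport_sockets.tls",
            "typed_config":{
               "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext",
               "common_tls_context":{
                  "validation_context":{
                     "trusted_ca":{
                        "inline_bytes":"..."
                     }
                  }
               },
               "sni":"..."
            }
         }
      }
   }
}
```

In both directions the inner TLS socket is unchanged, so the wrapping could be applied or skipped per gateway without touching the TLS configuration itself.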

jakubdyszkiewicz commented 1 year ago

Triage: we could potentially put this in a new MeshTrafficMetrics policy #5708.