envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.86k stars 4.78k forks source link

Envoy doesn’t increase rq_xx metrics when upstream_reset_before_response_started occured #12986

Open gawarz opened 4 years ago

gawarz commented 4 years ago

Title: Envoy doesn’t increase rq_xx metrics when upstream_reset_before_response_started occured

Description:

We send all requests to service via envoy. Clients receive responses with 503 http status. These responses aren’t returned by service and there are no metrics increased for these responses.When log level is set up to debug and access log is enabled we can see reason of this behavior: Sending local reply with details upstream_reset_before_response_started{connection failure} Code responsible for handling early upstream reset: https://github.com/envoyproxy/envoy/blob/bd2b989c578b2472faaff44902573e5b187f671f/source/common/router/router.cc#L1067-L1071 Expected behavior:

Envoy should increase the following metrics: envoy_cluster_upstream_rq_xx{envoy_cluster_name="local_service" … } envoy_http_downstream_rq_xx{envoy_http_conn_manager_prefix="ingress_http” ...}

I attached envoy logs and configuration for local_service,ingress_listener. Sensitive data has been redacted.

Config:

{
  "@type": "type.googleapis.com/envoy.admin.v3.ClustersConfigDump",
  "version_info": "b3285327106747cfbe2dea29bb588773",
  "static_clusters": [
  {
    "cluster": {
      "@type": "type.googleapis.com/envoy.api.v2.Cluster",
      "name": "local_service",
      "type": "STATIC",
      "connect_timeout": "1s",
      "circuit_breakers": {
        "thresholds": [
        {
          "max_connections": 4096,
          "max_requests": 4096
        },
        {
          "priority": "HIGH",
          "max_connections": 4096,
          "max_requests": 4096
        }
        ]
      },
      "load_assignment": {
        "cluster_name": "local_service",
        "endpoints": [
        {
          "lb_endpoints": [
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "127.0.0.1",
                  "port_value": 31359
                }
              }
            }
          }
          ]
        }
        ]
      }
    },
    "last_updated": "2020-08-31T13:38:08.692Z"
  }
  ]

  {
    "@type": "type.googleapis.com/envoy.admin.v3.ListenersConfigDump",
    "version_info": "052d73809a554b9892bd66dcefd107d6",
    "dynamic_listeners": [
    {
      "name": "ingress_listener",
      "active_state": {
        "version_info": "052d73809a554b9892bd66dcefd107d6",
        "listener": {
          "@type": "type.googleapis.com/envoy.api.v2.Listener",
          "name": "ingress_listener",
          "address": {
            "socket_address": {
              "address": "0.0.0.0",
              "port_value": 31361
            }
          },
          "filter_chains": [
          {
            "filter_chain_match": {
              "transport_protocol": "raw_buffer"
            },
            "filters": [
            {
              "name": "envoy.http_connection_manager",
              "typed_config": {
                "@type": "type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager",
                "stat_prefix": "ingress_http",
                "rds": {
                  "config_source": {
                    "ads": {},
                    "initial_fetch_timeout": "30s"
                  },
                  "route_config_name": "ingress_secured_routes"
                },
                "http_filters": [
                {
                  "name": "envoy.router"
                }
                ],
                "http_protocol_options": {
                  "accept_http_10": true,
                  "default_host_for_http_10": "example-service"
                },
                "use_remote_address": true,
                "xff_num_trusted_hops": 1,
                "delayed_close_timeout": "0s",
                "common_http_protocol_options": {
                  "idle_timeout": "120s"
                }
              }
            }
            ]
          }
          ]
        },
        "last_updated": "2020-08-31T13:38:08.779Z"
      }
    }
    ]

Logs:

2020-08-28 10:29:56.195 DEBUG   [source/common/http/conn_manager_impl.cc:1710] [C10527][S15611812368749396950] encoding headers via codec (end_stream=false):
':status', '503'
'content-length', '91'
'content-type', 'text/plain'
'date', 'Fri, 28 Aug 2020 08:29:55 GMT'
'server', 'envoy'

2020-08-28 10:29:56.195 DEBUG   [source/common/router/router.cc:1018] [C10527][S15611812368749396950] upstream reset: reset reason connection failure
2020-08-28 10:29:56.195 DEBUG   [source/common/http/conn_manager_impl.cc:1479] [C10527][S15611812368749396950] Sending local reply with details upstream_reset_before_response_started{connection failure}
2020-08-28 10:29:56.171 DEBUG   [source/common/router/router.cc:1390] [C10527][S15611812368749396950] performing retry
2020-08-28 10:29:56.171 DEBUG   [source/common/router/router.cc:1018] [C10527][S15611812368749396950] upstream reset: reset reason connection failure
2020-08-28 10:29:56.170 DEBUG   [source/common/router/router.cc:634] [C10527][S15611812368749396950] router decoding headers:
':path', '/example/path'
':method', 'POST'
'x-envoy-external-address', '127.0.0.1'
'accept', 'application/json'
'content-type', 'application/json; charset=utf-8'
'x-envoy-expected-rq-timeout-ms', '900000'

2020-08-28 10:29:56.170 DEBUG   [source/common/http/conn_manager_impl.cc:782] [C10527][S15611812368749396950] request headers complete (end_stream=false):
':path', '/example/path'
':method', 'POST'
'x-envoy-external-address', '1.2.3.4'
'accept', 'application/json'
'x-envoy-expected-rq-timeout-ms', '900000'
'content-type', 'application/json; charset=utf-8'

2020-08-28 10:29:56.170 DEBUG   [source/common/router/router.cc:477] [C10527][S15611812368749396950] cluster 'local_service' match for URL '/example/path'
2020-08-28 10:29:56.170 DEBUG   [source/common/http/conn_manager_impl.cc:1337] [C10527][S15611812368749396950] request end stream
nlamirault commented 1 year ago

Any news on this bugfix @snowp ?