envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

Circuit breakers trigger outlier detection #25487

Closed Hexta closed 1 year ago

Hexta commented 1 year ago

Title: Circuit breakers trigger outlier detection

Description: When the max_requests circuit breaker trips, outlier detection ejects unrelated endpoints.

Repro steps:

Admin and Stats Output:

cluster.a.outlier_detection.ejections_active: 1
cluster.a.outlier_detection.ejections_consecutive_5xx: 0
cluster.a.outlier_detection.ejections_detected_consecutive_5xx: 0
cluster.a.outlier_detection.ejections_detected_consecutive_gateway_failure: 0
cluster.a.outlier_detection.ejections_detected_consecutive_local_origin_failure: 1
cluster.a.outlier_detection.ejections_detected_failure_percentage: 0
cluster.a.outlier_detection.ejections_detected_local_origin_failure_percentage: 0
cluster.a.outlier_detection.ejections_detected_local_origin_success_rate: 0
cluster.a.outlier_detection.ejections_detected_success_rate: 0
cluster.a.outlier_detection.ejections_enforced_consecutive_5xx: 0
cluster.a.outlier_detection.ejections_enforced_consecutive_gateway_failure: 0
cluster.a.outlier_detection.ejections_enforced_consecutive_local_origin_failure: 1
cluster.a.outlier_detection.ejections_enforced_failure_percentage: 0
cluster.a.outlier_detection.ejections_enforced_local_origin_failure_percentage: 0
cluster.a.outlier_detection.ejections_enforced_local_origin_success_rate: 0
cluster.a.outlier_detection.ejections_enforced_success_rate: 0
cluster.a.outlier_detection.ejections_enforced_total: 1
cluster.a.outlier_detection.ejections_overflow: 0
cluster.a.outlier_detection.ejections_success_rate: 0
cluster.a.outlier_detection.ejections_total: 0
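
Most of the counters above are zero, so the relevant ones are easy to miss. A minimal Python sketch for filtering such a dump down to its non-zero entries, assuming the `name: value` text format shown here (`nonzero_stats` is a hypothetical helper name, not part of Envoy):

```python
def nonzero_stats(dump: str) -> dict[str, int]:
    """Return the counters in a `name: value` stats dump whose value is non-zero."""
    result = {}
    for line in dump.splitlines():
        # Split on the last ": " so dots and colons inside the name are preserved.
        name, sep, value = line.strip().rpartition(": ")
        if not sep:
            continue  # blank or malformed line
        try:
            number = int(value)
        except ValueError:
            continue  # non-integer value, ignored in this sketch
        if number != 0:
            result[name] = number
    return result


# A few lines copied from the dump in this report.
sample = """\
cluster.a.outlier_detection.ejections_active: 1
cluster.a.outlier_detection.ejections_consecutive_5xx: 0
cluster.a.outlier_detection.ejections_detected_consecutive_local_origin_failure: 1
cluster.a.outlier_detection.ejections_enforced_consecutive_local_origin_failure: 1
cluster.a.outlier_detection.ejections_enforced_total: 1
cluster.a.outlier_detection.ejections_total: 0
"""

for name, value in nonzero_stats(sample).items():
    print(f"{name}: {value}")
```

On this dump, the only non-zero ejection counters besides the totals are the `consecutive_local_origin_failure` ones.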

Config:

resources:
  - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
    name: a
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_requests: 0

    outlier_detection:
      interval: 1s

      consecutive_local_origin_failure: 1
      enforcing_consecutive_local_origin_failure: 100
      split_external_local_origin_errors: true

      base_ejection_time: 200s
      max_ejection_time: 200s
    load_assignment:
      cluster_name: a

      endpoints:
        - priority: 0
          lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 9900
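
The interaction this config is meant to expose can be sketched as a toy model in Python. This is not Envoy's implementation: the class and function names are hypothetical, and the key assumption, that a circuit-breaker overflow is reported to the detector as a local-origin failure of the host, is only what the stats and logs in this report suggest.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveFailureDetector:
    """Toy consecutive-failure counter; NOT Envoy's outlier detector."""
    threshold: int        # plays the role of consecutive_local_origin_failure
    consecutive: int = 0
    ejected: bool = False

    def record(self, success: bool) -> None:
        if success:
            self.consecutive = 0  # any success resets the streak
            return
        self.consecutive += 1
        if self.consecutive >= self.threshold:
            self.ejected = True


def send_request(active_requests: int, max_requests: int,
                 detector: ConsecutiveFailureDetector) -> str:
    """Assumption: an overflow reset is charged to the host as a failure."""
    if active_requests >= max_requests:
        detector.record(success=False)
        return "overflow"
    detector.record(success=True)
    return "ok"


# Values mirror the config above: max_requests: 0, consecutive_local_origin_failure: 1.
detector = ConsecutiveFailureDetector(threshold=1)
outcome = send_request(active_requests=0, max_requests=0, detector=detector)
print(outcome, detector.ejected)
```

Under that assumption, the first request overflows and the host is ejected even though the host itself never returned an error.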

Logs:

[16][debug][pool] [source/common/http/conn_pool_base.cc:78] queueing stream due to no available connections (ready=0 busy=0 connecting=0)
[16][debug][pool] [source/common/conn_pool/conn_pool_base.cc:291] trying to create new connection
[16][debug][pool] [source/common/conn_pool/conn_pool_base.cc:145] creating a new connection (connecting=0)
[16][debug][connection] [./source/common/network/connection_impl.h:92] [C1] current connecting state: true
[16][debug][client] [source/common/http/codec_client.cc:57] [C1] connecting
[16][debug][connection] [source/common/network/connection_impl.cc:939] [C1] connecting to 127.0.0.1:9900
[16][debug][connection] [source/common/network/connection_impl.cc:958] [C1] connection in progress
[16][debug][connection] [source/common/network/connection_impl.cc:688] [C1] connected
[16][debug][client] [source/common/http/codec_client.cc:88] [C1] connected
[16][debug][pool] [source/common/conn_pool/conn_pool_base.cc:328] [C1] attaching to next stream
[16][debug][pool] [source/common/conn_pool/conn_pool_base.cc:176] max streams overflow
[16][debug][router] [source/common/router/router.cc:1208] [C0][S18144849846439401274] upstream reset: reset reason: overflow, transport failure reason: 
[9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1195] membership update for TLS cluster a added 0 removed 0
[9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:878] host 127.0.0.1:9900 in cluster a was ejected by the outlier detector
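
To check whether an ejection follows an overflow reset in a longer log, a small Python sketch can pair each ejection line with the most recent reset reason seen before it. The regular expressions assume the exact message wording shown in the log lines above, and `resets_before_ejections` is a hypothetical helper name.

```python
import re

RESET_RE = re.compile(r"upstream reset: reset reason: ([^,]+),")
EJECT_RE = re.compile(
    r"host (\S+) in cluster (\S+) was ejected by the outlier detector"
)


def resets_before_ejections(log_lines):
    """Yield (host, cluster, last_reset_reason) for every ejection line.

    last_reset_reason is the most recent reset reason seen earlier in the
    log, or None if no reset preceded the ejection.
    """
    last_reason = None
    for line in log_lines:
        match = RESET_RE.search(line)
        if match:
            last_reason = match.group(1).strip()
            continue
        match = EJECT_RE.search(line)
        if match:
            yield match.group(1), match.group(2), last_reason


# Two lines copied from the log above.
sample_lines = [
    "[16][debug][router] [source/common/router/router.cc:1208] "
    "[C0][S18144849846439401274] upstream reset: reset reason: overflow, "
    "transport failure reason: ",
    "[9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:878] "
    "host 127.0.0.1:9900 in cluster a was ejected by the outlier detector",
]

for host, cluster, reason in resets_before_ejections(sample_lines):
    print(f"{host} ejected from cluster {cluster}; last reset reason: {reason}")
```

This only correlates lines by order; it does not prove that the reset caused the ejection.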

hridhar commented 2 months ago

front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][connection] [./source/common/network/connection_impl.h:98] [C37] current connecting state: true
front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][client] [source/common/http/codec_client.cc:57] [C37] connecting
front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][connection] [source/common/network/connection_impl.cc:941] [C37] connecting to 142.250.76.74:443
front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][connection] [source/common/network/connection_impl.cc:960] [C37] connection in progress
front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][jwt] [source/extensions/filters/http/jwt_authn/filter.cc:97] Called Filter : decodeHeaders Stop
front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][jwt] [source/extensions/filters/http/jwt_authn/filter.cc:141] Called Filter : decodeData
front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][http] [source/common/http/conn_manager_impl.cc:1101] [C26][S15583304669476707722] request end stream
front-envoy_1 | [2024-08-08 06:29:08.846][20][debug][jwt] [source/extensions/filters/http/jwt_authn/filter.cc:141] Called Filter : decodeData
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][pool] [source/common/conn_pool/conn_pool_base.cc:793] [C29] connect timeout
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][connection] [source/common/network/connection_impl.cc:139] [C29] closing data_to_write=0 type=1
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][connection] [source/common/network/connection_impl.cc:250] [C29] closing socket: 1
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][client] [source/common/http/codec_client.cc:107] [C29] disconnect. resetting 0 pending requests
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][pool] [source/common/conn_pool/conn_pool_base.cc:484] [C29] client disconnected, failure reason:
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][router] [source/common/router/router.cc:1279] [C0][S3681161906323866553] upstream reset: reset reason: connection failure, transport failure reason:
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][http] [source/common/http/async_client_impl.cc:123] async http request response headers (end_stream=false):
front-envoy_1 | ':status', '503'
front-envoy_1 | 'content-length', '91'
front-envoy_1 | 'content-type', 'text/plain'
front-envoy_1 |
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][filter] [source/extensions/filters/http/common/jwks_fetcher.cc:103] onSuccess: fetch pubkey [uri = https://www.googleapis.com/robot/v1/metadata/x509/securetoken@system.gserviceaccount.com]: response status code 503
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][jwt] [source/extensions/filters/http/jwt_authn/authenticator.cc:374] firebase_jwt: JWT token verification completed with: Jwks remote fetch is failed
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][jwt] [source/extensions/filters/http/jwt_authn/authenticator.cc:378] status is: Jwks remote fetch is failed
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][jwt] [source/extensions/filters/http/jwt_authn/filter.cc:109] Jwt authentication completed with: Jwks remote fetch is failed
front-envoy_1 | [2024-08-08 06:29:08.864][26][debug][http] [source/common/http/filter_manager.cc:996] [C28][S15202572061363395092] Sending local reply with details