grpc / grpc-go

The Go language implementation of gRPC. HTTP/2 based RPC
https://grpc.io
Apache License 2.0
21.06k stars 4.37k forks source link

Flaky Test (8/100k): RingHash_ReattemptWhenGoingFromTransientFailureToIdle #7464

Closed easwars closed 4 days ago

easwars commented 3 months ago

https://github.com/grpc/grpc-go/actions/runs/10063024287/job/27816861885?pr=7430

--- FAIL: Test (12.91s)
    --- FAIL: Test/RingHash_ReattemptWhenGoingFromTransientFailureToIdle (0.08s)
        ringhash_balancer_test.go:1522: Created new snapshot cache...
        tlogger.go:116: INFO server.go:685 [core] [Server #547]Server created  (t=+1.309591ms)
        ringhash_balancer_test.go:1522: Registered Aggregated Discovery Service (ADS)...
        ringhash_balancer_test.go:1522: xDS management server serving at: 127.0.0.1:41641...
        server.go:229: Created new resource snapshot...
        server.go:235: Updated snapshot cache with resource snapshot...
        tlogger.go:116: INFO clientconn.go:1683 [core] original dial target is: "xds:///test.server"  (t=+2.954029ms)
        tlogger.go:116: INFO clientconn.go:310 [core] [Channel #548]Channel created  (t=+3.010395ms)
        tlogger.go:116: INFO clientconn.go:192 [core] [Channel #548]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"xds", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/test.server", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}  (t=+3.092208ms)
        tlogger.go:116: INFO clientconn.go:193 [core] [Channel #548]Channel authority set to "test.server"  (t=+3.143404ms)
        tlogger.go:116: INFO server.go:881 [core] [Server #547 ListenSocket #549]ListenSocket created  (t=+3.301299ms)
        tlogger.go:116: INFO xds_resolver.go:86 [xds] [xds-resolver 0xc00103c240] Creating resolver for target: xds:///test.server  (t=+3.398251ms)
        tlogger.go:116: INFO client_refcounted.go:81 [xds] [xds-client 0xc000c363c0] Created client with name "xds:///test.server" to primary xDS management server: "passthrough:///127.0.0.1:41641-insecure"  (t=+4.02296ms)
        tlogger.go:116: INFO client_refcounted.go:86 [xds] xDS node ID: 4d0a6692-9942-4553-9a52-876e89657de8  (t=+4.09212ms)
        tlogger.go:116: INFO clientconn.go:1683 [core] original dial target is: "passthrough:///127.0.0.1:41641"  (t=+4.171568ms)
        tlogger.go:116: INFO clientconn.go:310 [core] [Channel #550]Channel created  (t=+4.224939ms)
        tlogger.go:116: INFO clientconn.go:192 [core] [Channel #550]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"passthrough", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/127.0.0.1:41641", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}  (t=+4.296362ms)
        tlogger.go:116: INFO clientconn.go:193 [core] [Channel #550]Channel authority set to "127.0.0.1:41641"  (t=+4.352016ms)
        tlogger.go:116: INFO resolver_wrapper.go:197 [core] [Channel #550]Resolver state updated: {
              "Addresses": [
                {
                  "Addr": "127.0.0.1:41641",
                  "ServerName": "",
                  "Attributes": null,
                  "BalancerAttributes": null,
                  "Metadata": null
                }
              ],
              "Endpoints": [
                {
                  "Addresses": [
                    {
                      "Addr": "127.0.0.1:41641",
                      "ServerName": "",
                      "Attributes": null,
                      "BalancerAttributes": null,
                      "Metadata": null
                    }
                  ],
                  "Attributes": null
                }
              ],
              "ServiceConfig": null,
              "Attributes": null
            } (resolver returned new addresses)  (t=+4.587727ms)
        tlogger.go:116: INFO balancer_wrapper.go:106 [core] [Channel #550]Channel switches to new LB policy "pick_first"  (t=+4.709425ms)
        tlogger.go:116: INFO clientconn.go:847 [core] [Channel #550 SubChannel #551]Subchannel created  (t=+4.791308ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #550]Channel Connectivity change to CONNECTING  (t=+4.858024ms)
        tlogger.go:116: INFO clientconn.go:1[207](https://github.com/grpc/grpc-go/actions/runs/10063024287/job/27816861885?pr=7430#step:8:208) [core] [Channel #550 SubChannel #551]Subchannel Connectivity change to CONNECTING  (t=+4.957991ms)
        tlogger.go:116: INFO clientconn.go:1325 [core] [Channel #550 SubChannel #551]Subchannel picks a new address "127.0.0.1:41641" to connect  (t=+5.014947ms)
        tlogger.go:116: INFO clientconn.go:310 [core] [Channel #550]Channel exiting idle mode  (t=+5.469658ms)
        tlogger.go:116: INFO transport.go:237 [xds] [xds-client 0xc000c363c0] [passthrough:///127.0.0.1:41641] Created transport to server "passthrough:///127.0.0.1:41641"  (t=+5.550189ms)
        tlogger.go:116: INFO clientconn.go:310 [core] [Channel #548]Channel exiting idle mode  (t=+5.711882ms)
        tlogger.go:116: INFO clientconn.go:1207 [core] [Channel #550 SubChannel #551]Subchannel Connectivity change to READY  (t=+6.338214ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #550]Channel Connectivity change to READY  (t=+6.413635ms)
        tlogger.go:116: INFO transport.go:338 [xds] [xds-client 0xc000c363c0] [passthrough:///127.0.0.1:41641] ADS stream created  (t=+6.491641ms)
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.listener.v3.Listener[test.server] and known map[]. Diff [test.server]
        logging.go:30: respond type.googleapis.com/envoy.config.listener.v3.Listener[test.server] version "" with version "1"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.listener.v3.Listener[test.server] and known map[test.server:{}]. Diff []
        logging.go:30: open watch 1 for type.googleapis.com/envoy.config.listener.v3.Listener[test.server] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "1"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] and known map[]. Diff [new_route]
        logging.go:30: respond type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] version "" with version "1"
        tlogger.go:116: INFO xds_resolver.go:270 [xds] [xds-resolver 0xc00103c240] For Listener resource "test.server" and RouteConfiguration resource "new_route", generated service config: {
              "loadBalancingConfig": [
                {
                  "xds_cluster_manager_experimental": {
                    "children": {
                      "cluster:cluster": {
                        "childPolicy": [
                          {
                            "cds_experimental": {
                              "cluster": "cluster"
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              ]
            }  (t=+8.84713ms)
        tlogger.go:116: INFO resolver_wrapper.go:197 [core] [Channel #548]Resolver state updated: {
              "Addresses": null,
              "Endpoints": [],
              "ServiceConfig": {
                "Config": {
                  "Config": null,
                  "Methods": {}
                },
                "Err": null
              },
              "Attributes": {
                "\u003c%!p(resolver.csKeyType=grpc.internal.resolver.configSelector)\u003e": "\u003c0xc000b36aa0\u003e",
                "\u003c%!p(xdsclient.clientKeyType=grpc.xds.internal.client.Client)\u003e": "\u003c0xc000c42910\u003e"
              }
            } (service config updated)  (t=+9.241649ms)
        tlogger.go:116: INFO balancer_wrapper.go:106 [core] [Channel #548]Channel switches to new LB policy "xds_cluster_manager_experimental"  (t=+9.346796ms)
        tlogger.go:116: INFO clustermanager.go:59 [xds] [xds-cluster-manager-lb 0xc00016ab60] Created  (t=+9.404003ms)
        tlogger.go:116: INFO clustermanager.go:174 [xds] [xds-cluster-manager-lb 0xc00016ab60] Update with config {
              "LoadBalancingConfig": null,
              "Children": {
                "cluster:cluster": {
                  "ChildPolicy": [
                    {
                      "cds_experimental": {
                        "LoadBalancingConfig": null,
                        "Cluster": "cluster"
                      }
                    }
                  ]
                }
              }
            }, resolver state {Addresses:[] Endpoints:[] ServiceConfig:0xc00016a840 Attributes:{"<%!p(resolver.csKeyType=grpc.internal.resolver.configSelector)>": "<0xc000b36aa0>" , "<%!p(xdsclient.clientKeyType=grpc.xds.internal.client.Client)>": "<0xc000c42910>" }}  (t=+9.561247ms)
        tlogger.go:116: INFO balancergroup.go:305 [xds] [xds-cluster-manager-lb 0xc00016ab60] Adding child policy of type "cds_experimental" for locality "cluster:cluster"  (t=+9.627281ms)
        tlogger.go:116: INFO balancergroup.go:102 [xds] [xds-cluster-manager-lb 0xc00016ab60] Creating child policy of type "cds_experimental" for locality "cluster:cluster"  (t=+9.668347ms)
        tlogger.go:116: INFO cdsbalancer.go:109 [xds] [cds-lb 0xc00034f500] Created  (t=+9.7[208](https://github.com/grpc/grpc-go/actions/runs/10063024287/job/27816861885?pr=7430#step:8:209)16ms)
        tlogger.go:116: INFO cdsbalancer.go:121 [xds] [cds-lb 0xc00034f500] xDS credentials in use: false  (t=+9.805164ms)
        tlogger.go:116: INFO cdsbalancer.go:290 [xds] [cds-lb 0xc00034f500] Received balancer config update: {
              "LoadBalancingConfig": null,
              "Cluster": "cluster"
            }  (t=+9.869424ms)
        tlogger.go:116: INFO balancerstateaggregator.go:191 [xds] [xds-cluster-manager-lb 0xc00016ab60] Child pickers: map[cluster:cluster:picker:0xc000ca6bb0,state:CONNECTING,stateToAggregate:CONNECTING]  (t=+10.04887ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #548]Channel Connectivity change to CONNECTING  (t=+10.110194ms)
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] and known map[new_route:{}]. Diff []
        logging.go:30: open watch 2 for type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "1"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] and known map[]. Diff [cluster]
        logging.go:30: respond type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] version "" with version "1"
        tlogger.go:116: INFO cdsbalancer.go:414 [xds] [cds-lb 0xc00034f500] Received Cluster resource: {
              "ClusterType": 0,
              "ClusterName": "cluster",
              "EDSServiceName": "cluster",
              "LRSServerConfig": null,
              "SecurityCfg": null,
              "MaxRequests": null,
              "DNSHostName": "",
              "PrioritizedClusterNames": null,
              "LBPolicy": [
                {
                  "ring_hash_experimental": {
                    "minRingSize": 1024,
                    "maxRingSize": 8388608
                  }
                }
              ],
              "OutlierDetection": null,
              "Raw": {
                "type_url": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
                "value": "CgdjbHVzdGVyGg0KAhoAEgdjbHVzdGVyMAIQAw=="
              },
              "TelemetryLabels": {
                "csm.service_name": "unknown",
                "csm.service_namespace_name": "unknown"
              }
            }  (t=+12.024167ms)
        tlogger.go:116: INFO clusterresolver.go:85 [xds] [xds-cluster-resolver-lb 0xc0008ce4e0] Created  (t=+12.117442ms)
        tlogger.go:116: INFO cdsbalancer.go:455 [xds] [cds-lb 0xc00034f500] Created child policy 0xc0008ce4e0 of type cluster_resolver_experimental  (t=+12.174428ms)
        tlogger.go:116: INFO clusterresolver.go:187 [xds] [xds-cluster-resolver-lb 0xc0008ce4e0] Received new balancer config: {
              "discoveryMechanisms": [
                {
                  "cluster": "cluster",
                  "edsServiceName": "cluster",
                  "outlierDetection": {},
                  "telemetryLabels": {
                    "csm.service_name": "unknown",
                    "csm.service_namespace_name": "unknown"
                  }
                }
              ],
              "xdsLbPolicy": [
                {
                  "ring_hash_experimental": {
                    "minRingSize": 1024,
                    "maxRingSize": 8388608
                  }
                }
              ]
            }  (t=+12.57066ms)
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] and known map[cluster:{}]. Diff []
        logging.go:30: open watch 3 for type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "1"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[cluster] and known map[]. Diff [cluster]
        logging.go:30: respond type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[cluster] version "" with version "1"
        tlogger.go:116: WARNING unmarshal_eds.go:125 [xds] [xds-resource] Ignoring locality {
              "region": "region-1",
              "zone": "zone-1",
              "subZone": "subzone-1"
            } with weight 0  (t=+13.409911ms)
        tlogger.go:116: INFO balancer.go:76 [xds] [priority-lb 0xc0000e81c0] Created  (t=+13.556706ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [xds-cluster-manager-lb 0xc00016ab60] Balancer state update from locality cluster:cluster, new state: {ConnectivityState:CONNECTING Picker:0xc0001117e0}  (t=+14.46194ms)
        tlogger.go:116: INFO balancerstateaggregator.go:147 [xds] [xds-cluster-manager-lb 0xc00016ab60] State update from sub-balancer "cluster:cluster": {ConnectivityState:CONNECTING Picker:0xc0001117e0}  (t=+14.521051ms)
        tlogger.go:116: INFO balancerstateaggregator.go:191 [xds] [xds-cluster-manager-lb 0xc00016ab60] Child pickers: map[cluster:cluster:picker:0xc0001117e0,state:CONNECTING,stateToAggregate:CONNECTING]  (t=+14.574571ms)
        tlogger.go:116: INFO balancergroup.go:305 [xds] [priority-lb 0xc0000e81c0] Adding child policy of type "outlier_detection_experimental" for locality "priority-0-0"  (t=+14.626268ms)
        tlogger.go:116: INFO balancergroup.go:102 [xds] [priority-lb 0xc0000e81c0] Creating child policy of type "outlier_detection_experimental" for locality "priority-0-0"  (t=+14.675109ms)
        tlogger.go:116: INFO balancer.go:74 [xds] [outlier-detection-lb 0xc0007f91e0] Created  (t=+14.727637ms)
        tlogger.go:116: INFO clusterimpl.go:84 [xds] [xds-cluster-impl-lb 0xc000174008] Created  (t=+14.796336ms)
        tlogger.go:116: INFO ringhash.go:55 [xds] [ring-hash-lb 0xc001108280] Created  (t=+14.871827ms)
        tlogger.go:116: INFO ringhash.go:271 [xds] [ring-hash-lb 0xc001108280] Received update from resolver, balancer config: {
              "minRingSize": 1024,
              "maxRingSize": 4096
            }  (t=+14.956245ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [priority-lb 0xc0000e81c0] Balancer state update from locality priority-0-0, new state: {ConnectivityState:TRANSIENT_FAILURE Picker:0xc000db6498}  (t=+15.054279ms)
        tlogger.go:116: WARNING balancer_child.go:118 [xds] [priority-lb 0xc0000e81c0] Failed to update state for child policy "priority-0-0": bad resolver state  (t=+15.097871ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [xds-cluster-manager-lb 0xc00016ab60] Balancer state update from locality cluster:cluster, new state: {ConnectivityState:TRANSIENT_FAILURE Picker:0xc000db6498}  (t=+15.149177ms)
        tlogger.go:116: INFO balancerstateaggregator.go:147 [xds] [xds-cluster-manager-lb 0xc00016ab60] State update from sub-balancer "cluster:cluster": {ConnectivityState:TRANSIENT_FAILURE Picker:0xc000db6498}  (t=+15.19389ms)
        tlogger.go:116: INFO balancerstateaggregator.go:191 [xds] [xds-cluster-manager-lb 0xc00016ab60] Child pickers: map[cluster:cluster:picker:0xc000db6498,state:TRANSIENT_FAILURE,stateToAggregate:TRANSIENT_FAILURE]  (t=+15.251809ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #548]Channel Connectivity change to TRANSIENT_FAILURE  (t=+15.309176ms)
        tlogger.go:116: INFO server.go:685 [core] [Server #554]Server created  (t=+15.654513ms)
        stubserver.go:266: Started test service backend at "127.0.0.1:45139"
        ringhash_balancer_test.go:1549: Updating EDS with a new backend endpoint.
        server.go:229: Created new resource snapshot...
        logging.go:30: respond open watch 1 type.googleapis.com/envoy.config.listener.v3.Listener[test.server] with new version "2"
        logging.go:30: respond type.googleapis.com/envoy.config.listener.v3.Listener[test.server] version "1" with version "2"
        logging.go:30: respond open watch 2 type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] with new version "2"
        logging.go:30: respond type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] version "1" with version "2"
        logging.go:30: respond open watch 3 type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] with new version "2"
        logging.go:30: respond type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] version "1" with version "2"
        server.go:235: Updated snapshot cache with resource snapshot...
        tlogger.go:116: INFO server.go:881 [core] [Server #554 ListenSocket #555]ListenSocket created  (t=+16.230751ms)
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[cluster] and known map[cluster:{}]. Diff []
        logging.go:30: respond type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[cluster] version "1" with version "2"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.listener.v3.Listener[test.server] and known map[test.server:{}]. Diff []
        logging.go:30: open watch 4 for type.googleapis.com/envoy.config.listener.v3.Listener[test.server] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] and known map[new_route:{}]. Diff []
        logging.go:30: open watch 5 for type.googleapis.com/envoy.config.route.v3.RouteConfiguration[new_route] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] and known map[cluster:{}]. Diff []
        logging.go:30: open watch 6 for type.googleapis.com/envoy.config.cluster.v3.Cluster[cluster] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        tlogger.go:116: INFO balancergroup.go:375 [xds] [priority-lb 0xc0000e81c0] Removing child policy for locality "priority-0-0"  (t=+18.951905ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [xds-cluster-manager-lb 0xc00016ab60] Balancer state update from locality cluster:cluster, new state: {ConnectivityState:CONNECTING Picker:0xc00046b550}  (t=+19.059957ms)
        tlogger.go:116: INFO balancerstateaggregator.go:147 [xds] [xds-cluster-manager-lb 0xc00016ab60] State update from sub-balancer "cluster:cluster": {ConnectivityState:CONNECTING Picker:0xc00046b550}  (t=+19.11468ms)
        tlogger.go:116: INFO balancerstateaggregator.go:191 [xds] [xds-cluster-manager-lb 0xc00016ab60] Child pickers: map[cluster:cluster:picker:0xc00046b550,state:CONNECTING,stateToAggregate:TRANSIENT_FAILURE]  (t=+19.167128ms)
        tlogger.go:116: INFO balancergroup.go:305 [xds] [priority-lb 0xc0000e81c0] Adding child policy of type "outlier_detection_experimental" for locality "priority-0-1"  (t=+19.22193ms)
        tlogger.go:116: INFO balancergroup.go:102 [xds] [priority-lb 0xc0000e81c0] Creating child policy of type "outlier_detection_experimental" for locality "priority-0-1"  (t=+19.263979ms)
        tlogger.go:116: INFO balancer.go:74 [xds] [outlier-detection-lb 0xc0008625b0] Created  (t=+19.311668ms)
        tlogger.go:116: INFO clusterimpl.go:84 [xds] [xds-cluster-impl-lb 0xc0001758c8] Created  (t=+19.376239ms)
        tlogger.go:116: INFO ringhash.go:55 [xds] [ring-hash-lb 0xc001108a80] Created  (t=+19.431783ms)
        tlogger.go:116: INFO ringhash.go:271 [xds] [ring-hash-lb 0xc001108a80] Received update from resolver, balancer config: {
              "minRingSize": 1024,
              "maxRingSize": 4096
            }  (t=+19.511843ms)
        tlogger.go:116: INFO clientconn.go:847 [core] [Channel #548 SubChannel #556]Subchannel created  (t=+19.613073ms)
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[cluster] and known map[cluster:{}]. Diff []
        logging.go:30: open watch 7 for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[cluster] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        tlogger.go:116: INFO balancergroup.go:544 [xds] [priority-lb 0xc0000e81c0] Balancer state update from locality priority-0-1, new state: {ConnectivityState:IDLE Picker:0xc000d3da28}  (t=+22.067637ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [xds-cluster-manager-lb 0xc00016ab60] Balancer state update from locality cluster:cluster, new state: {ConnectivityState:IDLE Picker:0xc000d3da28}  (t=+22.148719ms)
        tlogger.go:116: INFO balancerstateaggregator.go:147 [xds] [xds-cluster-manager-lb 0xc00016ab60] State update from sub-balancer "cluster:cluster": {ConnectivityState:IDLE Picker:0xc000d3da28}  (t=+22.216936ms)
        tlogger.go:116: INFO balancerstateaggregator.go:191 [xds] [xds-cluster-manager-lb 0xc00016ab60] Child pickers: map[cluster:cluster:picker:0xc000d3da28,state:IDLE,stateToAggregate:IDLE]  (t=+22.277089ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #548]Channel Connectivity change to IDLE  (t=+22.353271ms)
        tlogger.go:116: INFO ringhash.go:166 [xds] [ring-hash-lb 0xc001108a80] [subConn 0xc0001153e0] Executing a queued connect for subConn in state: IDLE  (t=+22.449852ms)
        tlogger.go:116: INFO clientconn.go:1207 [core] [Channel #548 SubChannel #556]Subchannel Connectivity change to CONNECTING  (t=+22.584935ms)
        tlogger.go:116: INFO ringhash.go:357 [xds] [ring-hash-lb 0xc001108a80] SubConn's effective old state was: IDLE, new state is CONNECTING  (t=+22.69[209](https://github.com/grpc/grpc-go/actions/runs/10063024287/job/27816861885?pr=7430#step:8:210)6ms)
        tlogger.go:116: INFO ringhash.go:376 [xds] [ring-hash-lb 0xc001108a80] Pushing new state CONNECTING and picker 0xc000ca8978  (t=+23.372038ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [priority-lb 0xc0000e81c0] Balancer state update from locality priority-0-1, new state: {ConnectivityState:CONNECTING Picker:0xc000ca89c0}  (t=+23.500399ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [xds-cluster-manager-lb 0xc00016ab60] Balancer state update from locality cluster:cluster, new state: {ConnectivityState:CONNECTING Picker:0xc000ca89c0}  (t=+23.561233ms)
        tlogger.go:116: INFO balancerstateaggregator.go:147 [xds] [xds-cluster-manager-lb 0xc00016ab60] State update from sub-balancer "cluster:cluster": {ConnectivityState:CONNECTING Picker:0xc000ca89c0}  (t=+23.612428ms)
        tlogger.go:116: INFO balancerstateaggregator.go:191 [xds] [xds-cluster-manager-lb 0xc00016ab60] Child pickers: map[cluster:cluster:picker:0xc000ca89c0,state:CONNECTING,stateToAggregate:CONNECTING]  (t=+23.664025ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #548]Channel Connectivity change to CONNECTING  (t=+23.720811ms)
        tlogger.go:116: INFO clientconn.go:1325 [core] [Channel #548 SubChannel #556]Subchannel picks a new address ":45139" to connect  (t=+23.80537ms)
        tlogger.go:116: INFO clientconn.go:1207 [core] [Channel #548 SubChannel #556]Subchannel Connectivity change to READY  (t=+24.906601ms)
        tlogger.go:116: INFO ringhash.go:357 [xds] [ring-hash-lb 0xc001108a80] SubConn's effective old state was: CONNECTING, new state is READY  (t=+25.040091ms)
        tlogger.go:116: INFO ringhash.go:376 [xds] [ring-hash-lb 0xc001108a80] Pushing new state READY and picker 0xc000d3dae8  (t=+25.711337ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [priority-lb 0xc0000e81c0] Balancer state update from locality priority-0-1, new state: {ConnectivityState:READY Picker:0xc000d3db30}  (t=+25.866027ms)
        tlogger.go:116: INFO balancergroup.go:544 [xds] [xds-cluster-manager-lb 0xc00016ab60] Balancer state update from locality cluster:cluster, new state: {ConnectivityState:READY Picker:0xc000d3db30}  (t=+25.957648ms)
        tlogger.go:116: INFO balancerstateaggregator.go:147 [xds] [xds-cluster-manager-lb 0xc00016ab60] State update from sub-balancer "cluster:cluster": {ConnectivityState:READY Picker:0xc000d3db30}  (t=+26.026177ms)
        tlogger.go:116: INFO balancerstateaggregator.go:191 [xds] [xds-cluster-manager-lb 0xc00016ab60] Child pickers: map[cluster:cluster:picker:0xc000d3db30,state:READY,stateToAggregate:READY]  (t=+26.082812ms)
        ringhash_balancer_test.go:1566: conn.GetState(): got CONNECTING, want READY
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #548]Channel Connectivity change to READY  (t=+27.04844ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #548]Channel Connectivity change to SHUTDOWN  (t=+27.138549ms)
        tlogger.go:116: INFO resolver_wrapper.go:100 [core] [Channel #548]Closing the name resolver  (t=+27.20356ms)
        tlogger.go:116: INFO balancer_wrapper.go:144 [core] [Channel #548]ccBalancerWrapper: closing  (t=+27.295783ms)
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.listener.v3.Listener[] and known map[test.server:{}]. Diff []
        logging.go:30: open watch 8 for type.googleapis.com/envoy.config.listener.v3.Listener[] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.cluster.v3.Cluster[] and known map[cluster:{}]. Diff []
        logging.go:30: open watch 9 for type.googleapis.com/envoy.config.cluster.v3.Cluster[] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[] and known map[cluster:{}]. Diff []
        logging.go:30: open watch 10 for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment[] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        tlogger.go:116: INFO watch_service.go:59 [xds] [xds-resolver 0xc00103c240] Canceling watch on Listener resource "test.server"  (t=+27.727762ms)
        logging.go:30: nodeID "4d0a6692-9942-4553-9a52-876e89657de8" requested type.googleapis.com/envoy.config.route.v3.RouteConfiguration[] and known map[new_route:{}]. Diff []
        logging.go:30: open watch 11 for type.googleapis.com/envoy.config.route.v3.RouteConfiguration[] from nodeID "4d0a6692-9942-4553-9a52-876e89657de8", version "2"
        tlogger.go:116: INFO watch_service.go:94 [xds] [xds-resolver 0xc00103c240] Canceling watch on RouteConfiguration resource "new_route"  (t=+28.466244ms)
        tlogger.go:116: INFO ringhash.go:441 [xds] [ring-hash-lb 0xc001108280] Shutdown  (t=+28.884677ms)
        tlogger.go:116: WARNING authority.go:422 [xds] [xds-client 0xc000c363c0] [passthrough:///127.0.0.1:41641] Watchers not notified since ADS stream failed after having received at least one response: rpc error: code = Canceled desc = context canceled  (t=+29.130918ms)
        tlogger.go:116: WARNING transport.go:480 [xds] [xds-client 0xc000c363c0] [passthrough:///127.0.0.1:41641] ADS stream closed: rpc error: code = Canceled desc = context canceled  (t=+29.175331ms)
        tlogger.go:116: INFO clientconn.go:540 [core] [Channel #550]Channel Connectivity change to SHUTDOWN  (t=+29.252185ms)
        tlogger.go:116: INFO resolver_wrapper.go:100 [core] [Channel #550]Closing the name resolver  (t=+29.312979ms)
        tlogger.go:116: INFO balancer_wrapper.go:144 [core] [Channel #550]ccBalancerWrapper: closing  (t=+29.377189ms)
        tlogger.go:116: INFO clientconn.go:1207 [core] [Channel #550 SubChannel #551]Subchannel Connectivity change to SHUTDOWN  (t=+29.461857ms)
        tlogger.go:116: INFO clientconn.go:1556 [core] [Channel #550 SubChannel #551]Subchannel deleted  (t=+29.523783ms)
        tlogger.go:116: INFO clientconn.go:310 [core] [Channel #550]Channel deleted  (t=+29.681408ms)
        tlogger.go:116: INFO clientimpl.go:100 [xds] [xds-client 0xc000c363c0] Shutdown  (t=+29.742172ms)
        tlogger.go:116: INFO xds_resolver.go:239 [xds] [xds-resolver 0xc00103c240] Shutdown  (t=+29.807294ms)
        tlogger.go:116: INFO clusterimpl.go:322 [xds] [xds-cluster-impl-lb 0xc000174008] Shutdown  (t=+29.982322ms)
        tlogger.go:116: INFO ringhash.go:441 [xds] [ring-hash-lb 0xc001108a80] Shutdown  (t=+30.072981ms)
        tlogger.go:116: INFO clusterimpl.go:322 [xds] [xds-cluster-impl-lb 0xc0001758c8] Shutdown  (t=+30.14176ms)
        tlogger.go:116: INFO balancergroup.go:375 [xds] [priority-lb 0xc0000e81c0] Removing child policy for locality "priority-0-1"  (t=+30.229164ms)
        tlogger.go:116: INFO clusterresolver.go:336 [xds] [xds-cluster-resolver-lb 0xc0008ce4e0] Shutdown  (t=+30.271413ms)
        tlogger.go:116: INFO cdsbalancer.go:380 [xds] [cds-lb 0xc00034f500] Shutdown  (t=+30.325614ms)
        tlogger.go:116: INFO clustermanager.go:192 [xds] [xds-cluster-manager-lb 0xc00016ab60] Shutdown  (t=+30.402478ms)
        tlogger.go:116: INFO clientconn.go:1207 [core] [Channel #548 SubChannel #556]Subchannel Connectivity change to SHUTDOWN  (t=+30.512895ms)
        tlogger.go:116: INFO clientconn.go:1556 [core] [Channel #548 SubChannel #556]Subchannel deleted  (t=+30.582194ms)
        tlogger.go:116: INFO clientconn.go:310 [core] [Channel #548]Channel deleted  (t=+30.733277ms)
        tlogger.go:116: INFO server.go:817 [core] [Server #554 ListenSocket #555]ListenSocket deleted  (t=+30.840969ms)
        tlogger.go:116: INFO server.go:817 [core] [Server #547 ListenSocket #549]ListenSocket deleted  (t=+31.008[212](https://github.com/grpc/grpc-go/actions/runs/10063024287/job/27816861885?pr=7430#step:8:213)ms)
FAIL
easwars commented 2 months ago

https://github.com/grpc/grpc-go/actions/runs/10639727220/job/29498247822?pr=7578

aranjans commented 2 months ago

I ran this test on forge, 8 failed out of 100k runs.

zasweq commented 1 month ago

https://github.com/grpc/grpc-go/actions/runs/10802073214/job/29963448020?pr=7614

purnesh42H commented 1 month ago

https://github.com/grpc/grpc-go/actions/runs/10824775254/job/30032572239?pr=7576

zasweq commented 1 month ago

https://github.com/zasweq/grpc-go/actions/runs/10912524962/job/30287350005

eshitachandwani commented 1 month ago

https://github.com/grpc/grpc-go/actions/runs/11047265055/job/30688166442?pr=7658

zasweq commented 1 month ago

https://github.com/zasweq/grpc-go/actions/runs/11061116597/job/30733087341 this is ring hash IDLE to READY? Same root cause perhaps? Maybe I should track separately but will put here since likely same flakiness.

zasweq commented 3 weeks ago

https://github.com/grpc/grpc-go/actions/runs/11244635192/job/31263111110 another test failure in ring hash this time on RingHash_ContinuesConnectingWithoutPicksOneSubchannelAtATime, but perhaps on same root cause. Perhaps this is a real bug?

purnesh42H commented 2 weeks ago

https://github.com/grpc/grpc-go/actions/runs/11361570171/job/31627478831?pr=7737

easwars commented 2 weeks ago

https://github.com/grpc/grpc-go/actions/runs/11371489270/job/31633677873?pr=7721

purnesh42H commented 1 week ago

https://github.com/grpc/grpc-go/actions/runs/11511557052/job/32045171168?pr=7677