grpc / grpc-go

The Go language implementation of gRPC. HTTP/2 based RPC
https://grpc.io
Apache License 2.0
21.15k stars 4.4k forks source link

Flaky Test: Test/PickFirstLeaf_StopConnectedServer_SecondServerToFirst #7868

Open purnesh42H opened 1 week ago

purnesh42H commented 1 week ago
--- FAIL: Test (3.10s)
    --- FAIL: Test/PickFirstLeaf_StopConnectedServer_SecondServerToFirst (0.17s)
        tlogger.go:116: INFO server.go:685 [core] [Server #331]Server created  (t=+1.460984ms)
        stubserver.go:288: Started test service backend at "127.0.0.1:39795"
        tlogger.go:116: INFO server.go:881 [core] [Server #331 ListenSocket #332]ListenSocket created  (t=+2.063269ms)
        tlogger.go:116: INFO server.go:685 [core] [Server #333]Server created  (t=+2.215611ms)
        stubserver.go:288: Started test service backend at "127.0.0.1:38005"
        tlogger.go:116: INFO clientconn.go:1[65](https://github.com/grpc/grpc-go/actions/runs/11989223270/job/33425355899?pr=7865#step:8:66)4 [core] original dial target is: "whatever:///test.server"  (t=+2.453214ms)
        tlogger.go:116: INFO server.go:881 [core] [Server #333 ListenSocket #334]ListenSocket created  (t=+2.608856ms)
        tlogger.go:116: INFO clientconn.go:314 [core] [Channel #335]Channel created  (t=+2.703702ms)
        tlogger.go:116: INFO clientconn.go:193 [core] [Channel #335]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"whatever", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/test.server", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}  (t=+2.833413ms)
        tlogger.go:116: INFO clientconn.go:194 [core] [Channel #335]Channel authority set to "test.server"  (t=+2.909595ms)
        tlogger.go:116: INFO clientconn.go:314 [core] [Channel #335]Channel exiting idle mode  (t=+3.189119ms)
        tlogger.go:116: INFO server.go:817 [core] [Server #331 ListenSocket #332]ListenSocket deleted  (t=+103.746268ms)
        tlogger.go:116: INFO resolver_wrapper.go:200 [core] [Channel #335]Resolver state updated: {
              "Addresses": [
                {
                  "Addr": "127.0.0.1:39795",
                  "ServerName": "",
                  "Attributes": null,
                  "BalancerAttributes": null,
                  "Metadata": null
                },
                {
                  "Addr": "127.0.0.1:38005",
                  "ServerName": "",
                  "Attributes": null,
                  "BalancerAttributes": null,
                  "Metadata": null
                }
              ],
              "Endpoints": [
                {
                  "Addresses": [
                    {
                      "Addr": "127.0.0.1:39795",
                      "ServerName": "",
                      "Attributes": null,
                      "BalancerAttributes": null,
                      "Metadata": null
                    }
                  ],
                  "Attributes": null
                },
                {
                  "Addresses": [
                    {
                      "Addr": "127.0.0.1:38005",
                      "ServerName": "",
                      "Attributes": null,
                      "BalancerAttributes": null,
                      "Metadata": null
                    }
                  ],
                  "Attributes": null
                }
              ],
              "ServiceConfig": null,
              "Attributes": null
            } (resolver returned new addresses)  (t=+104.399937ms)
        tlogger.go:116: INFO balancer_wrapper.go:109 [core] [Channel #335]Channel switches to new LB policy "state_storing"  (t=+104.79791ms)
        tlogger.go:116: INFO clientconn.go:544 [core] [Channel #335]Channel Connectivity change to CONNECTING  (t=+105.222231ms)
        tlogger.go:116: INFO clientconn.go:838 [core] [Channel #335 SubChannel #336]Subchannel created  (t=+105.458091ms)
        tlogger.go:116: INFO clientconn.go:1199 [core] [Channel #335 SubChannel #336]Subchannel Connectivity change to CONNECTING  (t=+105.976808ms)
        tlogger.go:116: INFO clientconn.go:1319 [core] [Channel #335 SubChannel #336]Subchannel picks a new address "127.0.0.1:39795" to connect  (t=+106.240219ms)
        tlogger.go:116: WARNING clientconn.go:1381 [core] [Channel #335 SubChannel #336]grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:39795", ServerName: "test.server", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:39795: connect: connection refused"  (t=+106.914386ms)
        tlogger.go:116: INFO clientconn.go:1201 [core] [Channel #335 SubChannel #336]Subchannel Connectivity change to TRANSIENT_FAILURE, last error: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:39795: connect: connection refused"  (t=+107.201271ms)
        tlogger.go:116: INFO clientconn.go:838 [core] [Channel #335 SubChannel #337]Subchannel created  (t=+107.446539ms)
        tlogger.go:116: INFO clientconn.go:1199 [core] [Channel #335 SubChannel #337]Subchannel Connectivity change to CONNECTING  (t=+107.761787ms)
        tlogger.go:116: INFO clientconn.go:1319 [core] [Channel #335 SubChannel #337]Subchannel picks a new address "127.0.0.1:38005" to connect  (t=+107.9297ms)
        tlogger.go:116: INFO clientconn.go:1199 [core] [Channel #335 SubChannel #337]Subchannel Connectivity change to READY  (t=+108.860055ms)
        tlogger.go:116: INFO clientconn.go:1199 [core] [Channel #335 SubChannel #336]Subchannel Connectivity change to SHUTDOWN  (t=+109.075006ms)
        tlogger.go:116: INFO clientconn.go:1527 [core] [Channel #335 SubChannel #336]Subchannel deleted  (t=+109.244752ms)
        tlogger.go:116: INFO clientconn.go:544 [core] [Channel #335]Channel Connectivity change to READY  (t=+109.448222ms)
        tlogger.go:116: INFO server.go:817 [core] [Server #333 ListenSocket #334]ListenSocket deleted  (t=+1[66](https://github.com/grpc/grpc-go/actions/runs/11989223270/job/33425355899?pr=7865#step:8:67).155594ms)
        tlogger.go:116: INFO clientconn.go:1199 [core] [Channel #335 SubChannel #337]Subchannel Connectivity change to IDLE  (t=+166.711573ms)
        tlogger.go:116: INFO clientconn.go:544 [core] [Channel #335]Channel Connectivity change to IDLE  (t=+1[67](https://github.com/grpc/grpc-go/actions/runs/11989223270/job/33425355899?pr=7865#step:8:68).187921ms)
        pickfirstleaf_ext_test.go:700: Failed to start server: net.Listen("tcp", "127.0.0.1:39795") = listen tcp 127.0.0.1:39795: bind: address already in use
        tlogger.go:116: INFO clientconn.go:544 [core] [Channel #335]Channel Connectivity change to SHUTDOWN  (t=+1[68](https://github.com/grpc/grpc-go/actions/runs/11989223270/job/33425355899?pr=7865#step:8:69).877822ms)
        tlogger.go:116: INFO resolver_wrapper.go:100 [core] [Channel #335]Closing the name resolver  (t=+1[69](https://github.com/grpc/grpc-go/actions/runs/11989223270/job/33425355899?pr=7865#step:8:70).140923ms)
        tlogger.go:116: INFO balancer_wrapper.go:147 [core] [Channel #335]ccBalancerWrapper: closing  (t=+169.365181ms)
        tlogger.go:116: INFO clientconn.go:1199 [core] [Channel #335 SubChannel #337]Subchannel Connectivity change to SHUTDOWN  (t=+169.843233ms)
        tlogger.go:116: INFO clientconn.go:1527 [core] [Channel #335 SubChannel #337]Subchannel deleted  (t=+1[70](https://github.com/grpc/grpc-go/actions/runs/11989223270/job/33425355899?pr=7865#step:8:71).014462ms)
        tlogger.go:116: INFO clientconn.go:314 [core] [Channel #335]Channel deleted  (t=+170.217891ms)
FAIL
FAIL    google.golang.org/grpc/balancer/pickfirst/pickfirstleaf 7.381s
arjan-bal commented 1 week ago

Failure link:

The error log: https://github.com/grpc/grpc-go/actions/runs/11989223270/job/33425355899?pr=7865

pickfirstleaf_ext_test.go:700: Failed to start server: net.Listen("tcp", "127.0.0.1:39795") = listen tcp 127.0.0.1:39795: bind: address already in use

The server was shut down, it couldn't bind to the same port when it tried to start again. I'll raise a PR that uses RestartableListeners so that the listener is not closed when the server is temporarily stopped.