sensu / sensu-go

Simple. Scalable. Multi-cloud monitoring.
https://sensu.io
MIT License

Failed to receive lease keepalive request from gRPC stream #4395

Closed agoddard closed 2 years ago

agoddard commented 3 years ago

In 6.4.0 we've started logging the following gRPC errors in the backend with embedded etcd. They're logged at debug level, very verbosely (every 1-2 seconds on a cluster with no agents). Not sure if this is a new issue or if we're now just logging something we previously weren't. Confirmed this behavior doesn't exist in 6.3.0 with an identical config.

Aug 20 23:32:49 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:49Z"}
Aug 20 23:32:50 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:50Z"}
Aug 20 23:32:51 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:51Z"}
Aug 20 23:32:54 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:54Z"}
Aug 20 23:32:55 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:55Z"}
Aug 20 23:32:56 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:56Z"}
Aug 20 23:32:59 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:59Z"}
Aug 20 23:33:00 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:33:00Z"}
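
To put a number on how often this message appears, the journal lines above can be counted with a small program. This is only a sketch: the struct fields are taken from the JSON keys visible in the log lines, while the names `etcdLogEntry` and `countKeepaliveErrors` are made up for illustration and are not part of Sensu or etcd.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// etcdLogEntry mirrors the JSON keys visible in the log lines above.
// The type name is hypothetical.
type etcdLogEntry struct {
	Component string `json:"component"`
	Level     string `json:"level"`
	Msg       string `json:"msg"`
	Error     string `json:"error"`
}

const keepaliveMsg = "failed to receive lease keepalive request from gRPC stream"

// countKeepaliveErrors counts lines whose JSON payload is the etcd
// keepalive message. Lines without a parsable JSON payload are skipped.
func countKeepaliveErrors(lines []string) int {
	n := 0
	for _, line := range lines {
		// The journal prefix ends where the JSON object begins.
		i := strings.Index(line, "{")
		if i < 0 {
			continue
		}
		var e etcdLogEntry
		if err := json.Unmarshal([]byte(line[i:]), &e); err != nil {
			continue
		}
		if e.Component == "etcd" && e.Msg == keepaliveMsg {
			n++
		}
	}
	return n
}

func main() {
	lines := []string{
		`Aug 20 23:32:49 backend-1 sensu-backend[6685]: {"component":"etcd","level":"debug","caller":"v3rpc/lease.go:118","msg":"failed to receive lease keepalive request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled","time":"2021-08-20T23:32:49Z"}`,
		`a line without a JSON payload`,
	}
	fmt.Println(countKeepaliveErrors(lines))
}
```

In practice the lines would come from the journal rather than a hard-coded slice; how they are read is left out here.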

agoddard commented 2 years ago

Still happens in 6.6.3

daswars commented 2 years ago

Hi, I see the same error around a hundred thousand times per day in my logs with version 6.6.6.

[Screenshot, 2022-03-31 20:24:00]

portertech commented 2 years ago

@amdprophet is there anything we can do about this?

amdprophet commented 2 years ago

@portertech I thought I had managed to reduce it recently, but it kept happening. I'm not sure it's possible to fix without a change to etcd.

amdprophet commented 2 years ago

This turned out to be caused by streams not being closed by the etcd clientv3 lessor keepAliveOnce method. I've opened a PR against etcd with a fix: https://github.com/etcd-io/etcd/pull/14357

echlebek commented 2 years ago

Moving to next milestone due to uncertainty of etcd release.

portertech commented 2 years ago

https://github.com/etcd-io/etcd/pull/14357 and https://github.com/etcd-io/etcd/pull/14361 have been merged/closed. Can we move forward? Is there a release we can use?

amdprophet commented 2 years ago

Yes, etcd v3.5.5 contains the fix. :)

ccressent commented 2 years ago

It seems like this can be closed, now that #4878 has been closed?

amdprophet commented 2 years ago

Indeed. Closed by #4878.