pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0

controller manager should check the PD svc/pod status. #293

Closed gregwebs closed 5 years ago

gregwebs commented 5 years ago

My cluster has a single PD pod stuck in status Pending. If I look at the controller manager logs, I see repeats of this:

pd_member_manager.go:204] failed to sync TidbCluster: [tidb54/demo]'s status, error: Get http://demo-pd.tidb54:2379/pd/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

tidb-operator should already know from Kubernetes that no pods are running for the service, either because the service has no endpoints or because it can look at the status of the underlying pods.

tennix commented 5 years ago

So you want to remove the annoying error logs in this case?

gregwebs commented 5 years ago

It would be great to see "service tidb54/demo-pd has no endpoints" instead. Knowing only that something could not connect within a timeout is a very vague failure mode, both for whoever reads the log and for tidb-operator itself.
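
A minimal sketch of the requested behavior, in Go: consult the endpoint and pod state first, and only attempt the PD health check when something can actually answer. The names serviceState, precheck, and syncStatus are hypothetical and are not taken from tidb-operator; a real controller would presumably fill this state from the Kubernetes API rather than from a hand-built struct.

```go
package main

import "fmt"

// serviceState is a hypothetical, simplified view of what a controller could
// read from the Kubernetes API for one Service; it is not a tidb-operator type.
type serviceState struct {
	Namespace      string
	Name           string
	ReadyEndpoints int      // ready addresses behind the Service
	PodPhases      []string // phases of the backing pods, e.g. "Pending"
}

// precheck reports a descriptive error when nothing can answer for the
// Service, so the caller need not wait for an HTTP timeout to find out.
func precheck(s serviceState) error {
	if s.ReadyEndpoints > 0 {
		return nil
	}
	return fmt.Errorf("service %s/%s has no endpoints; pod phases: %v",
		s.Namespace, s.Name, s.PodPhases)
}

// syncStatus runs the caller-supplied health check only if precheck passes.
func syncStatus(s serviceState, healthCheck func() error) error {
	if err := precheck(s); err != nil {
		return err
	}
	return healthCheck()
}

func main() {
	pd := serviceState{Namespace: "tidb54", Name: "demo-pd", PodPhases: []string{"Pending"}}
	err := syncStatus(pd, func() error { return nil })
	fmt.Println(err)
	// prints: service tidb54/demo-pd has no endpoints; pod phases: [Pending]
}
```

With this ordering the log names the actual cause (no endpoints, pod Pending) and the timeout error is reserved for cases where an endpoint exists but does not respond.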

zhaohaidao commented 5 years ago

Is anyone working on this? I'd like to give it a try.

tennix commented 5 years ago

@zhaohaidao Currently, no one is working on it, so just go ahead. We really appreciate your help.

weekface commented 5 years ago

Closing in favor of #545.