cockroachdb / cockroach-operator

k8s operator for CRDB
Apache License 2.0

Dashboard is showing deleted nodes #1012

Open entishman opened 11 months ago

entishman commented 11 months ago

Describe the problem

During their POC, Rivian tested adding and removing nodes from a 5-node CockroachDB cluster using our Kubernetes operator. They noticed that the dashboard continued to show deleted nodes after a pod was killed and re-instantiated.

To Reproduce

  1. Set up a self-hosted 5-node CockroachDB cluster using our Kubernetes operator. Ensure that there is one database node per Kubernetes pod. Confirm that all nodes are operational in the database console.
  2. Kill a Kubernetes pod along with its associated database node.
  3. Confirm that the node is shown as down in the dashboard.
  4. Let Kubernetes instantiate a new pod along with a new CockroachDB node.
  5. In the dashboard, the deleted CockroachDB node is still visible.
  6. The CLI no longer shows the dead node.

Expected behavior

I would expect that, as Kubernetes cycles pods through their lifecycle, old CockroachDB nodes would be decommissioned and removed from the dashboard. The customer was unable to decommission the node because the node no longer exists.
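To make the expectation concrete, here is a minimal sketch of the kind of check I have in mind: given a list of node records, pick out the nodes that are no longer live but were never decommissioned. This is not the operator's actual logic. `NodeRecord`, its fields, and the sample data are hypothetical and only mirror the scenario in this report, where n3 was killed and replaced by a new node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRecord:
    """Hypothetical view of one CockroachDB node as the cluster reports it."""
    node_id: int
    is_live: bool
    decommissioned: bool


def stale_node_ids(records):
    """Return IDs of nodes that are not live and were never decommissioned.

    These are the nodes that would keep cluttering the dashboard.
    """
    return sorted(
        r.node_id
        for r in records
        if not r.is_live and not r.decommissioned
    )


# Hypothetical data: n3's pod was killed and a new node (n6) joined in its place.
SAMPLE = [
    NodeRecord(node_id=1, is_live=True, decommissioned=False),
    NodeRecord(node_id=2, is_live=True, decommissioned=False),
    NodeRecord(node_id=3, is_live=False, decommissioned=False),
    NodeRecord(node_id=4, is_live=True, decommissioned=False),
    NodeRecord(node_id=5, is_live=True, decommissioned=False),
    NodeRecord(node_id=6, is_live=True, decommissioned=False),
]

print(stale_node_ids(SAMPLE))  # prints [3]
```

In this sketch, every ID returned is a node I would expect to be decommissioned automatically so that it stops appearing in the dashboard.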

Additional data / screenshots

As can be seen in the attached screenshot, node n3 was removed from the cluster but is still visible in the dashboard.

Environment:

Additional context

This bug was discovered during a POC, and its impact was primarily cosmetic. I don't think there were any problems other than the dashboard getting cluttered with old nodes.