strangelove-ventures / cosmos-operator

Cosmos Operator is a Kubernetes operator for managing Cosmos nodes
Apache License 2.0

CosmosFullNode: Rollouts appear to be intermittently broken #328

Closed DavidNix closed 1 year ago

DavidNix commented 1 year ago

I'm 90% sure this is due to the new caching feature. (That old saying: "Caching is hard.")

Steps to reproduce.

  1. Set up a CosmosFullNode with at least 3 replicas (perhaps more).
  2. Disable readiness probes.
  3. Cause a pod update, such as creating annotations on the pod.

Result (may be intermittent): More than one pod is terminated at once. Further pods are deleted (rebooted) before the replacement pods are ready.

Expected: Only one pod is rebooted at a time. Other pods should not be deleted until the first pod is in sync with the chain tip.