cockroachdb / cockroachdb-cloudformation

Quickly set up dev/test CockroachDB clusters using AWS CloudFormation and Kubernetes
https://www.cockroachlabs.com/
Apache License 2.0

If a majority of the cluster goes down, it won't be able to recover #9

Closed: a-robinson closed this issue 6 years ago

a-robinson commented 6 years ago

See https://github.com/cockroachdb/cockroach/pull/13580 for the explanation, but we had to remove the /health HTTP health checks from our Kubernetes configuration for a reason.

If we're ok with not being able to recover from something like all the VMs restarting, then the health checks can be left in, but I just want to bring it up because I don't know whether you consciously made that choice, @nstewart.

nstewart commented 6 years ago

This was intentional. During early testing, k8s appeared to route to nodes before they were ready (though it could have been another issue), creating a weird UX when people tried to access the cluster immediately after the template reached the "CREATE COMPLETE" state. I wanted to optimize for great first impressions rather than for an edge case, given that these were nonproduction clusters.

Note: I wasn't able to reproduce that behavior consistently, but it definitely went away after I added the health checks back.

We have much better tests now, so I could try again without the health checks later, but that was the initial thinking.

a-robinson commented 6 years ago

That's fine, just double-checking. Feel free to close this.
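
The thread defers the explanation of the recovery problem to the linked pull request and does not restate it. As a rough illustration only, the toy model below assumes one possible mechanism: a peer becomes discoverable only after it passes the health check, and a node passes the health check only once it can reach a quorum. That mechanism is an assumption, and the names `Node`, `visible_peers`, and `restart_all` are hypothetical; nothing here is taken from the CockroachDB or Kubernetes code.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    ready: bool = False  # every node starts unready after a full restart


def visible_peers(node, nodes, gate_on_health):
    """Peers this node can discover.

    ASSUMPTION: when health gating is on, only peers that already pass
    the health check are discoverable.
    """
    return [p for p in nodes if p is not node and (p.ready or not gate_on_health)]


def restart_all(n, gate_on_health, rounds=5):
    """Simulate all n nodes restarting at once; return final readiness."""
    nodes = [Node(f"node-{i}") for i in range(n)]
    quorum = n // 2 + 1
    for _ in range(rounds):
        # Evaluate every node against the same snapshot, then apply.
        next_ready = []
        for node in nodes:
            reachable = 1 + len(visible_peers(node, nodes, gate_on_health))
            # ASSUMPTION: a node is healthy once it can reach a quorum.
            next_ready.append(node.ready or reachable >= quorum)
        for node, ready in zip(nodes, next_ready):
            node.ready = ready
    return [node.ready for node in nodes]


if __name__ == "__main__":
    print("with health gating:   ", restart_all(3, gate_on_health=True))
    print("without health gating:", restart_all(3, gate_on_health=False))
```

Under these assumptions, the gated three-node cluster never becomes ready after a full restart, because no node can see a peer until some node is already healthy. Without gating, every node reaches a quorum in the first round. The cost of removing the gate is the one nstewart describes: traffic may be routed to nodes before they are ready.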