risingwavelabs / risingwave-operator

RisingWave Kubernetes Operator
https://www.risingwave.com/cloud
Apache License 2.0

Improve liveness/readiness check #643

Open arkbriar opened 5 months ago

arkbriar commented 5 months ago

Issue found in the PR https://github.com/risingwavelabs/risingwave-operator/pull/642#issuecomment-2078610498

A possible solution:

The problem is caused by slow compute readiness: the compute node depends on the meta node being ready first and on k8s setting up the DNS. Only after the compute node connects to the meta node does it start its server. I think the best solution is to use a separate port for the health checks of compute/compactor nodes.
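
To illustrate the separate-port idea, here is a minimal sketch, not RisingWave's actual implementation. It assumes a hypothetical health server with `/livez` and `/readyz` paths that starts before the main service, so liveness can succeed while the node is still waiting for meta and DNS. Python is used only for illustration; the names `make_health_server` and `probe` are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_health_server(ready: threading.Event, port: int = 0) -> ThreadingHTTPServer:
    """Build a health-check server bound to its own port (0 = pick a free one)."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/livez":
                # Liveness: the process is up, regardless of meta or DNS state.
                self._reply(200, b"alive")
            elif self.path == "/readyz":
                # Readiness: true only once the main service has started.
                if ready.is_set():
                    self._reply(200, b"ready")
                else:
                    self._reply(503, b"not ready")
            else:
                self._reply(404, b"not found")

        def _reply(self, status, body):
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep probe traffic out of the logs

    return ThreadingHTTPServer(("127.0.0.1", port), HealthHandler)


def probe(port: int, path: str) -> int:
    """Return the HTTP status code of a GET against the health port."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    ready = threading.Event()
    server = make_health_server(ready)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    print(probe(port, "/livez"), probe(port, "/readyz"))
    ready.set()  # stands in for "connected to meta, main server started"
    print(probe(port, "/readyz"))
    server.shutdown()
    server.server_close()
```

With a layout like this, the liveness probe could target the health port's `/livez` path and the readiness probe its `/readyz` path, so a slow meta connection delays readiness without causing liveness failures and restarts.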