Closed friedrichg closed 4 months ago
This is also bad for compactors, as confirmed today.
The probes also seem to be designed wrong?
The ready endpoint for store-gateway returns:
Some services are not Running:
Running: 3
Starting: 1
So store-gateway is being killed because some other service isn't ready, but that other service doesn't have a probe?
+1
We should just remove it.
@friedrichg
Should we just drop the livenessProbe for store-gateway and compactor then?
@nschad yes, please
Sorry for the hold-up.
The PR is open: #502
@friedrichg
I have run all Cortex components for years without liveness probes. They are bad for store-gateways because they cause re-sharding when the store-gateways are under pressure. Restarted store-gateways can take a long time to recover, and the same goes for other components. We should disable all liveness probes by default and let users enable them as needed.
Similar to https://github.com/cortexproject/cortex-helm-chart/pull/263
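
For anyone who wants to turn the probes off before a chart change lands, a values override along these lines might work. This is only a sketch: it assumes the chart exposes `livenessProbe` under the `store_gateway` and `compactor` keys and that its templates omit the probe when the value is unset. The file name is hypothetical. Setting a key to `null` is the standard Helm way to drop a chart default.

```yaml
# values-override.yaml (hypothetical file name)
# Assumption: the chart reads store_gateway.livenessProbe and
# compactor.livenessProbe, and renders no probe when they are unset.
store_gateway:
  livenessProbe: null   # null removes the chart's default value for this key
compactor:
  livenessProbe: null
```

Running `helm template` with this file and checking that the rendered StatefulSets have no `livenessProbe` block is a quick way to confirm the assumption holds for your chart version.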