redpanda-data / helm-charts

Redpanda Helm Chart
http://redpanda.com
Apache License 2.0

Helm chart should only trigger restart on cluster-level config updates when needs-restart=true #595

Open hcoyote opened 1 year ago

hcoyote commented 1 year ago

What would you like to be added?

Slack thread: https://redpandadata.slack.com/archives/C01H6JRQX1S/p1689629808218159

Changes to cluster-level configs should use the needs-restart value reported by rpk cluster config status to decide whether a rolling restart should occur after we apply rpk cluster config set commands to cluster-level values. Not every property requires a restart: about 30-35 really need one, and the others apply live.
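As a rough illustration of the check proposed above, here is a minimal Python sketch that reads the text output of rpk cluster config status and reports which nodes say they need a restart. The table layout (a header row containing NODE and NEEDS-RESTART columns, whitespace-separated) is an assumption about rpk's output, and nodes_needing_restart and SAMPLE_STATUS are hypothetical names used only for this example.

```python
from typing import List

# Assumed shape of `rpk cluster config status` output (illustrative only).
SAMPLE_STATUS = """\
NODE  CONFIG-VERSION  NEEDS-RESTART  INVALID  UNKNOWN
0     5               false          []       []
1     5               true           []       []
2     5               true           []       []
"""


def nodes_needing_restart(status_output: str) -> List[int]:
    """Return the IDs of nodes whose NEEDS-RESTART column reads true."""
    lines = [ln for ln in status_output.splitlines() if ln.strip()]
    if not lines:
        return []

    # Locate columns by header name rather than by fixed position.
    header = lines[0].split()
    if "NODE" not in header or "NEEDS-RESTART" not in header:
        raise ValueError("unexpected header: %r" % lines[0])
    node_col = header.index("NODE")
    restart_col = header.index("NEEDS-RESTART")

    nodes = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) <= max(node_col, restart_col):
            continue  # skip rows too short to contain both columns
        if fields[restart_col].lower() == "true":
            nodes.append(int(fields[node_col]))
    return nodes


if __name__ == "__main__":
    pending = nodes_needing_restart(SAMPLE_STATUS)
    # A rolling restart would only be triggered when this list is non-empty.
    print(pending)  # [1, 2]
```

With a check like this, a rolling restart would be skipped whenever the returned list is empty, meaning every changed property applied live.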

Why is this needed?

Currently, we look at the checksum of the ConfigMap to determine whether we need to restart. I think this means we restart the cluster more aggressively than we need to. If we can rely on rpk cluster config status, we can avoid unnecessary restarts, which can be lengthy and disruptive on larger clusters.

https://github.com/redpanda-data/helm-charts/blob/29034f55dc948f55e612b147c20a5ffa0e65be1c/charts/redpanda/templates/statefulset.yaml#L60-L61
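To illustrate the difference, here is a minimal Python sketch comparing a whole-config checksum trigger with a check limited to restart-requiring properties. The property names and the RESTART_KEYS set are hypothetical placeholders, not real Redpanda properties, and the hashing is only an analogy for a checksum annotation, not the chart's actual template logic.

```python
import hashlib
import json

# Hypothetical placeholder: properties assumed to require a restart.
RESTART_KEYS = {"restart_setting"}

_MISSING = object()  # sentinel so added/removed keys count as changes


def config_checksum(config: dict) -> str:
    """Hash the whole rendered config, as a checksum-based trigger would."""
    rendered = json.dumps(config, sort_keys=True)
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


def checksum_triggers_restart(old: dict, new: dict) -> bool:
    """Any change at all alters the checksum and therefore forces a restart."""
    return config_checksum(old) != config_checksum(new)


def changed_keys(old: dict, new: dict) -> set:
    """Keys whose values differ, including keys added or removed."""
    return {
        k
        for k in old.keys() | new.keys()
        if old.get(k, _MISSING) != new.get(k, _MISSING)
    }


def restart_required(old: dict, new: dict, restart_keys: set) -> bool:
    """Restart only when a changed key is one that needs a restart."""
    return bool(changed_keys(old, new) & restart_keys)


if __name__ == "__main__":
    old = {"live_setting": 1, "restart_setting": "a"}
    new = {"live_setting": 2, "restart_setting": "a"}  # only a live change
    print(checksum_triggers_restart(old, new))       # True
    print(restart_required(old, new, RESTART_KEYS))  # False
```

In practice the chart would not need to maintain its own list of restart-requiring keys if it can read that answer from rpk cluster config status, which is the point of this request.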

JIRA Link: K8S-48

hcoyote commented 1 year ago

Not sure if this affects operator as well, but worth a look.

joejulian commented 1 year ago

> Not sure if this affects operator as well, but worth a look.

That's one of the beauties of the new Redpanda controller. If it's fixed here, it's fixed there.

vuldin commented 12 months ago

There needs to be a needs-restart parameter as described, but I think it should also take into account other node-level changes such as external advertised listeners.

For example, today it's not possible to update external advertised listeners. This is a frequent request, especially from new users who are just starting out: they are surprised to see that their brokers are advertising redpanda-0 or redpanda-0.local and want to know how to change this. The only option is to uninstall and then redeploy.

node-restart could be set to true based on multiple checks:

Another approach could be to have Redpanda itself update the NEEDS-RESTART value whenever updates are detected in a broker's /etc/redpanda/redpanda.yaml file. This would mean the helm chart would only respond to NEEDS-RESTART: true and wouldn't have to keep track of any other config changes (related Redpanda issue).