Open iMikeG6 opened 2 years ago
Never mind, I didn't realize that the etcd settings support extraArgs in the Helm values. Still, documenting --auto-compaction-mode, --auto-compaction-retention and --quota-backend-bytes would be a great help, as would adding to the troubleshooting section the fix for the error etcdhttp/metrics.go:79 /health error ALARM NOSPACE status-code 503
I don't have much expertise when it comes to tweaking etcd options, but if somebody can raise a PR for this issue and back the recommendations by reputable sources then I can review the PR and help to get it over the line. Based on this I'll add the "help-wanted" label. @iMikeG6 would you be interested in contributing a PR for this? You seem to know a lot about etcd. :)
I'm not an etcd expert; I simply googled and found some posts that discuss similar issues. My hope is that this will help other people who face the same issue I had.
Is your feature request related to a problem?
The problem started to appear on one of our tenants, whose etcd members started to throw errors like
etcdhttp/metrics.go:79 /health error ALARM NOSPACE status-code 503
and the etcd node health checks constantly failed. Consequently, etcd failed to start and the vcluster became unusable.
Which solution do you suggest?
On the vcluster etcd StatefulSet, add an option to set --quota-backend-bytes and/or perhaps set a default value of 4294967296 (4 GiB) that can be overridden via a Helm config value, along with these two other flags: --auto-compaction-mode=periodic and --auto-compaction-retention=30m
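To make the arithmetic behind the suggested default explicit, here is a minimal Python sketch that derives the byte value from a size in GiB and assembles the three flags from this request. The function name `etcd_extra_args` is hypothetical and only for illustration. The resulting strings are what one would pass to etcd, for example through the extraArgs Helm value mentioned in the comments; the exact key path in the chart is not shown in this thread, so check the chart's values file.

```python
from typing import List

GIB = 1024 ** 3  # bytes in one gibibyte


def etcd_extra_args(quota_gib: int = 4, retention: str = "30m") -> List[str]:
    """Build the etcd flags proposed in this issue (hypothetical helper)."""
    quota_bytes = quota_gib * GIB  # 4 GiB -> 4294967296 bytes
    return [
        f"--quota-backend-bytes={quota_bytes}",
        "--auto-compaction-mode=periodic",
        f"--auto-compaction-retention={retention}",
    ]


args = etcd_extra_args()
for flag in args:
    print(flag)
```

Note that 4294967296 bytes is 4 GiB (4 x 1024^3), which is slightly more than 4 GB in decimal units.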
Also, add documentation on how to fix the issue. Here is what I did on our side:
Pause the cluster, then restart the StatefulSet vc1-etcd
Connect to etcd-0
export the following
Get the current revision number
Compact the database
Run an etcd defrag
Confirm the disk usage has been reduced
Then remove the NOSPACE alarm
Now edit the vcluster StatefulSet manually and add the new command args
Finally, resume cluster
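The exact commands for the steps above did not survive in this thread, so the following is only a sketch of the etcd part of the procedure (get the revision, compact, defrag, check usage, disarm the alarm), written in Python so the revision parsing can be checked. It assumes the standard etcdctl subcommands `endpoint status`, `compact`, `defrag` and `alarm disarm`, and that the connection settings (endpoints, certificates) were already exported in the shell as in the "export the following" step. The helper names and the sample JSON are illustrative, not taken from the original report.

```python
import json
from typing import List


def current_revision(status_json: str) -> int:
    """Read the store revision from `etcdctl endpoint status --write-out=json` output."""
    statuses = json.loads(status_json)
    # The output is a list with one entry per queried endpoint; use the first one.
    return statuses[0]["Status"]["header"]["revision"]


def recovery_commands(revision: int) -> List[List[str]]:
    """Return the etcdctl invocations for the compaction and defrag steps, in order."""
    return [
        ["etcdctl", "compact", str(revision)],                   # compact the database
        ["etcdctl", "defrag"],                                   # run an etcd defrag
        ["etcdctl", "endpoint", "status", "--write-out=table"],  # confirm DB size dropped
        ["etcdctl", "alarm", "disarm"],                          # remove the NOSPACE alarm
    ]


# Illustrative, trimmed sample of the JSON status output (values are made up).
sample = '[{"Endpoint": "127.0.0.1:2379", "Status": {"header": {"revision": 1234567}}}]'
rev = current_revision(sample)
for cmd in recovery_commands(rev):
    print(" ".join(cmd))
```

The sketch only prints the commands; running them inside the etcd-0 pod, and pausing and resuming the vcluster around them, is left to the operator.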
Which alternative solutions exist?
None, other than editing the StatefulSet manually to add the new command args.
Additional context
vcluster version: 0.11.1
Kubernetes version: 1.23.7
vcluster distro: k8s (HA)