Open tjungblu opened 7 months ago
It's not easy to figure out which members are on the latest revision with the existing tooling.
Does this meet your requirement?
$ ./bin/etcdutl snapshot status ~/box/open_source/etcd/data/k8s_1.21.5/db -w table
+----------+----------+------------+------------+---------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE | VERSION |
+----------+----------+------------+------------+---------+
| 30089a77 | 2805430  |       1296 |     7.5 MB |         |
+----------+----------+------------+------------+---------+
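For example, assuming the db files of the surviving members have been copied somewhere reachable (the paths below are made up for illustration), the revisions can be compared one by one and the member with the highest revision picked:
$ ./bin/etcdutl snapshot status /backup/member-1/snap/db -w table
$ ./bin/etcdutl snapshot status /backup/member-2/snap/db -w table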
To aid recovery, I would like to propose three new commands to etcdutl for querying and manipulating the existing dataDir, in correspondence to what we currently have in etcdctl:
You can specify all the new members when executing the etcdutl snapshot restore command, so you don't necessarily have to add/remove members on an offline db file.
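For reference, a minimal sketch of such a restore, using the documented etcdutl snapshot restore flags with placeholder member names and peer URLs; the command is run once per new member, with that member's own --name and --initial-advertise-peer-urls:
$ ./bin/etcdutl snapshot restore /backup/db \
    --name m1 \
    --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
    --initial-advertise-peer-urls http://host1:2380 \
    --data-dir /var/lib/etcd-m1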
good catch, let me try to restore my cluster with those commands. I'll update the docs then :)
@ahrtr do I understand the code in etcdutl snapshot restore correctly, that it will not attempt to replay the WAL?
Just a more concrete example:
[core@master-3 ~]$ sudo ls -l /var/lib/etcd/member/snap
total 92248
-rw-r--r--. 1 root root 35459 Mar 21 16:40 0000000000000012-00000000000629bf.snap
-rw-r--r--. 1 root root 37687 Mar 22 09:13 0000000000000014-00000000000650d0.snap
-rw-r--r--. 1 root root 37687 Mar 22 09:24 0000000000000014-00000000000677e1.snap
-rw-r--r--. 1 root root 37687 Mar 25 10:03 0000000000000015-0000000000083d6c.snap
-rw-r--r--. 1 root root 39918 Mar 25 10:09 0000000000000016-000000000008647d.snap
-rw-------. 1 root root 108752896 Mar 25 10:46 db
[core@master-3 ~]$ sudo ls -l /var/lib/etcd/member/wal
total 375044
-rw-------. 1 root root 64000000 Mar 25 10:46 0.tmp
-rw-------. 1 root root 64010344 Mar 22 09:13 000000000000001f-0000000000063002.wal
-rw-------. 1 root root 64002200 Mar 22 09:23 0000000000000020-0000000000064f0e.wal
-rw-------. 1 root root 64010672 Mar 22 09:35 0000000000000021-00000000000674b2.wal
-rw-------. 1 root root 64015728 Mar 25 10:11 0000000000000022-0000000000068f8b.wal
-rw-------. 1 root root 64000000 Mar 25 10:46 0000000000000023-0000000000086917.wal
If I attempt to restore using ./etcdutl snapshot restore /var/lib/etcd/member/snap/db --data-dir /tmp/restored_datadir --skip-hash-check, will it ignore all the data in the WAL directory?
that it will not attempt to replay the WAL?
Correct, I believe so. The etcdutl snapshot command only reads the db or v3_snapshot file.
That's a bummer, because that's exactly what you would need when you just have one last member running.
I've updated the docs to give them a bit more structure: https://github.com/etcd-io/website/pull/818 - also explaining https://github.com/kubernetes/kubernetes/issues/118501 along the way.
What would you like to be added?
We would like to reduce the MTTR on clusters that suffer from irreparable quorum loss. To give an example: two out of three members are gone for good, as in the classic story where somebody came to your datacenter with an axe and smashed the actual servers, and they are not recoverable by any means. It is also important that they are not expected to come back anytime soon, i.e. this is not a temporary network partition scenario.
Mind you, lost quorum also means that etcdctl will not work anymore, so the restoration procedure in our docs is mostly moot. It's also unlikely that there is a backup snapshot that is more recent than the current state in the dataDir.

While --force-new-cluster is an option in a three-member cluster with one member left, it's a bit cumbersome to reconfigure when you run with static pods. This gets even more annoying with a five-member setup where three are gone: which of the remaining two do you choose to continue? It's not easy to figure out which members are on the latest revision with the existing tooling.

To aid recovery, I would like to propose three new commands to etcdutl for querying and manipulating the existing dataDir, in correspondence to what we currently have in etcdctl:

member list - dumps the current membership store according to the supplied format (simple, table, json, yaml)
member remove <member id(s)> - similar to force-new-cluster [1], this rewrites the current membership store by filtering out the supplied member id(s)
member add <member name> --peer-urls <peer urls> - adds the given member to the current membership store
member promote <member name> - promotes a learner (from https://github.com/etcd-io/etcd/discussions/17794)

Reading revisions is already implemented in etcdutl snapshot status.

As with etcdutl defrag, those commands are only ever intended to be run while etcd is not running; remove and add should be considered unsafe. We need to consider the impact of the ongoing membership storage migration, but the command line needs to be backward compatible with both stores anyway.

[1] https://github.com/etcd-io/etcd/blob/9359aef3e3dd39b7bbf57cab4b6899a238af3144/server/etcdserver/bootstrap.go#L568-L571
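To make the proposal concrete, a purely hypothetical usage sketch of the commands above (none of this exists yet; a --data-dir flag is assumed here to point at the offline member directory, and the member ID and URLs are placeholders):
$ ./bin/etcdutl member list --data-dir /var/lib/etcd -w table
$ ./bin/etcdutl member remove 8e9e05c52164694d --data-dir /var/lib/etcd
$ ./bin/etcdutl member add infra1 --peer-urls http://10.0.1.10:2380 --data-dir /var/lib/etcd
$ ./bin/etcdutl member promote infra1 --data-dir /var/lib/etcd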
Why is this needed?
Currently there is no way to manipulate the cluster membership without a live cluster and quorum.
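For contrast, the existing member commands are only available through etcdctl against a live cluster (endpoint, member ID and URLs below are placeholders):
$ etcdctl --endpoints=https://10.0.1.10:2379 member list -w table
$ etcdctl --endpoints=https://10.0.1.10:2379 member remove 8e9e05c52164694d
$ etcdctl --endpoints=https://10.0.1.10:2379 member add infra2 --peer-urls=http://10.0.1.11:2380
Once quorum is lost, the add/remove calls cannot go through anymore, which is exactly the gap the proposed offline etcdutl commands would fill.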