automatic re-seeding sounds dangerous

mesosphere-backup / etcd-mesos

self-healing etcd on mesos!

Apache License 2.0

automatic re-seeding sounds dangerous #85

Closed ongardie closed 9 years ago

ongardie commented 9 years ago

Hey, I just stumbled across the project, and I'm concerned about the automatic re-seeding. If you can't access a majority of an etcd/raft cluster, you don't know if you've lost committed data. Rolling with it breaks the promise that etcd/raft makes to its clients, so automatically trying to recover seems like it could cause a lot of trouble. Making this the default behavior is even worse.
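The counting argument behind this concern can be sketched in a few lines. This is an illustrative model only, not etcd or etcd-mesos code. It assumes a committed entry is guaranteed to be on a majority of members and possibly on no others, and the function names are made up for the example.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of members that forms a Raft quorum."""
    return cluster_size // 2 + 1


def committed_entry_may_be_absent(cluster_size: int, reachable: int) -> bool:
    """Return True if a committed entry could exist on none of the reachable members.

    Assumption: in the worst case a committed entry lives on exactly
    majority(cluster_size) members. If at least that many members are
    unreachable, every copy may be among the members we cannot see.
    """
    if not 0 <= reachable <= cluster_size:
        raise ValueError("reachable must be between 0 and cluster_size")
    unreachable = cluster_size - reachable
    return unreachable >= majority(cluster_size)


if __name__ == "__main__":
    for size, up in [(3, 2), (3, 1), (5, 3), (5, 2)]:
        print(size, up, committed_entry_may_be_absent(size, up))
```

With 5 members and only 2 reachable, the 3 missing members could be exactly the ones that acknowledged the last committed writes, so the survivors cannot tell whether anything was lost.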

Do you have evidence that automating this is even necessary in practice? I like automated systems too, but I'd rather get human approval any time I'm admitting possible data loss.

cc @philips

philips commented 9 years ago

I agree: any non-human-controlled recovery that isn't based on perfect knowledge of the state of the infrastructure (AWS/GCE/OpenStack/etc. APIs) is doomed to cause accidental data loss.

Now, etcd will protect you from cluster misconfigurations by blocking RPCs from the old cluster, but re-seeding is still a rather unsafe operation for most use cases.

spacejam commented 9 years ago

Yes, when a majority of the cluster is lost, this does break your safety properties in exchange for increased availability. The trade-off is documented in the administration guide. I agree that this is undesirable for some production uses of etcd, and I may make this default to off before moving it out of alpha state.

Reasons for it to be on, at least for the time being:

- it increases the code coverage people will exercise, which supports my call to arms to find bugs in this
- all known users are storing recomputable data (service discovery, IPAM, k8s, etc.)
- all known users prefer higher availability in the event of a catastrophe
- in a catastrophe that kills a majority of a cluster, there are likely to be far worse problems to deal with than some stale configuration data, and in such cases having an available write path is likely to be important in the recovery process
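The trade-off under discussion (re-seed automatically versus wait for a human) could be modeled roughly as below. This is a hypothetical sketch of the policy choice, not the actual etcd-mesos logic. `Action`, `recovery_action`, `auto_reseed`, and `operator_approved` are names invented for illustration and do not correspond to real flags.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                            # quorum intact, nothing unsafe needed
    WAIT_FOR_OPERATOR = "wait_for_operator"  # stay unavailable until a human decides
    RESEED = "reseed"                        # form a new cluster from a survivor


def recovery_action(total: int, live: int,
                    auto_reseed: bool = False,
                    operator_approved: bool = False) -> Action:
    """Pick a recovery action for a cluster with `live` of `total` members up.

    Assumption: re-seeding is only considered once quorum is lost, and it
    proceeds either automatically (availability first) or only after
    explicit human approval (safety first).
    """
    if not 0 <= live <= total:
        raise ValueError("live must be between 0 and total")
    if live >= total // 2 + 1:
        return Action.NONE
    if auto_reseed or operator_approved:
        return Action.RESEED
    return Action.WAIT_FOR_OPERATOR


if __name__ == "__main__":
    print(recovery_action(5, 2))                          # safety-first default
    print(recovery_action(5, 2, auto_reseed=True))        # availability-first
    print(recovery_action(5, 2, operator_approved=True))  # human signed off
```

Defaulting `auto_reseed` to off in this sketch mirrors the safety-first position raised in the thread. Turning it on reflects the availability-first reasons listed above.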

philips commented 9 years ago

Can you mark it clearly in the docs as a dangerous and destructive operation? I would argue that under many use cases the data is not recomputable, or not easily recomputable. And even if it is, many pieces of software are not regularly tested for "time travel".

spacejam commented 9 years ago

I agree that this should be more clearly documented! re: time travel https://github.com/coreos/etcd/issues/3879 :P

philips commented 9 years ago

@spacejam sure, this is part of the read semantics of etcd. Having writes time travel is way more dangerous.

spacejam commented 9 years ago

It's pretty application-specific :) Feel free to submit a PR for the docs!