Question: Should the liveness probe be serializable or linearizable?

We have seen issues where the host running the etcd leader has had network issues, so a leader election gets triggered.

For some reason the leader election takes some time to complete. I see messages such as:

2018-05-27 00:05:44.555593 I | raft: c2245d8bae08b6a9 is starting a new election at term 12
2018-05-27 00:05:44.555643 I | raft: c2245d8bae08b6a9 became candidate at term 13
2018-05-27 00:05:44.555680 I | raft: c2245d8bae08b6a9 received MsgVoteResp from c2245d8bae08b6a9 at term 13
2018-05-27 00:05:44.555705 I | raft: c2245d8bae08b6a9 [logterm: 3, index: 219992] sent MsgVote request to 45f494573a4dfbde at term 13
2018-05-27 00:05:44.555725 I | raft: c2245d8bae08b6a9 [logterm: 3, index: 219992] sent MsgVote request to 14b74bc0053d38cc at term 13
2018-05-27 00:05:44.555746 I | raft: c2245d8bae08b6a9 [logterm: 3, index: 219992] sent MsgVote request to 536faad3d4833413 at term 13
2018-05-27 00:05:44.558090 I | raft: c2245d8bae08b6a9 received MsgVoteResp from 536faad3d4833413 at term 13
2018-05-27 00:05:44.558129 I | raft: c2245d8bae08b6a9 [quorum:3] has received 2 MsgVoteResp votes and 0 vote rejections

I'm wondering whether the [quorum:3] in this last message means that it needs 3 votes for a quorum but has only registered 2. At this point etcd-operator had tried to add a new member, but the failed one hadn't been removed yet.
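To make the arithmetic concrete, here is a minimal sketch of the majority rule Raft uses. It assumes the cluster had 4 voting members at this point (the candidate plus the three peers it sent MsgVote to in the log above); the function name `quorum` is only for illustration and is not taken from the etcd source.

```go
package main

import "fmt"

// quorum returns the number of votes a candidate needs to win an election
// in a cluster of n voting members: a strict majority.
func quorum(n int) int {
	return n/2 + 1
}

func main() {
	// Assumption: the candidate plus the three peers it sent MsgVote to
	// make up a 4-member cluster (3 original members + 1 newly added).
	members := 4
	// Votes seen in the log: the candidate's own vote plus the response
	// from 536faad3d4833413.
	votes := 2
	fmt.Printf("members=%d quorum=%d votes=%d elected=%t\n",
		members, quorum(members), votes, votes >= quorum(members))
}
```

If that assumption holds, adding the fourth member before removing the failed one raised the quorum from 2 to 3, so the two healthy original members could no longer elect a leader on their own.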

While the leader elections are occurring, read requests appear to take a long time to complete, e.g.:

2018-05-27 00:05:34.886466 W | etcdserver: read-only range request "key:\"/registry/services/specs/\" range_end:\"/registry/services/specs0\" " took too long (50.108631311s) to execute

(These requests normally take a few milliseconds)

The 2 surviving etcd pods then end up being killed because their liveness probes fail: the probe does a linearizable read, which I assume is slow while a leader election is in progress. So we lose the whole cluster at this point.

I'm wondering if the liveness probe would be better as a serializable get, since this more accurately reflects the status of the pod being checked rather than the etcd cluster as a whole. I think this would have prevented the whole cluster being killed when 1 pod failed, as happened in this instance.
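To illustrate the difference I have in mind, here is a toy model of the two read modes. It is not etcd code: the `member` type, its fields, and `errNoLeader` are all made up for this sketch, and a real linearizable read would block until the election finishes or the request times out rather than fail immediately.

```go
package main

import (
	"errors"
	"fmt"
)

// member is a toy stand-in for a single etcd member (hypothetical type).
type member struct {
	hasLeader bool              // whether the member currently knows a leader
	local     map[string]string // the member's own, possibly stale, store
}

var errNoLeader = errors.New("no leader: linearizable read cannot complete")

// get models the two read modes. A serializable read is answered from the
// member's local store. A linearizable read needs a leader to confirm it,
// so in this model it fails while an election is in progress.
func (m member) get(key string, serializable bool) (string, error) {
	if !serializable && !m.hasLeader {
		return "", errNoLeader
	}
	return m.local[key], nil
}

func main() {
	// A healthy member in the middle of a leader election.
	m := member{hasLeader: false, local: map[string]string{"foo": "bar"}}

	_, linErr := m.get("foo", false)
	val, serErr := m.get("foo", true)

	fmt.Println("linearizable:", linErr)
	fmt.Printf("serializable: %q, err=%v\n", val, serErr)
}
```

In the real client I believe this corresponds to the serializable read option (`clientv3.WithSerializable()` in Go, or `--consistency=s` for `etcdctl get`), which answers from the local member without going through consensus.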

However I noticed in the code where this is set that there is a specific comment about it being linearizable, so I'm wondering if I'm missing something...

https://github.com/coreos/etcd-operator/blob/v0.9.2/pkg/util/k8sutil/pod_util.go#L72-L73

coreos/etcd-operator, issue #1966