Open · XiuhuaRuan opened this issue 2 months ago
I'll try to reproduce it. Thanks for the detailed instructions, @XiuhuaRuan.
/assign
I was able to duplicate this issue with 3.5.15 and the main branch. I used the following Procfile:

```
etcd1: bin/etcd --name infra1 --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:12380 --initial-advertise-peer-urls http://127.0.0.1:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380' --logger=zap --log-outputs=stderr --force-new-cluster
etcd2: bin/etcd --name infra2 --listen-client-urls http://127.0.0.1:22379 --advertise-client-urls http://127.0.0.1:22379 --listen-peer-urls http://127.0.0.1:22380 --initial-advertise-peer-urls http://127.0.0.1:22380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380' --logger=zap --log-outputs=stderr --initial-cluster-state existing
```
And the following script (after running make build):

```bash
rm -rf infra*.etcd
trap 'goreman run stop-all' EXIT
goreman start &
sleep 3
bin/etcdctl --endpoints=127.0.0.1:2379 member list
bin/etcdctl --endpoints=127.0.0.1:2379 member add infra2 --peer-urls=http://127.0.0.1:22380
sleep 1
SECONDS=0
goreman run start etcd2 &
while ! bin/etcdctl --command-timeout=0.1s --endpoints=127.0.0.1:2379 member list | grep infra2; do
  sleep 0.5
done
elapsed="$SECONDS"
goreman run stop-all
echo "second node joined in $elapsed seconds"
```
Find the files attached and remove the .txt extension to make them work.
Thanks @ivanvc for reproducing and @XiuhuaRuan for raising this. It looks like this could be related to the downgrade implementation. @siyuanfoundation, is this behaviour expected?
When a new member joins an existing cluster, it needs to check whether its version is compatible with the current cluster version, and the new member's version can be lower than the cluster version only when downgrade is enabled. That is the background for needing to check the /downgrade/enabled endpoint.
On the serving side, when handling /downgrade/enabled, the server does a linearizableReadNotify check to make sure it reads the latest downgradeInfo from the backend.
In the case of adding a new member to a 1-node cluster, after etcdctl member add the cluster is in a leaderless state, because only 1 of the 2 members is running. So that one member cannot complete the linearizable read, and it retries until the request times out, after which it defaults to downgradeEnabled=false. That is why it can take 7 seconds to proceed.
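To make the failure mode concrete, here is a minimal, self-contained Go sketch of that flow. The names (waitForQuorumRead, downgradeEnabled) and the 2-second timeout are illustrative, not the actual etcd code; the point is only that the quorum-read barrier blocks until the context expires, after which the caller falls back to downgradeEnabled=false:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForQuorumRead stands in for etcd's linearizable read barrier
// (linearizableReadNotify): it blocks until a quorum has confirmed the read
// index, or until ctx expires.
func waitForQuorumRead(ctx context.Context, quorumReady <-chan struct{}) error {
	select {
	case <-quorumReady:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// downgradeEnabled models serving /downgrade/enabled: confirm a linearizable
// read first, then report the stored flag; fall back to false if the read
// cannot be confirmed before the timeout.
func downgradeEnabled(timeout time.Duration, quorumReady <-chan struct{}, stored bool) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := waitForQuorumRead(ctx, quorumReady); err != nil {
		fmt.Println("linearizable read failed:", err) // cf. "failed to get downgrade enabled status"
		return false
	}
	return stored
}

func main() {
	// Only 1 of 2 members is running, so quorum never forms and the channel
	// is never closed.
	neverReady := make(chan struct{})
	start := time.Now()
	enabled := downgradeEnabled(2*time.Second, neverReady, false)
	fmt.Printf("downgrade enabled: %v (answered after %s)\n", enabled, time.Since(start).Round(time.Second))
}
```

With the second voting member absent, the barrier only ever resolves via the timeout, which matches the "failed to get downgrade enabled status" warning seen in the logs.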
@serathius this looks like a bug we should fix. WDYT?
Right, the suggested fix makes sense; however, I would like to better understand why it was designed like this before we change it.
@YoyinZyc I see you wrote this code. Can you help us understand the context why it was designed like this?
The suggestion is to add the second member as a learner first, and promote it to a voting member later.
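For illustration, here is a sketch of that sequence using the Go clientv3 API, reusing the client and peer URLs from the report below (the same flow is available on the command line via etcdctl member add --learner followed by etcdctl member promote). The promote step assumes etcd_02 has already been started with --initial-cluster-state existing and has caught up with the leader; otherwise etcd rejects the promotion:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://192.168.31.105:12379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Add the second member as a learner. Learners do not count toward quorum,
	// so the single running voting member still has quorum on its own and
	// linearizable reads (including the /downgrade/enabled check) keep working.
	addResp, err := cli.MemberAddAsLearner(ctx, []string{"http://192.168.31.106:12380"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("added learner %x\n", addResp.Member.ID)

	// Start etcd_02 with --initial-cluster-state existing, let it catch up,
	// then promote it to a voting member.
	if _, err := cli.MemberPromote(ctx, addResp.Member.ID); err != nil {
		log.Fatal(err)
	}
	fmt.Println("promoted to voting member")
}
```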
> @YoyinZyc I see you wrote this code. Can you help us understand the context why it was designed like this?
You mean why we designed it to use linearizableReadNotify? It is just like you said: we use it to get the latest downgrade status.
The context timeout issue for the second member sounds like a bug we didn't catch at an early stage.
I'm wondering why we check the /downgrade/enabled endpoint at all. I think the member should still be able to abort after it requests and gets the snapshot/entries from the leader. If the answer is that we wanted to detect this early, then maybe the linearizable read should be reconsidered.
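Purely to illustrate the alternative being floated here (this is not how etcd currently behaves), a serializable variant would skip the quorum read barrier and answer from the member's local state, accepting that the answer may be stale; the type and field names below are hypothetical:

```go
package main

import "fmt"

// localDowngradeInfo stands in for the downgrade state a member already holds
// in its local backend. The type and field names are hypothetical.
type localDowngradeInfo struct {
	Enabled       bool
	TargetVersion string
}

// downgradeEnabledSerializable answers from local state without waiting for a
// quorum-confirmed read index, so it returns immediately even when only one of
// two members is running. The trade-off is that the answer can lag behind the
// latest cluster-wide decision.
func downgradeEnabledSerializable(info localDowngradeInfo) bool {
	return info.Enabled
}

func main() {
	info := localDowngradeInfo{Enabled: false}
	fmt.Println("downgrade enabled (serializable read):", downgradeEnabledSerializable(info))
}
```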
Bug report criteria
What happened?
I started the first etcd node with the option --force-new-cluster, then started the second node to join this new cluster. But it took 7 seconds to set up this cluster.
What did you expect to happen?
I expected the second node to join the cluster quickly.
How can we reproduce it (as minimally and precisely as possible)?
1. Start the first etcd node with the option --force-new-cluster:

```
./etcd --name etcd_01 --data-dir etcd01 --listen-client-urls http://192.168.31.105:12379,http://127.0.0.1:12379 --advertise-client-urls http://192.168.31.105:12379 --initial-advertise-peer-urls http://192.168.31.105:12380 --listen-peer-urls http://192.168.31.105:12380 --initial-cluster-token etcd-cluster --initial-cluster etcd_01=http://192.168.31.105:12380 --force-new-cluster
```

2. Check the cluster member list (succeeds):

```
./etcdctl --endpoints=192.168.31.105:12379 member list
7c0730f7752a5a15, started, etcd_01, http://192.168.31.105:12380, http://192.168.31.105:12379, false
```

3. Add the second node to the cluster:

```
./etcdctl --endpoints=192.168.31.105:12379 member add etcd_02 --peer-urls=http://192.168.31.106:12380
Member 40339f82f8313cbb added to cluster 5fb4c1b191df6a5b

ETCD_NAME="etcd_02"
ETCD_INITIAL_CLUSTER="etcd_02=http://192.168.31.106:12380,etcd_01=http://192.168.31.105:12380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.31.106:12380"
ETCD_INITIAL_CLUSTER_STATE="existing"
```

4. Start the second etcd node:

```
./etcd --name etcd_02 --data-dir etcd02 --listen-client-urls http://192.168.31.106:12379,http://127.0.0.1:12379 --advertise-client-urls http://192.168.31.106:12379 --initial-advertise-peer-urls http://192.168.31.106:12380 --listen-peer-urls http://192.168.31.106:12380 --initial-cluster-token etcd-cluster-07311158 --initial-cluster etcd_01=http://192.168.31.105:12380,etcd_02=http://192.168.31.106:12380 --initial-cluster-state existing
```

5. Checking the cluster member list fails at first:

```
./etcdctl --endpoints=192.168.31.105:12379 member list
{"level":"warn","ts":"2024-07-31T13:31:53.591638+0800","logger":"etcd-client","caller":"v3@v3.5.9/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000362700/192.168.31.105:12379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Error: context deadline exceeded
```

6. After a few seconds, the cluster member list check succeeds:

```
./etcdctl --endpoints=192.168.31.105:12379 member list
40339f82f8313cbb, started, etcd_02, http://192.168.31.106:12380, http://192.168.31.106:12379, false
7c0730f7752a5a15, started, etcd_01, http://192.168.31.105:12380, http://192.168.31.105:12379, false
```
Anything else we need to know?
I tried etcd v3.5.9 and etcd v3.5.15; both gave the same result, but with different log output.
With etcd v3.5.15, the etcd log showed a 7-second gap between 13:32:42 and 13:32:49 after the second node started, and then reported two warnings, "failed to reach the peer URL" and "failed to get downgrade enabled status", at 13:32:49.
With etcd v3.5.9, the etcd log likewise showed a 7-second gap between 10:42:15 and 10:42:22 after the second node started, and then reported two warnings, "failed to convert response" and "failed to get downgrade enabled status", at 10:42:22.
Etcd version (please run commands below)
Etcd configuration (command line flags or environment variables)
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
Relevant log output