etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

ETCD Cluster not showing members #11804

Closed Argha-2890 closed 4 years ago

Argha-2890 commented 4 years ago

Hi,

I am trying to set up an etcd cluster for my Patroni setup and am unable to do so. Each individual node seems to be running fine, but when I fetch the member list, the output contains only the node on which I am running the command.

OS : Ubuntu 18.04 (AWS EC2)

etcd version : 3.2.17 etcdctl API version: 2

etcd config -

Node 1 :

ETCD_INITIAL_CLUSTER="etcd0=http://10.0.0.24:2380,etcd1=http://10.0.0.48:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-01"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.0.0.24:2380"
ETCD_LISTEN_PEER_URLS="http://10.0.0.24:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.0.0.24:2379,http://127.0.0.1:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://10.0.0.24:2379"
ETCD_NAME="argha-etcd1"

Node 2 :

ETCD_INITIAL_CLUSTER="etcd0=http://10.0.0.24:2380,etcd1=http://10.0.0.48:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-01"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.0.0.48:2380"
ETCD_LISTEN_PEER_URLS="http://10.0.0.48:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.0.0.48:2379,http://127.0.0.1:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://10.0.0.48:2379"
ETCD_NAME="argha-etcd2"

Node1 :

ubuntu@ip-10-0-0-24:~$ ETCDCTL_API=3 etcdctl member list
8e9e05c52164694d, started, argha-etcd1, http://localhost:2380, http://10.0.0.24:2379

Node 2:

ubuntu@ip-10-0-0-48:~$ ETCDCTL_API=3 etcdctl member list
8e9e05c52164694d, started, argha-etcd2, http://localhost:2380, http://10.0.0.48:2379

I am planning to include a 3rd node as recommended, but in my testing environment I am trying to do it on a two node cluster.

Appreciate any inputs to resolve this issue.
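For reference, one way to check whether two etcd processes have actually joined the same cluster is to query both client endpoints at once and compare what each reports (the endpoints below are taken from the configs above; adjust to your hosts):

```shell
# Query both client endpoints in one call. If each endpoint reports the same
# member ID (e.g. the default 8e9e05c52164694d with a localhost peer URL) and
# each claims to be the leader, the nodes have each bootstrapped as separate
# single-member clusters instead of one two-member cluster.
ETCDCTL_API=3 etcdctl \
    --endpoints=http://10.0.0.24:2379,http://10.0.0.48:2379 \
    endpoint status -w table
```

This is only a diagnostic sketch; it assumes both endpoints are reachable from where you run it.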

YoyinZyc commented 4 years ago

Can you verify that you actually created a two-node cluster? For example, put a kv pair on one node and then test whether you can get it from the other. As far as I know, the cluster's member list is set at the very beginning, when you start a cluster with the input config. https://github.com/etcd-io/etcd/blob/28c47bb2f8d4a0b60bd41dc5ff61016cff1cfb84/etcdserver/server.go#L327 Before #11198, MemberList did not serve linearizable data: if there is no member update operation, the API serves the member info set at startup, and the raft state machine is not involved.
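The cross-node check suggested above can be sketched like this (client endpoints are taken from the configs earlier in this issue; key and value are arbitrary):

```shell
# On node 1: write a key through its own client endpoint.
ETCDCTL_API=3 etcdctl --endpoints=http://10.0.0.24:2379 put testkey testvalue

# On node 2: if the two processes form one cluster, this returns the same
# key/value pair. An empty result means the writes are not replicating and
# the nodes are running as independent single-member clusters.
ETCDCTL_API=3 etcdctl --endpoints=http://10.0.0.48:2379 get testkey
```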

Argha-2890 commented 4 years ago

@YoyinZyc - I tried adding a kv pair on one node and fetching it from the other, and I am unable to fetch it. Could you please guide me on what is causing this issue? I have followed the steps mentioned in most documents for creating a cluster, but somehow I cannot reproduce the result.

Do I need to use etcdctl member add to extend the existing cluster?

Argha-2890 commented 4 years ago

@YoyinZyc - Just an update: when I run etcd manually, passing the arguments on the command line, everything works as expected. However, when running it as a service with the values set in the /etc/default/etcd file, it behaves differently. Could you please shed some light on this?

etcd Service File :

[Unit]
Description=etcd - highly-available key value store
Documentation=https://github.com/coreos/etcd
Documentation=man:etcd
After=network.target
Wants=network-online.target

[Service]
Environment=DAEMON_ARGS=
Environment=ETCD_NAME=%H
Environment=ETCD_DATA_DIR=/var/lib/etcd/default
EnvironmentFile=-/etc/default/%p
Type=notify
User=etcd
PermissionsStartOnly=true
#ExecStart=/bin/sh -c "GOMAXPROCS=$(nproc) /usr/bin/etcd $DAEMON_ARGS"
ExecStart=/usr/bin/etcd $DAEMON_ARGS
Restart=on-abnormal
#RestartSec=10s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
Alias=etcd2.service

Argha-2890 commented 4 years ago

I got a working cluster by following this guide: https://github.com/etcd-io/etcd/tree/master/contrib/systemd/etcd3-multinode

I just wanted to know how to remove this cluster once I am done with my testing.

YoyinZyc commented 4 years ago

https://github.com/etcd-io/etcd/tree/master/contrib/systemd/etcd3-multinode

I don't think we have a command to delete a cluster. If you don't need it anymore, you can just remove the data dir (*.etcd) on each node.
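A minimal teardown sketch along those lines (the data-dir path is an assumption based on the unit file posted above; use whatever ETCD_DATA_DIR your setup actually points at):

```shell
# On each node: stop the service first so etcd releases the data directory,
# then remove it. A later restart will bootstrap a fresh cluster from the
# ETCD_INITIAL_CLUSTER config instead of reusing the old state.
sudo systemctl stop etcd
sudo rm -rf /var/lib/etcd/default   # ETCD_DATA_DIR from the unit file above
```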

Argha-2890 commented 4 years ago

Got it. Thanks.

dozymoe commented 9 months ago

Hi, I just want to say that the problem is that Debian starts the etcd daemon as soon as the package is installed, so the data directory is already initialized.

The setup will work if you give it a new ETCD_DATA_DIR.
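A hedged sketch of that fix on Debian/Ubuntu (the directory path is an assumption matching the unit file earlier in this thread; moving it aside rather than deleting keeps a backup):

```shell
# The packaged service may have initialized a single-node cluster on install.
# Move the stale data directory aside before applying the cluster config, so
# the restart re-bootstraps using ETCD_INITIAL_CLUSTER from /etc/default/etcd.
sudo systemctl stop etcd
sudo mv /var/lib/etcd/default /var/lib/etcd/default.bak   # keep old state as a backup
sudo systemctl start etcd
```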