Closed lee7ster closed 2 years ago
Hello @lee7ster, I am unable to reproduce the error with the information provided. Could you specify each command you execute and at what point you get the error mentioned in the issue, please?
Thanks @corico44 for the response. I have an existing cluster with a config like:
```yaml
version: "2"
services:
  etcd:
    image: docker.io/bitnami/etcd:latest
    restart: always
    network_mode: host
    container_name: etcd-server
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd1
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://10.213.213.46:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2181
      - ETCD_ADVERTISE_CLIENT_URLS=http://10.213.213.46:2181
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
      - ETCD_INITIAL_CLUSTER=etcd1=http://10.213.213.46:2380,etcd2=http://10.213.212.34:2380,etcd3=http://10.213.213.31:2380
      - ETCD_INITIAL_CLUSTER_STATE=new
      - ETCD_ELECTION_TIMEOUT=10000
```
to create a 3-node setup (works fine)
Then I ran:
```bash
etcdctl --endpoints=10.213.213.46:2181,10.213.213.31:2181,10.213.212.34:2181 member add etcd-new-1 --peer-urls=http://10.213.214.122:2380
```
to add the new server 10.213.214.122 to the cluster, which gives me: `Member 587d96a04d1b8367 added to cluster 324ec26fc918ea81`
Then I try to run etcd on my new node using `docker-compose up -d` with:
```yaml
version: "2"
services:
  etcd:
    image: docker.io/bitnami/etcd:latest
    restart: always
    network_mode: host
    container_name: etcd-server
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd-new-1
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://10.213.214.122:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2181
      - ETCD_ADVERTISE_CLIENT_URLS=http://10.213.214.122:2181
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
      - ETCD_INITIAL_CLUSTER=etcd1=http://10.213.213.46:2380,etcd2=http://10.213.212.34:2380,etcd3=http://10.213.213.31:2380,etcd-new-1=http://10.213.214.122:2380
      - ETCD_INITIAL_CLUSTER_STATE=existing
      - ETCD_ELECTION_TIMEOUT=10000
```
Then I get this output, which keeps repeating:
```
etcd 17:15:15.26 INFO ==> ** Starting etcd setup **
etcd 17:15:15.28 INFO ==> Validating settings in ETCD_* env vars..
etcd 17:15:15.29 WARN ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
etcd 17:15:15.29 INFO ==> Initializing etcd
etcd 17:15:15.29 INFO ==> Generating etcd config file using env variables
etcd 17:15:15.31 INFO ==> There is no data from previous deployments
etcd 17:15:15.31 INFO ==> Adding new member to existing cluster
/opt/bitnami/scripts/libetcd.sh: line 427: ETCD_ACTIVE_ENDPOINTS: unbound variable
Error: etcdclient: no available endpoints
etcd 17:15:15.34 WARN ==> Failed to add self to cluster, keeping trying...
Error: etcdclient: no available endpoints
etcd 17:15:25.35 WARN ==> Failed to add self to cluster, keeping trying...
Error: etcdclient: no available endpoints
etcd 17:15:35.37 WARN ==> Failed to add self to cluster, keeping trying...
Error: etcdclient: no available endpoints
etcd 17:15:45.39 WARN ==> Failed to add self to cluster, keeping trying...
Error: etcdclient: no available endpoints
etcd 17:15:55.41 WARN ==> Failed to add self to cluster, keeping trying...
Error: etcdclient: no available endpoints
etcd 17:16:05.43 WARN ==> Failed to add self to cluster, keeping trying...
Error: etcdclient: no available endpoints
etcd 17:16:15.44 WARN ==> Failed to add self to cluster, keeping trying...
Error: etcdclient: no available endpoints
```
So I went into the Docker container and saw that the variable is empty:

```bash
I have no name!@ip-10-213-213-42:/opt/bitnami/scripts$ echo $ETCD_ACTIVE_ENDPOINTS

```
In `setup_etcd_active_endpoints` I saw:

```bash
is_boolean_yes "$ETCD_ON_K8S" && read -r -a endpoints_array <<<"$(tr ',;' ' ' <<<"$(etcdctl_get_endpoints)")"
```

Will this be a problem if we are not hosting on K8s? We are just running the Docker image on EC2. @corico44
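For context, here is a minimal sketch (not the actual Bitnami script; the variable names mirror libetcd.sh but the logic is simplified) of why that guard matters: the Bitnami scripts run with Bash's `set -u`, so once the k8s-only branch is skipped and `ETCD_ACTIVE_ENDPOINTS` is never assigned, any later unguarded expansion of it aborts with exactly the "unbound variable" error in the log above.

```shell
set -u  # mirrors the nounset behavior implied by the "unbound variable" error

ETCD_ON_K8S="no"   # running on EC2/Docker, not Kubernetes
endpoints_array=()

# The array is only populated on Kubernetes, mirroring the guard in libetcd.sh
[[ "$ETCD_ON_K8S" == "yes" ]] && read -r -a endpoints_array <<<"k8s-only-endpoints"

if [[ ${#endpoints_array[@]} -eq 0 ]]; then
  # ETCD_ACTIVE_ENDPOINTS was never assigned, so an unguarded "$ETCD_ACTIVE_ENDPOINTS"
  # here would abort with "ETCD_ACTIVE_ENDPOINTS: unbound variable" under set -u.
  # A guarded expansion shows it is indeed unset:
  status="ETCD_ACTIVE_ENDPOINTS is ${ETCD_ACTIVE_ENDPOINTS:-unset} outside k8s"
  echo "$status"
fi
```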
I am pretty sure it is due to the issue above, because I recreated the same setup manually with `docker run` and was able to add the node to the cluster. Here is the exact same configuration as a command line:
```bash
docker run --detach \
  -p 2181:2181 \
  -p 2380:2380 \
  --mount type=bind,source=/data/etcd/data.tmp,destination=/data/etcd \
  --name etcd-server \
  gcr.io/etcd-development/etcd:v3.5.4 \
  /usr/local/bin/etcd \
  --name etcd_new_2c_1 \
  --data-dir /data/etcd \
  --election-timeout 10000 \
  --listen-client-urls http://0.0.0.0:2181 \
  --advertise-client-urls http://10.213.213.42:2181 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --initial-advertise-peer-urls http://10.213.213.42:2380 \
  --initial-cluster etcd1=http://10.213.213.46:2380,etcd2=http://10.213.212.34:2380,etcd3=http://10.213.213.31:2380,etcd_new_2c_1=http://10.213.213.42:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster-state existing \
  --log-level debug \
  --logger zap \
  --log-outputs stderr
```
It works:

```bash
etcdctl --endpoints=10.213.213.46:2181,10.213.213.31:2181,10.213.212.34:2181 member list
```

```
55d2343ff33da75c, started, etcd_new_2c_1, http://10.213.213.42:2380, http://10.213.213.42:2181, false
b9271412ba95f718, started, etcd1, http://10.213.213.46:2380, http://10.213.213.46:2181, false
e58885b82761ff6a, started, etcd3, http://10.213.213.31:2380, http://10.213.213.31:2181, false
eba3a758bf584f07, started, etcd2, http://10.213.212.34:2380, http://10.213.212.34:2181, false
```
@lee7ster The image should work in both environments (K8s and Docker), but some clustering features may not work in Docker since we have focused more on K8s support. You are welcome to contribute any improvements so that it works correctly in both environments. In the meantime, I recommend trying it on K8s to see if it works in your case.
@lee7ster try using this image: quay.io/coreos/etcd. At least for me, the Bitnami image was not working with docker-compose.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Hello. Apologies, my reply was originally written in Chinese. I made a simple change on the new node: manually specifying `ETCD_ACTIVE_ENDPOINTS` makes it work normally:
```yaml
version: "2"
services:
  etcd-new:
    image: docker.io/bitnami/etcd:latest
    restart: always
    network_mode: host
    container_name: etcd-server
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd-new-1
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://10.213.214.122:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2181
      - ETCD_ADVERTISE_CLIENT_URLS=http://10.213.214.122:2181
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
      - ETCD_INITIAL_CLUSTER=etcd1=http://10.213.213.46:2380,etcd2=http://10.213.212.34:2380,etcd3=http://10.213.213.31:2380,etcd-new-1=http://10.213.214.122:2380
      - ETCD_ACTIVE_ENDPOINTS=10.213.213.46:2380,10.213.212.34:2380,10.213.213.31:2380,10.213.214.122:2380 # manually specify the running nodes
      - ETCD_INITIAL_CLUSTER_STATE=existing
      - ETCD_ELECTION_TIMEOUT=10000
```
There is also no need to run the member-add command on an existing node:
```bash
member add etcd-new-1 --peer-urls=http://10.213.214.122:2380
```
As mentioned earlier, in a non-K8s environment no valid endpoints are ever collected into `ETCD_ACTIVE_ENDPOINTS`. See libetcd.sh#L394:
```bash
########################
# Setup the ETCD_ACTIVE_ENDPOINTS environment variable; returns the number
# of active endpoints, the cluster size (including inactive members) and
# ETCD_ACTIVE_ENDPOINTS (which is also exported)
# Globals:
#   ETCD_*
# Arguments:
#   None
# Returns:
#   List of numbers (active_endpoints, cluster_size, ETCD_ACTIVE_ENDPOINTS)
########################
setup_etcd_active_endpoints() {
    local active_endpoints=0
    local -a extra_flags active_endpoints_array
    local -a endpoints_array=()
    local host port
    is_boolean_yes "$ETCD_ON_K8S" && read -r -a endpoints_array <<<"$(tr ',;' ' ' <<<"$(etcdctl_get_endpoints)")"
    local -r cluster_size=${#endpoints_array[@]}
    read -r -a advertised_array <<<"$(tr ',;' ' ' <<<"$ETCD_ADVERTISE_CLIENT_URLS")"
    host="$(parse_uri "${advertised_array[0]}" "host")"
    port="$(parse_uri "${advertised_array[0]}" "port")"
    if [[ $cluster_size -gt 0 ]]; then
        for e in "${endpoints_array[@]}"; do
            read -r -a extra_flags <<<"$(etcdctl_auth_flags)"
            extra_flags+=("--endpoints=$e")
            if [[ "$e" != "$host:$port" ]] && etcdctl endpoint health "${extra_flags[@]}" >/dev/null 2>&1; then
                debug "$e endpoint is active"
                ((active_endpoints++))
                active_endpoints_array+=("$e")
            fi
        done
        ETCD_ACTIVE_ENDPOINTS=$(echo "${active_endpoints_array[*]}" | tr ' ' ',')
        export ETCD_ACTIVE_ENDPOINTS
    fi
    echo "${active_endpoints} ${cluster_size} ${ETCD_ACTIVE_ENDPOINTS}"
}
```
I found some useful code; a simple modification lets it set `ETCD_ACTIVE_ENDPOINTS` automatically: [libetcd.sh#L489](https://github.com/bitnami/containers/blob/cd32bcc4b33bb25547fa78842695a6597bed10ad/bitnami/etcd/3.5/debian-11/rootfs/opt/bitnami/scripts/libetcd.sh#L489)
```bash
########################
# Prints initial cluster nodes
# Globals:
# ETCD_*
# Arguments:
# None
# Returns:
# String
########################
get_initial_cluster() {
local -a endpoints_array=()
local scheme port initial_members
read -r -a endpoints_array <<<"$(tr ',;' ' ' <<<"$ETCD_INITIAL_CLUSTER")"
if [[ ${#endpoints_array[@]} -gt 0 ]] && ! grep -sqE "://" <<<"$ETCD_INITIAL_CLUSTER"; then
# This piece of code assumes this container is used on a VM environment
# where ETCD_INITIAL_CLUSTER contains a comma-separated list of hostnames,
# and recreates it as follows:
# SCHEME://NODE_NAME:PEER_PORT
scheme="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" "scheme")"
port="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" "port")"
for nodePeer in "${endpoints_array[@]}"; do
initial_members+=("${nodePeer}=${scheme}://${nodePeer}:$port")
done
echo "${initial_members[*]}" | tr ' ' ','
else
# Nothing to do
echo "$ETCD_INITIAL_CLUSTER"
fi
}
```
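To see what that expansion does outside the container, here is a standalone sketch; `parse_uri` below is a minimal hypothetical stand-in for the Bitnami helper of the same name, and the hostnames are made up:

```shell
# Sketch of the expansion get_initial_cluster performs when
# ETCD_INITIAL_CLUSTER is a bare comma-separated hostname list.
parse_uri() {
  # parse_uri URI scheme|port -- simplified stub, not the Bitnami helper
  case "$2" in
    scheme) echo "${1%%://*}" ;;
    port)   echo "${1##*:}" ;;
  esac
}

ETCD_INITIAL_CLUSTER="node-a,node-b,node-c"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://node-a:2380"

read -r -a endpoints_array <<<"$(tr ',;' ' ' <<<"$ETCD_INITIAL_CLUSTER")"
scheme="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" scheme)"
port="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" port)"

initial_members=()
for nodePeer in "${endpoints_array[@]}"; do
  # Rebuild each entry as NAME=SCHEME://NAME:PEER_PORT
  initial_members+=("${nodePeer}=${scheme}://${nodePeer}:${port}")
done
result="$(echo "${initial_members[*]}" | tr ' ' ',')"
echo "$result"
# node-a=http://node-a:2380,node-b=http://node-b:2380,node-c=http://node-c:2380
```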
Here is the modified version of that code:
```bash
########################
get_ACTIVE_ENDPOINTS() {
    local -a endpoints_array=()
    local scheme port active_endpoints
    read -r -a endpoints_array <<<"$(tr ',;' ' ' <<<"$ETCD_INITIAL_CLUSTER")"
    if [[ ${#endpoints_array[@]} -gt 0 ]]; then
        if ! grep -sqE "://" <<<"$ETCD_INITIAL_CLUSTER"; then
            # This piece of code assumes this container is used on a VM environment
            # where ETCD_INITIAL_CLUSTER contains a comma-separated list of hostnames,
            # and recreates it as follows:
            #   SCHEME://NODE_NAME:PEER_PORT
            scheme="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" "scheme")"
            port="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" "port")"
            for nodePeer in "${endpoints_array[@]}"; do
                active_endpoints+=("${nodePeer}:$port,")
            done
        else
            for nodePeer in "${endpoints_array[@]}"; do
                active_endpoints+=("${nodePeer##*://}")
            done
        fi
        echo "${active_endpoints[*]}" | tr ' ' ','
    else
        # Nothing to do
        echo "$ETCD_INITIAL_CLUSTER"
    fi
}
```
Here is the result of running it:
```bash
I have no name!@ed:/opt/bitnami/etcd$ get_ACTIVE_ENDPOINTS() {
    local -a endpoints_array=()
    local scheme port active_endpoints
    read -r -a endpoints_array <<<"$(tr ',;' ' ' <<<"$ETCD_INITIAL_CLUSTER")"
    if [[ ${#endpoints_array[@]} -gt 0 ]]; then
        if ! grep -sqE "://" <<<"$ETCD_INITIAL_CLUSTER"; then
            scheme="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" "scheme")"
            port="$(parse_uri "$ETCD_INITIAL_ADVERTISE_PEER_URLS" "port")"
            for nodePeer in "${endpoints_array[@]}"; do
                active_endpoints+=("${nodePeer}:$port,")
            done
        else
            for nodePeer in "${endpoints_array[@]}"; do
                active_endpoints+=("${nodePeer##*://}")
            done
        fi
        echo "${active_endpoints[*]}" | tr ' ' ','
    else
        # Nothing to do
        echo "$ETCD_INITIAL_CLUSTER"
    fi
}
I have no name!@ed:/opt/bitnami/etcd$ ETCD_INITIAL_CLUSTER=etcd1=http://10.213.213.46:2380,etcd2=http://10.213.212.34:2380,etcd3=http://10.213.213.31:2380,etcd-new-1=http://10.213.214.122:2380
I have no name!@ed:/opt/bitnami/etcd$ get_ACTIVE_ENDPOINTS
10.213.213.46:2380,10.213.212.34:2380,10.213.213.31:2380,10.213.214.122:2380
```
You just need to assign this value to `ETCD_ACTIVE_ENDPOINTS` and it works.
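If you would rather not hand-write the list, a small pipeline can derive the same value from `ETCD_INITIAL_CLUSTER` (a sketch that assumes every entry follows the usual `name=scheme://host:port` layout):

```shell
# Derive ETCD_ACTIVE_ENDPOINTS from ETCD_INITIAL_CLUSTER by stripping
# the member names and URL schemes from each comma-separated entry.
ETCD_INITIAL_CLUSTER="etcd1=http://10.213.213.46:2380,etcd2=http://10.213.212.34:2380,etcd3=http://10.213.213.31:2380,etcd-new-1=http://10.213.214.122:2380"

ETCD_ACTIVE_ENDPOINTS=$(tr ',' '\n' <<<"$ETCD_INITIAL_CLUSTER" \
  | sed -E 's#^[^=]+=[a-z]+://##' \
  | paste -sd ',' -)

echo "$ETCD_ACTIVE_ENDPOINTS"
# 10.213.213.46:2380,10.213.212.34:2380,10.213.213.31:2380,10.213.214.122:2380
```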
Hi,
Could you please change the language to English so we can properly understand what you mention?
I recently encountered the same problem @lee7ster mentioned, and the solution @zctmdc provided works. Thank you both so much. As @zctmdc said, simply adding an `ETCD_ACTIVE_ENDPOINTS` entry to the docker-compose.yml of the new node, set to the existing etcd node endpoints, fixes this problem.
Also, we don't need to explicitly register the new member via etcdctl (e.g. `etcdctl member add etcd-new-1 --peer-urls=http://10.213.214.122:2380`).
Name and Version
bitnami/etcd:latest
What steps will reproduce the bug?
1) Create a new cluster (3-node setup) <-- works fine
2) Add a new member using etcdctl (works OK)
3) Add a new node with the exact same configs and the "existing" cluster state; it hangs and keeps retrying to add itself to the cluster
For this setup it's just simple Bitnami Docker images running on EC2.
What is the expected behavior?
The new node launches successfully so peers can initialize it and add it to the cluster.
What do you see instead?
Adding the member via etcdctl works fine:

```bash
etcdctl --endpoints=10.213.213.46:2181,10.213.213.31:2181,10.213.212.34:2181 member add etcd-new-1 --peer-urls=http://10.213.214.122:2380
```

`Member 587d96a04d1b8367 added to cluster 324ec26fc918ea81`, and I see existing peers polling for the new node.
The setup hangs on retrying to add itself to the cluster.
I tried downgrading the image to match the SHA of peer nodes and I get a different error message instead after that:
Additional information
The endpoint command works fine:
The docker-compose YAML on the working existing nodes:
The docker-compose YAML on the new node I want to add:
When I change the above state to `new` and change the initial cluster to standalone, everything works fine.
Some Docker info: