etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

Secure etcd2 cluster of 3 machines unable to elect a leader #5649

Closed parsa-ionos closed 8 years ago

parsa-ionos commented 8 years ago

-- Logs begin at Mon 2016-06-13 10:10:26 UTC, end at Mon 2016-06-13 10:10:59 UTC. --

Jun 13 10:10:43 adx-etcd2-3 systemd[1]: Starting etcd2...
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=https://52.221.233.97:2379
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_CERT_FILE=/etc/ssl/etcd/key.crt
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_CLIENT_CERT_AUTH=true
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd2
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_DISCOVERY=https://discovery.etcd.io/13bc6a9330344
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_ELECTION_TIMEOUT=1200
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=https://172.31.22.26:
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_KEY_FILE=/etc/ssl/etcd/key.key
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=https://172.31.22.26:2380
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_NAME=55963006ec754096b7dbb52f1ac7ecbd
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_PEER_CA_FILE=/etc/ssl/etcd/ca.crt
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_PEER_CERT_FILE=/etc/ssl/etcd/key.crt
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_PEER_CLIENT_CERT_AUTH=true
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_PEER_KEY_FILE=/etc/ssl/etcd/key.key
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ca.crt
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: recognized and used environment variable ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ca.crt
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: etcd Version: 2.3.1
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: Git SHA: 2b67f52
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: Go Version: go1.5.3
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: Go OS/Arch: linux/amd64
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: setting maximum number of CPUs to 2, total number of available CPUs is 2
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: peerTLS: cert = /etc/ssl/etcd/key.crt, key = /etc/ssl/etcd/key.key, ca = /etc/ssl/etcd/ca.crt, 
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: listening for peers on https://172.31.22.26:2380
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: clientTLS: cert = /etc/ssl/etcd/key.crt, key = /etc/ssl/etcd/key.key, ca = , trusted-ca = /etc/
Jun 13 10:10:45 adx-etcd2-3 etcd2[831]: listening for client requests on https://0.0.0.0:2379
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: found peer 12d78c31422b43d8 in the cluster
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: found peer b38a4b9dd11c7d7d in the cluster
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: found self 5f3b1249689ee24f in the cluster
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: found 3 needed peer(s)
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: name = 55963006ec754096b7dbb52f1ac7ecbd
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: data dir = /var/lib/etcd2
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: member dir = /var/lib/etcd2/member
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: heartbeat = 100ms
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: election = 1200ms
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: snapshot count = 10000
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: discovery URL= https://discovery.etcd.io/13bc6a9330344054c2af8b0ac1ffdf23
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: advertise client URLs = https://52.221.233.97:2379
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: initial advertise peer URLs = https://172.31.22.26:2380
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: initial cluster = 55963006ec754096b7dbb52f1ac7ecbd=https://172.31.22.26:2380
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: starting member 5f3b1249689ee24f in cluster 249ba6d847c48caf
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became follower at term 0
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: newRaft 5f3b1249689ee24f [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became follower at term 1
Jun 13 10:10:46 adx-etcd2-3 etcd2[831]: starting server... [version: 2.3.1, cluster version: to_be_decided]
Jun 13 10:10:46 adx-etcd2-3 systemd[1]: Started etcd2.
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: added member 12d78c31422b43d8 [https://172.31.16.74:2380] to cluster 249ba6d847c48caf
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: added local member 5f3b1249689ee24f [https://172.31.22.26:2380] to cluster 249ba6d847c48caf
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: added member b38a4b9dd11c7d7d [https://172.31.25.241:2380] to cluster 249ba6d847c48caf
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 1
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 2
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 2
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 2
Jun 13 10:10:47 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 2
Jun 13 10:10:48 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 2
Jun 13 10:10:48 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 3
Jun 13 10:10:48 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 3
Jun 13 10:10:48 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 3
Jun 13 10:10:48 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 3
Jun 13 10:10:50 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 3
Jun 13 10:10:50 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 4
Jun 13 10:10:50 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 4
Jun 13 10:10:50 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 4
Jun 13 10:10:50 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 4
Jun 13 10:10:52 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 4
Jun 13 10:10:52 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 5
Jun 13 10:10:52 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 5
Jun 13 10:10:52 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 5
Jun 13 10:10:52 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 5
`Jun 13 10:10:54 adx-etcd2-3 etcd2[831]: publish error: etcdserver: request timed out`
Jun 13 10:10:54 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 5
Jun 13 10:10:54 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 6
Jun 13 10:10:54 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 6
Jun 13 10:10:54 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 6
Jun 13 10:10:54 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 6
Jun 13 10:10:56 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 6
Jun 13 10:10:56 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 7
Jun 13 10:10:56 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 7
Jun 13 10:10:58 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 8
Jun 13 10:10:58 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 8
Jun 13 10:10:58 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 8
Jun 13 10:10:58 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 8
Jun 13 10:10:59 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 8
Jun 13 10:10:59 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 9
Jun 13 10:10:59 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 9
Jun 13 10:10:59 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 9
Jun 13 10:10:59 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 9
Jun 13 10:11:01 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 9
Jun 13 10:11:01 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 10
Jun 13 10:11:01 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 10
Jun 13 10:11:01 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 10
Jun 13 10:11:01 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 10
`Jun 13 10:11:01 adx-etcd2-3 etcd2[831]: publish error: etcdserver: request timed out`
Jun 13 10:11:02 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 10
Jun 13 10:11:02 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 11
Jun 13 10:11:02 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 11
Jun 13 10:11:02 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 11
Jun 13 10:11:02 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 11
Jun 13 10:11:04 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 11
Jun 13 10:11:04 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 12
Jun 13 10:11:04 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 12
Jun 13 10:11:04 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 12
Jun 13 10:11:04 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 12
Jun 13 10:11:05 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 12
Jun 13 10:11:05 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f became candidate at term 13
Jun 13 10:11:05 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f received vote from 5f3b1249689ee24f at term 13
Jun 13 10:11:05 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to b38a4b9dd11c7d7d at term 13
Jun 13 10:11:05 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f [logterm: 1, index: 3] sent vote request to 12d78c31422b43d8 at term 13
Jun 13 10:11:07 adx-etcd2-3 etcd2[831]: 5f3b1249689ee24f is starting a new election at term 13

Cloud config file

coreos:
  etcd2:
    # Get a token from https://discovery.etcd.io/new?size=3
    # Multi-region and multi-cloud deployments need to use $public_ipv4.
    # Since we are in the same DC/AZ, we use private_ipv4 only; saves cost too.
    # Only supported on Amazon EC2, Google Compute Engine, OpenStack, 
    # Rackspace, DigitalOcean, and Vagrant.
    discovery: https://discovery.etcd.io/7b832759416daf6c8e75480d16a13ae2
    advertise-client-urls: https://$public_ipv4:2379
    initial-advertise-peer-urls: https://$private_ipv4:2380
    listen-client-urls: https://0.0.0.0:2379
    listen-peer-urls: https://$private_ipv4:2380
    cert-file: /etc/ssl/etcd/key.crt
    key-file: /etc/ssl/etcd/key.key
    client-cert-auth: true
    trusted-ca-file: /etc/ssl/etcd/ca.crt
    peer-cert-file: /etc/ssl/etcd/key.crt
    peer-key-file: /etc/ssl/etcd/key.key
    peer-ca-file: /etc/ssl/etcd/ca.crt
    peer-client-cert-auth: true
    peer-trusted-ca-file: /etc/ssl/etcd/ca.crt

$ etcd2 --version
etcd Version: 2.3.1
Git SHA: 2b67f52
Go Version: go1.5.3
Go OS/Arch: linux/amd64

Without certificates, the cluster forms and leader election proceeds smoothly.

xiang90 commented 8 years ago

@parsa-ionos

What is the machine configuration? Maybe the machine is slow and the TLS handshake takes longer than the election timeout to finish. Can you try setting --election-timeout to something longer than 1200ms, like 5000ms?
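For anyone trying this suggestion, here is a minimal sketch of raising the timeout through the same environment variables the startup log reports as "recognized and used". The values are plain integers in milliseconds. The 5x ratio check reflects my understanding of etcd's startup validation and should be treated as an assumption; `timeouts_look_sane` is a hypothetical helper, not part of etcd.

```shell
# Both values are plain integers in milliseconds (no "ms" suffix).
export ETCD_HEARTBEAT_INTERVAL=100    # matches "heartbeat = 100ms" in the log above
export ETCD_ELECTION_TIMEOUT=5000     # raised from 1200, as suggested

# Hypothetical helper. Assumption: etcd expects the election timeout to be
# at least 5 times the heartbeat interval.
timeouts_look_sane() {
    [ "$ETCD_ELECTION_TIMEOUT" -ge $((ETCD_HEARTBEAT_INTERVAL * 5)) ]
}

if timeouts_look_sane; then
    echo "election timeout looks sane"
else
    echo "election timeout is too short for this heartbeat interval"
fi
```

In a cloud-config like the one above, the equivalent would presumably be an `election-timeout: 5000` key under `etcd2:`.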

xiang90 commented 8 years ago

Also please make sure the keys/certs and CAs are good.
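One way to run this kind of check is sketched below with plain openssl, assuming the file layout from the cloud-config above (`ca.crt`, `key.crt`, `key.key` in one directory). The function names are hypothetical helpers made up for this example.

```shell
# Succeeds if the certificate chains to the CA.
# $1 = CA file, $2 = certificate file
cert_chains_to_ca() {
    openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

# Succeeds if the private key and the certificate carry the same public key.
# $1 = certificate file, $2 = private key file
key_matches_cert() {
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
    key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 1
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Runs both checks against a directory laid out like /etc/ssl/etcd.
check_etcd_tls_files() {
    dir=${1:-/etc/ssl/etcd}
    cert_chains_to_ca "$dir/ca.crt" "$dir/key.crt" || { echo "cert does not chain to CA"; return 1; }
    key_matches_cert "$dir/key.crt" "$dir/key.key" || { echo "key does not match cert"; return 1; }
    echo "cert, key and CA are consistent"
}
```

Calling `check_etcd_tls_files /etc/ssl/etcd` on each member covers only the chain and the key pairing. It does not look at Subject Alternative Names or key usage, so a certificate can pass here and still be rejected by a peer.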

parsa-ionos commented 8 years ago

@xiang90

I tried election-timeout values of both 10000ms and 5000ms.

It still gives the same logs:

Jun 14 06:36:43 adx-etcd2-3 etcd2[827]: publish error: etcdserver: request timed out
Jun 14 06:36:48 adx-etcd2-3 etcd2[827]: the connection to peer 65670fcb3670a0e7 is unhealthy
Jun 14 06:36:48 adx-etcd2-3 etcd2[827]: the connection to peer d22ba70594a3bc85 is unhealthy
Jun 14 06:36:53 adx-etcd2-3 etcd2[827]: 89881d2772a8418c is starting a new election at term 4
Jun 14 06:36:53 adx-etcd2-3 etcd2[827]: 89881d2772a8418c became candidate at term 5
Jun 14 06:36:53 adx-etcd2-3 etcd2[827]: 89881d2772a8418c received vote from 89881d2772a8418c at term 5
Jun 14 06:36:53 adx-etcd2-3 etcd2[827]: 89881d2772a8418c [logterm: 1, index: 3] sent vote request to 65670fcb3670a0e7 at term 5
Jun 14 06:36:53 adx-etcd2-3 etcd2[827]: 89881d2772a8418c [logterm: 1, index: 3] sent vote request to d22ba70594a3bc85 at term 5
Jun 14 06:37:05 adx-etcd2-3 etcd2[827]: 89881d2772a8418c is starting a new election at term 5
Jun 14 06:37:05 adx-etcd2-3 etcd2[827]: 89881d2772a8418c became candidate at term 6
Jun 14 06:37:05 adx-etcd2-3 etcd2[827]: 89881d2772a8418c received vote from 89881d2772a8418c at term 6
Jun 14 06:37:05 adx-etcd2-3 etcd2[827]: 89881d2772a8418c [logterm: 1, index: 3] sent vote request to d22ba70594a3bc85 at term 6
Jun 14 06:37:05 adx-etcd2-3 etcd2[827]: 89881d2772a8418c [logterm: 1, index: 3] sent vote request to 65670fcb3670a0e7 at term 6
Jun 14 06:37:08 adx-etcd2-3 etcd2[827]: publish error: etcdserver: request timed out
Jun 14 06:37:16 adx-etcd2-3 etcd2[827]: 89881d2772a8418c is starting a new election at term 6
Jun 14 06:37:16 adx-etcd2-3 etcd2[827]: 89881d2772a8418c became candidate at term 7
Jun 14 06:37:16 adx-etcd2-3 etcd2[827]: 89881d2772a8418c received vote from 89881d2772a8418c at term 7
Jun 14 06:37:16 adx-etcd2-3 etcd2[827]: 89881d2772a8418c [logterm: 1, index: 3] sent vote request to 65670fcb3670a0e7 at term 7
Jun 14 06:37:16 adx-etcd2-3 etcd2[827]: 89881d2772a8418c [logterm: 1, index: 3] sent vote request to d22ba70594a3bc85 at term 7
Jun 14 06:37:18 adx-etcd2-3 etcd2[827]: the connection to peer 65670fcb3670a0e7 is unhealthy
Jun 14 06:37:18 adx-etcd2-3 etcd2[827]: the connection to peer d22ba70594a3bc85 is unhealthy
Jun 14 06:37:27 adx-etcd2-3 etcd2[827]: 89881d2772a8418c is starting a new election at term 7
Jun 14 06:37:27 adx-etcd2-3 etcd2[827]: 89881d2772a8418c became candidate at term 8
Jun 14 06:37:27 adx-etcd2-3 etcd2[827]: 89881d2772a8418c received vote from 89881d2772a8418c at term 8

parsa-ionos commented 8 years ago

@xiang90 Machine: AWS EC2 t2.medium, CoreOS-stable-1010.5.0-hvm. I have cross-checked the certificates; they are correct.

xiang90 commented 8 years ago

Jun 14 06:36:48 adx-etcd2-3 etcd2[827]: the connection to peer 65670fcb3670a0e7 is unhealthy
Jun 14 06:36:48 adx-etcd2-3 etcd2[827]: the connection to peer d22ba70594a3bc85 is unhealthy

From the log, the connections between peers are not healthy. I think you need to double-check the connection (firewall rules? certs/keys/CAs?). It does not seem like a bug on the etcd side.
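A rough sketch of such a check from one member to another, using `openssl s_client` against the peer port: it presents the member's own certificate (the peers require client certs) and asks openssl to verify the remote certificate against the CA and against the IP being dialed. `probe_peer_tls` is a hypothetical helper, the directory layout is the one from the cloud-config above, and the host argument is assumed to be an IP address.

```shell
# probe_peer_tls IP PORT [DIR]
# Succeeds only if a TCP connection can be opened AND the TLS handshake
# verifies the remote certificate for that IP.
probe_peer_tls() {
    host=$1
    port=$2
    dir=${3:-/etc/ssl/etcd}
    openssl s_client -connect "$host:$port" \
        -CAfile "$dir/ca.crt" \
        -cert "$dir/key.crt" -key "$dir/key.key" \
        -verify_ip "$host" -verify_return_error \
        </dev/null >/dev/null 2>&1
}
```

For example, `probe_peer_tls 172.31.16.74 2380 || echo "peer TLS check failed"` run from another member. Dropping the output redirection shows whether the failure is a refused or filtered connection (firewall) or a verification error (certificates). If packets are silently dropped the command can hang until the TCP timeout, so wrapping it in `timeout` may help where that tool is available.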

parsa-ionos commented 8 years ago

The certificates are good; I created them using easy-rsa version 2.2. I have not modified the firewall rules.

xiang90 commented 8 years ago

@parsa-ionos To be honest, I have no idea where the problem is. I have no problem setting up a cluster with TLS in a similar environment. Can you provide a step-by-step description of your cluster setup, from generating the certs to the machine configuration, the etcd configuration, and the commands you use to set up etcd? Then we can reliably reproduce your issue on AWS.

parsa-ionos commented 8 years ago

Sample cloud config file with certs and keys:

1-etcd2.txt 2-etcd2.txt

Command used for setup:

aws ec2 run-instances --image-id ami-d704d5b4 --region ap-southeast-1 \
    --instance-type t2.medium --security-group-ids sg-abc12345 \
    --key-name abc-dev-key \
    --instance-initiated-shutdown-behavior stop \
    --user-data fileb://<(echo "cat 1-etcd2.yaml"|gzip -c)

xiang90 commented 8 years ago

What are the rules in --security-group-ids sg-abc12345?

parsa-ionos commented 8 years ago

44435 tcp 0.0.0.0/0 ✔
8083 tcp sg-abc12345 ✔
5050 tcp 0.0.0.0/0 ✔
8472 udp sg-abc12345 ✔
44434 tcp sg-abc12345 ✔
1338 tcp 0.0.0.0/0 ✔
2380 tcp sg-abc12345 ✔
8084 tcp sg-abc12345 ✔
2888 tcp sg-abc12345 ✔
5051 tcp 0.0.0.0/0 ✔
22 tcp 0.0.0.0/0 ✔
1510 udp sg-abc12345 ✔
8070 tcp 0.0.0.0/0 ✔
2379 tcp sg-abc12345 ✔
3888 tcp sg-abc12345 ✔
44433 tcp sg-abc12345 ✔
2181 tcp sg-abc12345 ✔
0-65535 udp sg-abc12345 ✔
8081 tcp 0.0.0.0/0 ✔

xiang90 commented 8 years ago

@parsa-ionos Thanks. We will try to reproduce and get back to you later.

gyuho commented 8 years ago

@parsa-ionos Can you provide the logs from the other nodes?

parsa-ionos commented 8 years ago

@xiang90 Below are the logs for all 3 nodes: etcd2-1.txt, etcd2-2.txt, etcd2-3.txt

gyuho commented 8 years ago

@parsa-ionos Thanks for the detailed information. Unfortunately, this seems like an issue with your certs.

I tried exactly the same settings with the provided certs and still see the same issue (I even tried the etcd master branch). If you want to verify, here's how I reproduced it locally:

sudo mkdir -p /etc/ssl/etcd/

echo "-----BEGIN CERTIFICATE-----
MIIFYDCCBEigAwIBAgIBAjANBgkqhkiG9w0BAQsFADCBtjELMAkGA1UEBhMCVVMx
CzAJBgNVBAgTAkNBMRUwEwYDVQQHEwxTYW5GcmFuY2lzY28xFTATBgNVBAoTDEZv
cnQtRnVuc3RvbjEdMBsGA1UECxMUTXlPcmdhbml6YXRpb25hbFVuaXQxGDAWBgNV
BAMTD0ZvcnQtRnVuc3RvbiBDQTEQMA4GA1UEKRMHRWFzeVJTQTEhMB8GCSqGSIb3
DQEJARYSbWVAbXlob3N0Lm15ZG9tYWluMB4XDTE2MDYwODA5MjIxMFoXDTI2MDYw
NjA5MjIxMFowgawxCzAJBgNVBAYTAlVTMQswCQYDVQQIEwJDQTEVMBMGA1UEBxMM
U2FuRnJhbmNpc2NvMRUwEwYDVQQKEwxGb3J0LUZ1bnN0b24xHTAbBgNVBAsTFE15
T3JnYW5pemF0aW9uYWxVbml0MQ4wDAYDVQQDEwVldGNkMTEQMA4GA1UEKRMHRWFz
eVJTQTEhMB8GCSqGSIb3DQEJARYSbWVAbXlob3N0Lm15ZG9tYWluMIIBIjANBgkq
hkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEApS2tXPTedO+TRvg5ijcAagN+91swcpnF
8VVmZWQfidTVOwnoPTqH9v1ZrKDIArgPTeWCkj8uJnwtUb5L68GZDUdiZ8Y7mbRi
tQN8/FVleHWU1pTbxQD3hKF3ujAC/UGMTn1VX3VCX0Xn/FsLZlAPU3Wzt+wYmW92
+4oHA08KTlq3yEbfqxXX+KDZGlA3nOYeVmPOs/sc8gBhBW2MQjr3kvSdMDvVQJ7I
qFKGiMt0vAnII+AzfxRSgCsd7A1OlRxQuizO69qRC7W+QGvg8yNG3U8bd4Q2e8d6
hl0yJZ0PPciptVlYcdLG9THbm9pQae5k1t2KvTJxe5xL8jwtIrLDtQIDAQABo4IB
fzCCAXswCQYDVR0TBAIwADAtBglghkgBhvhCAQ0EIBYeRWFzeS1SU0EgR2VuZXJh
dGVkIENlcnRpZmljYXRlMB0GA1UdDgQWBBQ03Bfj6x26SCg/WoMvKPOnbEuhWDCB
6wYDVR0jBIHjMIHggBTD5AOI+eL7nCRoUCjfqoxeu05EpqGBvKSBuTCBtjELMAkG
A1UEBhMCVVMxCzAJBgNVBAgTAkNBMRUwEwYDVQQHEwxTYW5GcmFuY2lzY28xFTAT
BgNVBAoTDEZvcnQtRnVuc3RvbjEdMBsGA1UECxMUTXlPcmdhbml6YXRpb25hbFVu
aXQxGDAWBgNVBAMTD0ZvcnQtRnVuc3RvbiBDQTEQMA4GA1UEKRMHRWFzeVJTQTEh
MB8GCSqGSIb3DQEJARYSbWVAbXlob3N0Lm15ZG9tYWluggkAnsGK+ZY5lwwwEwYD
VR0lBAwwCgYIKwYBBQUHAwIwCwYDVR0PBAQDAgeAMBAGA1UdEQQJMAeCBWV0Y2Qx
MA0GCSqGSIb3DQEBCwUAA4IBAQChL2CB4bZ6BApfDaGifnMXdHzTWIjGC6gPlS0M
ku4f3ivKal8rCXX2F+9CPS9rkuCP4jVm6IgDjy/4mxYZkI1Nl7oQfQiqsqzXedAL
1/Obji1JYqmbIFaHtWsjaSg1X36H4ozMfxq01EiO4lJ0O5MiJwgdDMykYw2k3ui5
z1tNn+h4NQLmmvvOAJsBxXrAvpvnQpY2sni3ZOF07QhIbQ4VffOX7QhjLZLQiflu
yVdFyR3Jey1e9Ul1vWN0f3klH4BPOIGXk1tZ3jdZPCIPSqz16Ii+S0ni+58mKMp6
XFPcuCoGYdGTNzxnxt43mnIhgC7xfXX99jxx9QCqojivbSB8
-----END CERTIFICATE-----
" > key.crt

sudo chmod 0644 key.crt
sudo mv key.crt /etc/ssl/etcd/key.crt

echo "-----BEGIN PRIVATE KEY-----
MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQClLa1c9N5075NG
+DmKNwBqA373WzBymcXxVWZlZB+J1NU7Ceg9Oof2/VmsoMgCuA9N5YKSPy4mfC1R
vkvrwZkNR2JnxjuZtGK1A3z8VWV4dZTWlNvFAPeEoXe6MAL9QYxOfVVfdUJfRef8
WwtmUA9TdbO37BiZb3b7igcDTwpOWrfIRt+rFdf4oNkaUDec5h5WY86z+xzyAGEF
bYxCOveS9J0wO9VAnsioUoaIy3S8Ccgj4DN/FFKAKx3sDU6VHFC6LM7r2pELtb5A
a+DzI0bdTxt3hDZ7x3qGXTIlnQ89yKm1WVhx0sb1Mdub2lBp7mTW3Yq9MnF7nEvy
PC0issO1AgMBAAECggEAUZW4FTvVTMiwY9NjCEOWbsZ/RcnCqjgNrt/Rja7gbQG0
uE5yaRup4HLOghd/8ufal00PKxENyHB5KfDmKvIamJZzonIlKgwQ3Pt2FmRPlCnv
c/Vef3W0y8u9CTeBonlnxbTeICIYwFoU9W46uTQ9/akbNXLly5Nkn2VThWza2Evn
WX37LNa8dFWzmadGprt6pRkkqTv/HH3UHX6+7KaXphE128hT8vEh2ZWcW8pu/47M
I0IQNepU0eRKnjqubkkDox30hrS1myMcp7/jXXmEQXDe97Ffse+mXaQlQtJhVmvL
7FQ8vaS8NpwJlOGckwX384GOztZNc0B1HUe1BtFk2QKBgQDTGglFuX08UjzlMVBQ
YGHz/IbWWrumOl7Qb21UgCrTu3GJGMdxNQJRUJdQqh7b2pyPnuS3oN0OqMbfsCsw
Tk6NRvYDR/ZPqFkcVdvCxRmeAfbf06tJ0+yAURQ0miw0PmLCjvLChOiUHts141ko
jWkW5V5vaqa3OZEo0cADtS2dgwKBgQDITzuGJmkBC4YaOWw3acy1KE9QXi16GSr2
mIpkjpD73eU1XOwEjCHsu03mF9Cvg9KUSjhePiRz4E27N3XEDFEDtsWsnhvNe0WK
iCD2U2QKbM5F2nobDYBXya1f3z4k/+UuNx4+2mm3oIrfuwqofzoiQrilm70XK9ZG
BYKk+JzMZwKBgGJisA+e8485hMgMw7GyOfTMrMsaXnqKmcXrKLlJQqyLn86Vjd5l
Jj9foCYoI1mz+WO2WkJ65ov+fTGjmX1aAaI2gFHnKBTYES/zlAiic94AbF7E9//g
lUj4gMJDWHbA3KZwmROvffYKq3/iBZuwzFmvbOPggPLEEbNEjokr16ahAoGBAIXm
ORMO4AhbzLQBCK7uIXJD1OkTW3EQG+ElzPU1PAOxrAOE7xyHFDJsbsFN0ClThFOu
XYnaHoH7QdrRUv7PgORrrC4crtcn/S+Cmg4vZmN1olmdlxw4ZT/JyASbda5EBM5k
9+nqFNs0DUFLhe9mxNScJ1RFDBzOJ/k8u71Nl3snAoGAdWL4Us/pLNcKdrqn+RnZ
YvxAocHtF5/N+RMnuCCB3IpJ9VjLIzvDZiabB5xNAQp+qzjZpy/a2KHi7U0cbuWM
jFJNx4rev3uVlwukD1yQ+lTfhKyHQl8hPqRphfE3MiLVGJlEl5jaIFcgDNn37QvS
edlLJvdESPsjnIolGn0FqMQ=
-----END PRIVATE KEY-----
" > key.key

sudo chmod 0644 key.key
sudo mv key.key /etc/ssl/etcd/key.key

echo "-----BEGIN CERTIFICATE-----
MIIFEjCCA/qgAwIBAgIJAJ7BivmWOZcMMA0GCSqGSIb3DQEBCwUAMIG2MQswCQYD
VQQGEwJVUzELMAkGA1UECBMCQ0ExFTATBgNVBAcTDFNhbkZyYW5jaXNjbzEVMBMG
A1UEChMMRm9ydC1GdW5zdG9uMR0wGwYDVQQLExRNeU9yZ2FuaXphdGlvbmFsVW5p
dDEYMBYGA1UEAxMPRm9ydC1GdW5zdG9uIENBMRAwDgYDVQQpEwdFYXN5UlNBMSEw
HwYJKoZIhvcNAQkBFhJtZUBteWhvc3QubXlkb21haW4wHhcNMTYwNjA4MDkxNDIx
WhcNMjYwNjA2MDkxNDIxWjCBtjELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNBMRUw
EwYDVQQHEwxTYW5GcmFuY2lzY28xFTATBgNVBAoTDEZvcnQtRnVuc3RvbjEdMBsG
A1UECxMUTXlPcmdhbml6YXRpb25hbFVuaXQxGDAWBgNVBAMTD0ZvcnQtRnVuc3Rv
biBDQTEQMA4GA1UEKRMHRWFzeVJTQTEhMB8GCSqGSIb3DQEJARYSbWVAbXlob3N0
Lm15ZG9tYWluMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA0kB+Y0mG
V0/5Cesl6wpSjzTow/yROt7JXDbBVkTCooqSWJq3sOdKhKYIn3V6gSfYppLkBVpV
V2gAC+wjuJxqx8homYRCiNBHKL0ZP9rARyKq+TSVNX5En+SpBQ+lkmel/LOpMlCN
fmzhAk1G6NUAhAppNZuaMQe9GtSqzNN/pvxkfTrNta6qj/3nuiz5fFKUSx2n9Q/t
epI/MnOZOmQoJH8wAclo7jfVnSHtmi/CWS1QpHJr6h8lQ362q7Nj6dpIvSlt0MuR
/Wfsfsf0Fjb9AJ8AOlMy6AIJkWUsKMPAgSevN9Uvj3wAHThqjNqvBWRPMpd75p8l
rVkmAgHlbpYTzwIDAQABo4IBHzCCARswHQYDVR0OBBYEFMPkA4j54vucJGhQKN+q
jF67TkSmMIHrBgNVHSMEgeMwgeCAFMPkA4j54vucJGhQKN+qjF67TkSmoYG8pIG5
MIG2MQswCQYDVQQGEwJVUzELMAkGA1UECBMCQ0ExFTATBgNVBAcTDFNhbkZyYW5j
aXNjbzEVMBMGA1UEChMMRm9ydC1GdW5zdG9uMR0wGwYDVQQLExRNeU9yZ2FuaXph
dGlvbmFsVW5pdDEYMBYGA1UEAxMPRm9ydC1GdW5zdG9uIENBMRAwDgYDVQQpEwdF
YXN5UlNBMSEwHwYJKoZIhvcNAQkBFhJtZUBteWhvc3QubXlkb21haW6CCQCewYr5
ljmXDDAMBgNVHRMEBTADAQH/MA0GCSqGSIb3DQEBCwUAA4IBAQDHGmFFi3Gm2/m1
sQJzRZG2Bti9bpjVWu2dTSgKlrtyLwg8NG1vbmRctZ3m2Uutrbhu9ycbp8ajDCJv
hRTzyYnlpNOrix+Uy4w9I5f2LLaYxJVuDK5bXCY/D0UIaRl2AOLTdiOZALt1/tjz
rRe212ka3fgVyOEwu3/n6k/w2lJGLGg0TCcTUsctC0K06FmSPihOxzlZSLOKia21
7db4bnYpwms2K96aDuU29zaEuZHQMjibLb1blNaWxH7YeMu5gZsfW65KeXwZlQiP
MjB1jBm6KsBbLQuRWoOGTSkUnm4eo4/e2oMZzDOAgjcs49wrPnG2X+qNwI1ANiiZ
PkKgICsM
-----END CERTIFICATE-----
" > ca.crt

sudo chmod 0644 ca.crt
sudo mv ca.crt /etc/ssl/etcd/ca.crt

./bin/etcd --name infra1 --listen-client-urls https://localhost:2379 --advertise-client-urls https://localhost:2379 --listen-peer-urls https://localhost:2380 --initial-advertise-peer-urls https://localhost:2380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=https://localhost:2380,infra2=https://localhost:12380,infra3=https://localhost:22380' --initial-cluster-state new \
    --cert-file=/etc/ssl/etcd/key.crt \
    --key-file=/etc/ssl/etcd/key.key \
    --client-cert-auth=true \
    --trusted-ca-file=/etc/ssl/etcd/ca.crt \
    --peer-cert-file=/etc/ssl/etcd/key.crt \
    --peer-key-file=/etc/ssl/etcd/key.key \
    --peer-client-cert-auth=true \
    --peer-trusted-ca-file=/etc/ssl/etcd/ca.crt

./bin/etcd --name infra2 --listen-client-urls https://localhost:12379 --advertise-client-urls https://localhost:12379 --listen-peer-urls https://localhost:12380 --initial-advertise-peer-urls https://localhost:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=https://localhost:2380,infra2=https://localhost:12380,infra3=https://localhost:22380' --initial-cluster-state new \
    --cert-file=/etc/ssl/etcd/key.crt \
    --key-file=/etc/ssl/etcd/key.key \
    --client-cert-auth=true \
    --trusted-ca-file=/etc/ssl/etcd/ca.crt \
    --peer-cert-file=/etc/ssl/etcd/key.crt \
    --peer-key-file=/etc/ssl/etcd/key.key \
    --peer-client-cert-auth=true \
    --peer-trusted-ca-file=/etc/ssl/etcd/ca.crt

./bin/etcd --name infra3 --listen-client-urls https://localhost:22379 --advertise-client-urls https://localhost:22379 --listen-peer-urls https://localhost:22380 --initial-advertise-peer-urls https://localhost:22380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=https://localhost:2380,infra2=https://localhost:12380,infra3=https://localhost:22380' --initial-cluster-state new \
    --cert-file=/etc/ssl/etcd/key.crt \
    --key-file=/etc/ssl/etcd/key.key \
    --client-cert-auth=true \
    --trusted-ca-file=/etc/ssl/etcd/ca.crt \
    --peer-cert-file=/etc/ssl/etcd/key.crt \
    --peer-key-file=/etc/ssl/etcd/key.key \
    --peer-client-cert-auth=true \
    --peer-trusted-ca-file=/etc/ssl/etcd/ca.crt

We have a tls-setup example here: https://github.com/coreos/etcd/tree/master/hack/tls-setup. When I use the certs from this example, it works fine. Example commands are:

./bin/etcd --name infra1 --listen-client-urls https://localhost:2379 --advertise-client-urls https://localhost:2379 --listen-peer-urls https://localhost:2380 --initial-advertise-peer-urls https://localhost:2380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=https://localhost:2380,infra2=https://localhost:12380,infra3=https://localhost:22380' --initial-cluster-state new \
    --cert-file=/home/gyuho/certs/etcd1.pem \
    --key-file=/home/gyuho/certs/etcd1-key.pem \
    --client-cert-auth=true \
    --trusted-ca-file=/home/gyuho/certs/ca.pem \
    --peer-cert-file=/home/gyuho/certs/etcd1.pem \
    --peer-key-file=/home/gyuho/certs/etcd1-key.pem \
    --peer-client-cert-auth=true \
    --peer-trusted-ca-file=/home/gyuho/certs/ca.pem

./bin/etcd --name infra2 --listen-client-urls https://localhost:12379 --advertise-client-urls https://localhost:12379 --listen-peer-urls https://localhost:12380 --initial-advertise-peer-urls https://localhost:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=https://localhost:2380,infra2=https://localhost:12380,infra3=https://localhost:22380' --initial-cluster-state new \
    --cert-file=/home/gyuho/certs/etcd1.pem \
    --key-file=/home/gyuho/certs/etcd1-key.pem \
    --client-cert-auth=true \
    --trusted-ca-file=/home/gyuho/certs/ca.pem \
    --peer-cert-file=/home/gyuho/certs/etcd1.pem \
    --peer-key-file=/home/gyuho/certs/etcd1-key.pem \
    --peer-client-cert-auth=true \
    --peer-trusted-ca-file=/home/gyuho/certs/ca.pem

./bin/etcd --name infra3 --listen-client-urls https://localhost:22379 --advertise-client-urls https://localhost:22379 --listen-peer-urls https://localhost:22380 --initial-advertise-peer-urls https://localhost:22380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=https://localhost:2380,infra2=https://localhost:12380,infra3=https://localhost:22380' --initial-cluster-state new \
    --cert-file=/home/gyuho/certs/etcd1.pem \
    --key-file=/home/gyuho/certs/etcd1-key.pem \
    --client-cert-auth=true \
    --trusted-ca-file=/home/gyuho/certs/ca.pem \
    --peer-cert-file=/home/gyuho/certs/etcd1.pem \
    --peer-key-file=/home/gyuho/certs/etcd1-key.pem \
    --peer-client-cert-auth=true \
    --peer-trusted-ca-file=/home/gyuho/certs/ca.pem

Can you try https://github.com/coreos/etcd/tree/master/hack/tls-setup? I don't think this is an AWS- or network-specific problem.

Thanks.

parsa-ionos commented 8 years ago

Hi @gyuho, I enabled debugging in the etcd2 configuration parameters. Now, along with the same logs as before, I am also getting this error:

x509: cannot validate certificate for 172.17.8.103 because it doesn't contain any IP SANs)

I guess you are correct: the problem is due to the certificates. I am using certificates generated by the easyrsa tool, version 2.2.1.

I already have automation set up for easyrsa. Is there a way to work around this with easyrsa certificates?

Using cfssl also requires the host IP in advance, but in my setup the host IP is not known before setup.

Thanks, Sumit
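For readers who hit the same x509 error, here is a minimal sketch of one possible approach: inspect the SAN entries of a certificate, and issue a member certificate with an IP SAN once the machine knows its own address (for example at first boot). It uses plain openssl rather than easyrsa, whose options I cannot vouch for. `show_sans` and `issue_member_cert` are hypothetical helpers, and the file names are illustrative.

```shell
# show_sans CERT: print the Subject Alternative Name entries of a certificate.
# Returns non-zero if the certificate has no SAN extension.
show_sans() {
    openssl x509 -in "$1" -noout -text | grep -A1 "Subject Alternative Name"
}

# issue_member_cert CA_CRT CA_KEY IP OUTDIR
# Creates OUTDIR/member.key and OUTDIR/member.crt, signed by the given CA,
# with the IP as a SAN and both server and client auth usages, since a peer
# certificate is used in both roles.
issue_member_cert() {
    ca_crt=$1
    ca_key=$2
    ip=$3
    out=$4
    mkdir -p "$out" || return 1

    # New private key and certificate signing request for this member.
    openssl req -newkey rsa:2048 -nodes -subj "/CN=etcd-member" \
        -keyout "$out/member.key" -out "$out/member.csr" 2>/dev/null || return 1

    # Extensions to apply at signing time.
    printf 'subjectAltName = IP:%s\nextendedKeyUsage = serverAuth, clientAuth\n' \
        "$ip" > "$out/ext.cnf"

    # Sign the CSR. -CAcreateserial writes a serial file next to the CA cert.
    openssl x509 -req -in "$out/member.csr" -days 365 \
        -CA "$ca_crt" -CAkey "$ca_key" -CAcreateserial \
        -extfile "$out/ext.cnf" -out "$out/member.crt" 2>/dev/null
}
```

Running `show_sans /etc/ssl/etcd/key.crt` on an affected machine shows whether any `IP Address:` entry is present. `issue_member_cert ca.crt ca.key 172.17.8.103 ./out` would then produce a certificate valid for that address. The catch is that signing on the machine means the CA key (or some signing service) has to be reachable from it, which is a security trade-off to weigh.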

gyuho commented 8 years ago

Yeah, if you self-sign certs, you need to know the IP for cfssl.

Sorry, I am not familiar with easyrsa. Can you ask the easyrsa team?

Closing this since it's not an etcd issue.

It would be awesome if you could contribute to our tls-setup example with easyrsa when you figure that out.

Thanks.