etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

Validation error: expected IP in URL for binding #9575

Closed YeruchamB closed 6 years ago

YeruchamB commented 6 years ago

I'm trying to bootstrap a cluster in an AWS autoscaling group using DNS resolution and am getting the following error:

error: etcdmain: error verifying flags, expected IP in URL for binding (https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2380). See 'etcd --help'.

I checked the DNS resolution:

[root@ip-10-5-34-241 ec2-user]# dig +noall +answer SRV _etcd-server-ssl._tcp.confucius-dev.ps.idps.a.intuit.com
_etcd-server-ssl._tcp.confucius-dev.ps.idps.a.intuit.com. 60 IN SRV 0 0 2380 i-0c64aa4edda733181.confucius-dev.ps.idps.a.intuit.com.
_etcd-server-ssl._tcp.confucius-dev.ps.idps.a.intuit.com. 60 IN SRV 0 0 2380 i-0db9385e6d47f097a.confucius-dev.ps.idps.a.intuit.com.
_etcd-server-ssl._tcp.confucius-dev.ps.idps.a.intuit.com. 60 IN SRV 0 0 2380 i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com.

[root@ip-10-5-34-241 ec2-user]# dig +noall +answer i-0c64aa4edda733181.confucius-dev.ps.idps.a.intuit.com i-0db9385e6d47f097a.confucius-dev.ps.idps.a.intuit.com i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com
i-0c64aa4edda733181.confucius-dev.ps.idps.a.intuit.com. 60 IN A 10.5.37.108
i-0db9385e6d47f097a.confucius-dev.ps.idps.a.intuit.com. 60 IN A 10.5.36.33
i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com. 60 IN A 10.5.35.203

Using etcd v3.3.1. My configuration:

--name=i-0abe37ddd7539c0fd --cert-file=/var/porticor/conf/confucius.crt \
--key-file=/var/porticor/conf/confucius.key --trusted-ca-file=/var/porticor/conf/ca.crt \
--peer-client-cert-auth --peer-cert-file=/var/porticor/conf/confucius.crt \
--peer-key-file=/var/porticor/conf/confucius.key --peer-trusted-ca-file=/var/porticor/conf/ca.crt \
--listen-client-urls=https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2379,https://127.0.0.1:2379 \
--advertise-client-urls=https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2379 \
--listen-peer-urls=https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2380 \
--initial-advertise-peer-urls=https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2380 \
--initial-cluster-state=new --discovery-srv=confucius-dev.ps.idps.a.intuit.com \
--initial-cluster-token=Confucius-dev-Servers-ASGroup-UCQ4YMM3DWDY --max-txn-ops=65535 \
--heartbeat-interval=200 --election-timeout=1000

My configuration seems similar to the examples I find in your documentation, and I can't figure out what's wrong. I'd appreciate any help.

hexfusion commented 6 years ago

error: etcdmain: error verifying flags, expected IP in URL for binding (https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2380). See 'etcd --help'

@YeruchamB this error is expected, as a domain name is not valid for binding.

--listen-client-urls=https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2379,https://127.0.0.1:2379
--listen-peer-urls=https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2380

Please review: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/configuration.md#--listen-peer-urls
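
The rule behind this error can be approximated in a few lines of Go. This is a minimal sketch, not etcd's actual implementation: `checkBindURL` is a hypothetical helper that accepts a URL only when its host part parses as a literal IP address, which is what the error message describes.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// checkBindURL is a hypothetical helper (not part of etcd's API). It rejects
// any URL whose host is not a literal IP address.
func checkBindURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	// Hostname strips the port and any IPv6 brackets before the IP check.
	if net.ParseIP(u.Hostname()) == nil {
		return fmt.Errorf("expected IP in URL for binding (%s)", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://10.5.35.203:2380",
		"https://i-0abe37ddd7539c0fd.confucius-dev.ps.idps.a.intuit.com:2380",
	} {
		fmt.Printf("%s -> %v\n", raw, checkBindURL(raw))
	}
}
```

Under this sketch the IP-based URL passes and the DNS-based URL is rejected, matching the behaviour reported for the listen flags.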

YeruchamB commented 6 years ago

@hexfusion I'm confused. There seem to be examples in the etcd documentation of doing exactly that. https://coreos.com/etcd/docs/latest/op-guide/clustering.html#dns-discovery

hexfusion commented 6 years ago

@hexfusion I'm confused. There seem to be examples in the etcd documentation of doing exactly that. https://coreos.com/etcd/docs/latest/op-guide/clustering.html#dns-discovery

Yes, it appears the docs are not correct here, sorry about that. I will update later today unless you would like to issue a PR. The example yields the same error.

2018-04-17 07:41:43.837719 E | etcdmain: error verifying flags, expected IP in URL for binding (http://infra0.example.com:2380). See 'etcd --help'.

YeruchamB commented 6 years ago

@hexfusion OK, so now I have a different issue. I want to be able to use a certificate that accepts a wildcard for *.confucius-dev.ps.idps.a.intuit.com, but if I run etcd with the IPs in my configuration:

--name=i-02daa3e84164af17a --cert-file=/var/porticor/conf/confucius.crt --key-file=/var/porticor/conf/confucius.key \
--trusted-ca-file=/var/porticor/conf/ca.crt --peer-client-cert-auth --peer-cert-file=/var/porticor/conf/confucius.crt \
--peer-key-file=/var/porticor/conf/confucius.key --peer-trusted-ca-file=/var/porticor/conf/ca.crt --listen-client-urls=https://10.5.37.59:2379,https://127.0.0.1:2379 \
--advertise-client-urls=https://10.5.37.59:2379 --listen-peer-urls=https://10.5.37.59:2380 --initial-advertise-peer-urls=https://10.5.37.59:2380 \
--initial-cluster-state=new --discovery-srv=confucius-dev.ps.idps.a.intuit.com --initial-cluster-token=Confucius-dev-Servers-ASGroup-1G5BDDWTLLA1R \
--max-txn-ops=65535 --heartbeat-interval=200 --election-timeout=1000

I get the following error:

embed: rejected connection from "10.5.36.159:55444" (error "tls: \"10.5.36.159\" does not match any of DNSNames [\".confucius-dev.ps.idps.a.intuit.com\" \"confucius-dev.ps.idps.a.intuit.com\"] (lookup confucius-dev.ps.idps.a.intuit.com on 127.0.0.1:53: no such host)", ServerName "confucius-dev.ps.idps.a.intuit.com", IPAddresses [], DNSNames [".confucius-dev.ps.idps.a.intuit.com" "confucius-dev.ps.idps.a.intuit.com"])

Is there any way to get around this so that the certificate doesn't need to include the IP?

hexfusion commented 6 years ago

@hexfusion OK, so now I have a different issue. I want to be able to use a certificate that accepts a wildcard for *.confucius-dev.ps.idps.a.intuit.com, but if I run etcd with the IPs in my configuration

You don't need to change every flag to IP.

Is there any way to get around this so that the certificate doesn't need to include the IP?

I believe you can leave the rest of the flags as you had them and change only these:

--listen-client-urls=https://10.5.37.59:2379,https://127.0.0.1:2379 
--listen-peer-urls=https://10.5.37.59:2380

It should work.
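
The split suggested above (IP addresses for the listen flags, DNS names for the advertise flags) can be sketched as a small Go helper. `urlFlags` and its parameters are hypothetical names introduced for illustration, and the host name passed in `main` is assumed from the naming pattern used elsewhere in this thread; only the four flag names come from the configurations shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// urlFlags is a hypothetical helper. ip is the address of a local interface;
// fqdn is the DNS name that peers and clients should use to reach this member.
func urlFlags(ip, fqdn string) []string {
	return []string{
		// Listen flags bind sockets, so they take IP addresses.
		"--listen-client-urls=https://" + ip + ":2379,https://127.0.0.1:2379",
		"--listen-peer-urls=https://" + ip + ":2380",
		// Advertise flags are what other members and clients dial.
		"--advertise-client-urls=https://" + fqdn + ":2379",
		"--initial-advertise-peer-urls=https://" + fqdn + ":2380",
	}
}

func main() {
	// The host name below is an assumed example, not taken from a real record.
	flags := urlFlags("10.5.37.59", "i-02daa3e84164af17a.confucius-dev.ps.idps.a.intuit.com")
	fmt.Println(strings.Join(flags, " \\\n"))
}
```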

YeruchamB commented 6 years ago

@hexfusion Running it with the following configuration got the same rejected certificate errors.

--name=i-00f1fd31ea1e8bbce --cert-file=/var/porticor/conf/confucius.crt \
--key-file=/var/porticor/conf/confucius.key --trusted-ca-file=/var/porticor/conf/ca.crt \
--peer-client-cert-auth --peer-cert-file=/var/porticor/conf/confucius.crt --peer-key-file=/var/porticor/conf/confucius.key --peer-trusted-ca-file=/var/porticor/conf/ca.crt \
--listen-client-urls=https://10.5.36.239:2379,https://127.0.0.1:2379 \
--listen-peer-urls=https://10.5.36.239:2380 \
--advertise-client-urls=https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2379 \
--initial-advertise-peer-urls=https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380 \
--initial-cluster-state=new --discovery-srv=confucius-dev.ps.idps.a.intuit.com \
--initial-cluster-token=Confucius-dev-Servers-ASGroup-18VUDFCKU02E2 --max-txn-ops=65535 \
--heartbeat-interval=200 --election-timeout=1000

hexfusion commented 6 years ago

@YeruchamB I will need to test this later can you please attach the full startup logs for the above. I will follow up tonight.

YeruchamB commented 6 years ago

@hexfusion bear in mind, the servers are deployed in an autoscaling group which explains the large number of elections and connection refused errors. The other instances just took a bit longer to come up...

etcd args:

--name=i-00f1fd31ea1e8bbce --cert-file=/var/porticor/conf/confucius.crt \
--key-file=/var/porticor/conf/confucius.key --trusted-ca-file=/var/porticor/conf/ca.crt \
--peer-client-cert-auth --peer-cert-file=/var/porticor/conf/confucius.crt \
--peer-key-file=/var/porticor/conf/confucius.key --peer-trusted-ca-file=/var/porticor/conf/ca.crt \
--listen-client-urls=https://10.5.36.239:2379,https://127.0.0.1:2379 \
--listen-peer-urls=https://10.5.36.239:2380 \
--advertise-client-urls=https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2379 \
--initial-advertise-peer-urls=https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380 \
--initial-cluster-state=new --discovery-srv=confucius-dev.ps.idps.a.intuit.com \
--initial-cluster-token=Confucius-dev-Servers-ASGroup-18VUDFCKU02E2 --max-txn-ops=65535 \
--heartbeat-interval=200 --election-timeout=1000

ETCD: 2018-04-17 12:49:20.872672 I | etcdmain: etcd Version: 3.3.1
ETCD: 2018-04-17 12:49:20.872732 I | etcdmain: Git SHA: 28f3f26c0
ETCD: 2018-04-17 12:49:20.872741 I | etcdmain: Go Version: go1.9.4
ETCD: 2018-04-17 12:49:20.872747 I | etcdmain: Go OS/Arch: linux/amd64
ETCD: 2018-04-17 12:49:20.872754 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
ETCD: 2018-04-17 12:49:20.872773 W | etcdmain: no data-dir provided, using default data-dir ./i-00f1fd31ea1e8bbce.etcd
ETCD: 2018-04-17 12:49:20.872827 I | embed: peerTLS: cert = /var/porticor/conf/confucius.crt, key = /var/porticor/conf/confucius.key, ca = , trusted-ca = /var/porticor/conf/ca.crt, client-cert-auth = true, crl-file =
ETCD: 2018-04-17 12:49:20.874508 I | embed: listening for peers on https://10.5.36.239:2380
ETCD: 2018-04-17 12:49:20.874591 I | embed: listening for client requests on 10.5.36.239:2379
ETCD: 2018-04-17 12:49:20.874664 I | embed: listening for client requests on 127.0.0.1:2379
ETCD: 2018-04-17 12:49:20.895922 N | embed: got bootstrap from DNS for etcd-server at 0=https://i-02bf013b1a48913bd.confucius-dev.ps.idps.a.intuit.com:2380
ETCD: 2018-04-17 12:49:20.895940 N | embed: got bootstrap from DNS for etcd-server at 1=https://i-0b50869e4a9690475.confucius-dev.ps.idps.a.intuit.com:2380
ETCD: 2018-04-17 12:49:20.895948 N | embed: got bootstrap from DNS for etcd-server at i-00f1fd31ea1e8bbce=https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380
ETCD: 2018-04-17 12:49:20.902055 I | pkg/netutil: resolving i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380 to 10.5.36.239:2380
ETCD: 2018-04-17 12:49:20.902339 I | pkg/netutil: resolving i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380 to 10.5.36.239:2380
ETCD: 2018-04-17 12:49:20.904558 I | etcdserver: name = i-00f1fd31ea1e8bbce
ETCD: 2018-04-17 12:49:20.904578 I | etcdserver: data dir = i-00f1fd31ea1e8bbce.etcd
ETCD: 2018-04-17 12:49:20.904587 I | etcdserver: member dir = i-00f1fd31ea1e8bbce.etcd/member
ETCD: 2018-04-17 12:49:20.904594 I | etcdserver: heartbeat = 200ms
ETCD: 2018-04-17 12:49:20.904600 I | etcdserver: election = 1000ms
ETCD: 2018-04-17 12:49:20.904607 I | etcdserver: snapshot count = 100000
ETCD: 2018-04-17 12:49:20.904641 I | etcdserver: advertise client URLs = https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2379
ETCD: 2018-04-17 12:49:20.904662 I | etcdserver: initial advertise peer URLs = https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380
ETCD: 2018-04-17 12:49:20.904686 I | etcdserver: initial cluster = 0=https://i-02bf013b1a48913bd.confucius-dev.ps.idps.a.intuit.com:2380,1=https://i-0b50869e4a9690475.confucius-dev.ps.idps.a.intuit.com:2380,i-00f1fd31ea1e8bbce=https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380
ETCD: 2018-04-17 12:49:20.907364 I | etcdserver: starting member 85ddb1629bd81d5d in cluster 558aaa5f67a7a4c1
ETCD: 2018-04-17 12:49:20.907409 I | raft: 85ddb1629bd81d5d became follower at term 0
ETCD: 2018-04-17 12:49:20.907428 I | raft: newRaft 85ddb1629bd81d5d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
ETCD: 2018-04-17 12:49:20.907436 I | raft: 85ddb1629bd81d5d became follower at term 1
ETCD: 2018-04-17 12:49:20.913052 W | auth: simple token is not cryptographically signed
ETCD: 2018-04-17 12:49:20.917500 I | rafthttp: starting peer 46bb61fe2a176392...
ETCD: 2018-04-17 12:49:20.917544 I | rafthttp: started HTTP pipelining with peer 46bb61fe2a176392
ETCD: 2018-04-17 12:49:20.920591 I | rafthttp: started streaming with peer 46bb61fe2a176392 (writer)
ETCD: 2018-04-17 12:49:20.920615 I | rafthttp: started streaming with peer 46bb61fe2a176392 (writer)
ETCD: 2018-04-17 12:49:20.921592 I | rafthttp: started peer 46bb61fe2a176392
ETCD: 2018-04-17 12:49:20.921633 I | rafthttp: added peer 46bb61fe2a176392
ETCD: 2018-04-17 12:49:20.921645 I | rafthttp: started streaming with peer 46bb61fe2a176392 (stream MsgApp v2 reader)
ETCD: 2018-04-17 12:49:20.921655 I | rafthttp: starting peer 69700d6e2d214e91...
ETCD: 2018-04-17 12:49:20.921675 I | rafthttp: started HTTP pipelining with peer 69700d6e2d214e91
ETCD: 2018-04-17 12:49:20.923933 I | rafthttp: started streaming with peer 46bb61fe2a176392 (stream Message reader)
ETCD: 2018-04-17 12:49:20.926515 I | rafthttp: started streaming with peer 69700d6e2d214e91 (writer)
ETCD: 2018-04-17 12:49:20.927039 I | rafthttp: started peer 69700d6e2d214e91
ETCD: 2018-04-17 12:49:20.927063 I | rafthttp: added peer 69700d6e2d214e91
ETCD: 2018-04-17 12:49:20.927087 I | etcdserver: starting server... [version: 3.3.1, cluster version: to_be_decided]
ETCD: 2018-04-17 12:49:20.927301 I | rafthttp: started streaming with peer 69700d6e2d214e91 (writer)
ETCD: 2018-04-17 12:49:20.927332 I | rafthttp: started streaming with peer 69700d6e2d214e91 (stream MsgApp v2 reader)
ETCD: 2018-04-17 12:49:20.927550 I | rafthttp: started streaming with peer 69700d6e2d214e91 (stream Message reader)
ETCD: 2018-04-17 12:49:20.930243 I | etcdserver/membership: added member 46bb61fe2a176392 [https://i-02bf013b1a48913bd.confucius-dev.ps.idps.a.intuit.com:2380] to cluster 558aaa5f67a7a4c1
ETCD: 2018-04-17 12:49:20.930387 I | etcdserver/membership: added member 69700d6e2d214e91 [https://i-0b50869e4a9690475.confucius-dev.ps.idps.a.intuit.com:2380] to cluster 558aaa5f67a7a4c1
ETCD: 2018-04-17 12:49:20.930483 I | etcdserver/membership: added member 85ddb1629bd81d5d [https://i-00f1fd31ea1e8bbce.confucius-dev.ps.idps.a.intuit.com:2380] to cluster 558aaa5f67a7a4c1
ETCD: 2018-04-17 12:49:20.931649 I | embed: ClientTLS: cert = /var/porticor/conf/confucius.crt, key = /var/porticor/conf/confucius.key, ca = , trusted-ca = /var/porticor/conf/ca.crt, client-cert-auth = false, crl-file =
ETCD: 2018-04-17 12:49:21.507828 I | raft: 85ddb1629bd81d5d is starting a new election at term 1
ETCD: 2018-04-17 12:49:23.307735 I | raft: 85ddb1629bd81d5d is starting a new election at term 2
ETCD: 2018-04-17 12:49:24.907808 I | raft: 85ddb1629bd81d5d is starting a new election at term 3
ETCD: 2018-04-17 12:49:25.924453 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:25.927888 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:26.707792 I | raft: 85ddb1629bd81d5d is starting a new election at term 4
ETCD: 2018-04-17 12:49:27.907794 I | raft: 85ddb1629bd81d5d is starting a new election at term 5
ETCD: 2018-04-17 12:49:27.930134 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:49:28.907731 I | raft: 85ddb1629bd81d5d is starting a new election at term 6
ETCD: 2018-04-17 12:49:29.907800 I | raft: 85ddb1629bd81d5d is starting a new election at term 7
ETCD: 2018-04-17 12:49:30.907748 I | raft: 85ddb1629bd81d5d is starting a new election at term 8
ETCD: 2018-04-17 12:49:30.924667 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:30.928104 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:31.907771 I | raft: 85ddb1629bd81d5d is starting a new election at term 9
ETCD: 2018-04-17 12:49:33.107739 I | raft: 85ddb1629bd81d5d is starting a new election at term 10
ETCD: 2018-04-17 12:49:34.107732 I | raft: 85ddb1629bd81d5d is starting a new election at term 11
ETCD: 2018-04-17 12:49:34.930396 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:49:35.907745 I | raft: 85ddb1629bd81d5d is starting a new election at term 12
ETCD: 2018-04-17 12:49:35.924946 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:35.928354 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:37.307800 I | raft: 85ddb1629bd81d5d is starting a new election at term 13
ETCD: 2018-04-17 12:49:38.707733 I | raft: 85ddb1629bd81d5d is starting a new election at term 14
ETCD: 2018-04-17 12:49:40.307737 I | raft: 85ddb1629bd81d5d is starting a new election at term 15
ETCD: 2018-04-17 12:49:40.925145 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:40.928565 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:41.307822 I | raft: 85ddb1629bd81d5d is starting a new election at term 16
ETCD: 2018-04-17 12:49:41.930691 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:49:42.507695 I | raft: 85ddb1629bd81d5d is starting a new election at term 17
ETCD: 2018-04-17 12:49:43.907785 I | raft: 85ddb1629bd81d5d is starting a new election at term 18
ETCD: 2018-04-17 12:49:44.907784 I | raft: 85ddb1629bd81d5d is starting a new election at term 19
ETCD: 2018-04-17 12:49:45.925386 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:45.928847 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:46.307821 I | raft: 85ddb1629bd81d5d is starting a new election at term 20
ETCD: 2018-04-17 12:49:47.707708 I | raft: 85ddb1629bd81d5d is starting a new election at term 21
ETCD: 2018-04-17 12:49:48.930895 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:49:49.507749 I | raft: 85ddb1629bd81d5d is starting a new election at term 22
ETCD: 2018-04-17 12:49:50.925590 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:50.929031 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:51.107748 I | raft: 85ddb1629bd81d5d is starting a new election at term 23
ETCD: 2018-04-17 12:49:52.307748 I | raft: 85ddb1629bd81d5d is starting a new election at term 24
ETCD: 2018-04-17 12:49:54.107750 I | raft: 85ddb1629bd81d5d is starting a new election at term 25
ETCD: 2018-04-17 12:49:55.907752 I | raft: 85ddb1629bd81d5d is starting a new election at term 26
ETCD: 2018-04-17 12:49:55.925832 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:55.929206 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:49:55.931143 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:49:57.307805 I | raft: 85ddb1629bd81d5d is starting a new election at term 27
ETCD: 2018-04-17 12:49:58.507765 I | raft: 85ddb1629bd81d5d is starting a new election at term 28
ETCD: 2018-04-17 12:50:00.307750 I | raft: 85ddb1629bd81d5d is starting a new election at term 29
ETCD: 2018-04-17 12:50:00.926051 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:00.929422 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:02.107749 I | raft: 85ddb1629bd81d5d is starting a new election at term 30
ETCD: 2018-04-17 12:50:02.931319 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:50:03.707746 I | raft: 85ddb1629bd81d5d is starting a new election at term 31
ETCD: 2018-04-17 12:50:05.507746 I | raft: 85ddb1629bd81d5d is starting a new election at term 32
ETCD: 2018-04-17 12:50:05.926117 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:05.929631 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:06.507744 I | raft: 85ddb1629bd81d5d is starting a new election at term 33
ETCD: 2018-04-17 12:50:07.907743 I | raft: 85ddb1629bd81d5d is starting a new election at term 34
ETCD: 2018-04-17 12:50:09.107741 I | raft: 85ddb1629bd81d5d is starting a new election at term 35
ETCD: 2018-04-17 12:50:09.931504 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:50:10.107746 I | raft: 85ddb1629bd81d5d is starting a new election at term 36
ETCD: 2018-04-17 12:50:10.926295 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:10.929820 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:11.507755 I | raft: 85ddb1629bd81d5d is starting a new election at term 37
ETCD: 2018-04-17 12:50:13.307775 I | raft: 85ddb1629bd81d5d is starting a new election at term 38
ETCD: 2018-04-17 12:50:14.307795 I | raft: 85ddb1629bd81d5d is starting a new election at term 39
ETCD: 2018-04-17 12:50:15.707759 I | raft: 85ddb1629bd81d5d is starting a new election at term 40
ETCD: 2018-04-17 12:50:15.926526 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:15.930045 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:16.931758 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:50:17.107711 I | raft: 85ddb1629bd81d5d is starting a new election at term 41
ETCD: 2018-04-17 12:50:18.707801 I | raft: 85ddb1629bd81d5d is starting a new election at term 42
ETCD: 2018-04-17 12:50:20.307696 I | raft: 85ddb1629bd81d5d is starting a new election at term 43
ETCD: 2018-04-17 12:50:20.926673 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:20.930244 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:21.507752 I | raft: 85ddb1629bd81d5d is starting a new election at term 44
ETCD: 2018-04-17 12:50:23.307749 I | raft: 85ddb1629bd81d5d is starting a new election at term 45
ETCD: 2018-04-17 12:50:23.931832 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:50:25.107810 I | raft: 85ddb1629bd81d5d is starting a new election at term 46
ETCD: 2018-04-17 12:50:25.926885 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:25.930472 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:26.107808 I | raft: 85ddb1629bd81d5d is starting a new election at term 47
ETCD: 2018-04-17 12:50:27.107769 I | raft: 85ddb1629bd81d5d is starting a new election at term 48
ETCD: 2018-04-17 12:50:28.707837 I | raft: 85ddb1629bd81d5d is starting a new election at term 49
ETCD: 2018-04-17 12:50:29.707826 I | raft: 85ddb1629bd81d5d is starting a new election at term 50
ETCD: 2018-04-17 12:50:30.927105 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:30.930678 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:30.932131 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:50:31.107712 I | raft: 85ddb1629bd81d5d is starting a new election at term 51
ETCD: 2018-04-17 12:50:32.707835 I | raft: 85ddb1629bd81d5d is starting a new election at term 52
ETCD: 2018-04-17 12:50:34.507829 I | raft: 85ddb1629bd81d5d is starting a new election at term 53
ETCD: 2018-04-17 12:50:35.927333 W | rafthttp: health check for peer 46bb61fe2a176392 could not connect: dial tcp 10.5.35.233:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:35.930907 W | rafthttp: health check for peer 69700d6e2d214e91 could not connect: dial tcp 10.5.37.55:2380: getsockopt: connection refused
ETCD: 2018-04-17 12:50:36.307799 I | raft: 85ddb1629bd81d5d is starting a new election at term 54
ETCD: 2018-04-17 12:50:37.507811 I | raft: 85ddb1629bd81d5d is starting a new election at term 55
ETCD: 2018-04-17 12:50:37.932275 E | etcdserver: publish error: etcdserver: request timed out
ETCD: 2018-04-17 12:50:38.507821 I | raft: 85ddb1629bd81d5d is starting a new election at term 56
ETCD: 2018-04-17 12:50:40.089742 I | embed: rejected connection from "10.5.35.233:42340" (error "tls: \"10.5.35.233\" does not match any of DNSNames [\".confucius-dev.ps.idps.a.intuit.com\" \"confucius-dev.ps.idps.a.intuit.com\"] (lookup confucius-dev.ps.idps.a.intuit.com on 127.0.0.1:53: no such host)", ServerName "confucius-dev.ps.idps.a.intuit.com", IPAddresses [], DNSNames [".confucius-dev.ps.idps.a.intuit.com" "confucius-dev.ps.idps.a.intuit.com"])
ETCD: 2018-04-17 12:50:40.107744 I | raft: 85ddb1629bd81d5d is starting a new election at term 57
ETCD: 2018-04-17 12:50:40.160112 I | embed: rejected connection from "10.5.35.233:42344" (error "tls: \"10.5.35.233\" does not match any of DNSNames [\".confucius-dev.ps.idps.a.intuit.com\" \"confucius-dev.ps.idps.a.intuit.com\"] (lookup confucius-dev.ps.idps.a.intuit.com on 127.0.0.1:53: no such host)", ServerName "confucius-dev.ps.idps.a.intuit.com", IPAddresses [], DNSNames [".confucius-dev.ps.idps.a.intuit.com" "confucius-dev.ps.idps.a.intuit.com"])
ETCD: 2018-04-17 12:50:40.162723 I | embed: rejected connection from "10.5.35.233:42346" (error "tls: \"10.5.35.233\" does not match any of DNSNames [\".confucius-dev.ps.idps.a.intuit.com\" \"confucius-dev.ps.idps.a.intuit.com\"] (lookup confucius-dev.ps.idps.a.intuit.com on 127.0.0.1:53: no such host)", ServerName "confucius-dev.ps.idps.a.intuit.com", IPAddresses [], DNSNames [".confucius-dev.ps.idps.a.intuit.com" "confucius-dev.ps.idps.a.intuit.com"])
ETCD: 2018-04-17 12:50:40.177699 I | embed: rejected connection from "10.5.35.233:42356" (error "tls: \"10.5.35.233\" does not match any of DNSNames [\".confucius-dev.ps.idps.a.intuit.com\" \"confucius-dev.ps.idps.a.intuit.com\"] (lookup confucius-dev.ps.idps.a.intuit.com on 127.0.0.1:53: no such host)", ServerName "confucius-dev.ps.idps.a.intuit.com", IPAddresses [], DNSNames [".confucius-dev.ps.idps.a.intuit.com" "confucius-dev.ps.idps.a.intuit.com"])
ETCD: 2018-04-17 12:50:40.182812 I | embed: rejected connection from "10.5.35.233:42358" (error "tls: \"10.5.35.233\" does not match any of DNSNames [\".confucius-dev.ps.idps.a.intuit.com\" \"confucius-dev.ps.idps.a.intuit.com\"] (lookup confucius-dev.ps.idps.a.intuit.com on 127.0.0.1:53: no such host)", ServerName "confucius-dev.ps.idps.a.intuit.com", IPAddresses [], DNSNames [".confucius-dev.ps.idps.a.intuit.com" "confucius-dev.ps.idps.a.intuit.com"])

hexfusion commented 6 years ago

@YeruchamB I have not forgotten about you. I have this set up in my lab, so I will get some cycles on it soon.

YeruchamB commented 6 years ago

Any updates?

hexfusion commented 6 years ago

Will have something soon; I've been buried at w$rk. Probably tonight.

hexfusion commented 6 years ago

@YeruchamB I mocked up a working example in docker-compose. https://github.com/hexfusion/etcd-compose-examples/blob/master/discovery/dns-wildcard/docker-compose.yml

I think the most important piece of the puzzle is that you need to add the IPs to the SAN, as you see here.

{
    "CN": "*",
    "hosts": [
        "ps.idps.a.hexfusion.local",
        "172.16.8.219",
        "172.16.8.220",
        "172.16.8.221"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "ST": "CA",
            "L": "San Francisco"
        }
    ]
}

Give that a try. In my example I used either an IP or a domain for --initial-advertise-peer-urls and --advertise-client-urls without issue.
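
Why adding IPs to the SAN helps can be illustrated with Go's standard library, whose `VerifyHostname` matches an IP address only against IP SAN entries and never against DNS names. The sketch below shows that general rule; it is not a reproduction of etcd's own peer check. `sanCert` and `covers` are hypothetical helpers, and the certificates are hand-built values carrying only SAN fields rather than real signed certificates.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"net"
)

// sanCert builds a certificate value carrying only SAN entries. It is a
// hypothetical helper for this example; real certificates come from a CA.
func sanCert(dnsNames []string, ips ...string) *x509.Certificate {
	cert := &x509.Certificate{DNSNames: dnsNames}
	for _, s := range ips {
		if ip := net.ParseIP(s); ip != nil {
			cert.IPAddresses = append(cert.IPAddresses, ip)
		}
	}
	return cert
}

// covers reports whether the standard library accepts host for cert.
func covers(cert *x509.Certificate, host string) bool {
	return cert.VerifyHostname(host) == nil
}

func main() {
	wildcard := []string{"*.confucius-dev.ps.idps.a.intuit.com"}
	dnsOnly := sanCert(wildcard)
	withIP := sanCert(wildcard, "10.5.35.233")

	// A host name under the wildcard, checked against the DNS-only cert.
	fmt.Println(covers(dnsOnly, "i-02bf013b1a48913bd.confucius-dev.ps.idps.a.intuit.com"))
	// The peer's IP, checked against the DNS-only cert.
	fmt.Println(covers(dnsOnly, "10.5.35.233"))
	// The peer's IP, checked against the cert that lists it as an IP SAN.
	fmt.Println(covers(withIP, "10.5.35.233"))
}
```

In this sketch the wildcard covers the host name, the DNS-only certificate does not cover the bare IP, and the certificate with the IP SAN does.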

YeruchamB commented 6 years ago

@hexfusion I'm looking for a way to set my cluster up using certificates that aren't self-signed, to which I can't add the IP in the SAN field. Based on the etcd documentation and examples I see online, etcd used to allow setting initial-advertise-urls fields as domain names, and what I'm trying to do should have been possible in theory. I'd like to know why this was changed and whether there exists some workaround so that I can use certificates that don't contain IPs.

hexfusion commented 6 years ago

Based on the etcd documentation and examples I see online, etcd used to allow setting initial-advertise-urls fields as domain names, and what I'm trying to do should have been possible in theory. I'd like to know why this was changed and whether there exists some workaround so that I can use certificates that don't contain IPs.

@YeruchamB here is the history: it was changed from a warning to an error.

ref: https://github.com/coreos/etcd/issues/6336, https://github.com/coreos/etcd/issues/8136

Regarding a wildcard certificate with no IP in the SAN, your issue is SRV discovery. This is not a trivial process, and the verification of peers eventually leads to a reverse DNS lookup. I believe the IP address used in the lookup is what is eventually used for TLS authentication. Even if I use no IPs in the config and bind to 0.0.0.0, it will still send the IP of the peer, similar to below.

embed: rejected connection from "10.5.35.233:42346" (error "tls: "10.5.35.233" does not match any of DNSNames [".confucius-dev.ps.idps.a.intuit.com" "confucius-dev.ps.idps.a.intuit.com"]

While I feel this is unfortunate and I understand your use case, the logic here needs more review on my end, as does the history of the commits. It is probably the case that, because of the SRV discovery, we NEED an IP in the SAN to validate the peer safely? I am willing to look into this further, but it will take some time.
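
The rejection message quoted above includes a failed lookup of a name taken from the certificate. One way such a check could work is to resolve each DNS name in the certificate and compare the results with the connecting peer's IP. The Go sketch below illustrates that idea only; it is an assumption for illustration, not a description of etcd's code. `peerMatchesDNSNames` and `lookupFunc` are hypothetical names, and a fake resolver stands in for DNS so the example runs offline.

```go
package main

import (
	"fmt"
	"net"
)

// lookupFunc resolves a host name to IP strings; net.LookupHost has this shape.
type lookupFunc func(host string) ([]string, error)

// peerMatchesDNSNames is a hypothetical helper: it reports whether any of the
// certificate's DNS names resolves to the connecting peer's IP address.
func peerMatchesDNSNames(peerIP string, dnsNames []string, lookup lookupFunc) bool {
	want := net.ParseIP(peerIP)
	if want == nil {
		return false
	}
	for _, name := range dnsNames {
		addrs, err := lookup(name)
		if err != nil {
			continue // for example "no such host"
		}
		for _, a := range addrs {
			if ip := net.ParseIP(a); ip != nil && ip.Equal(want) {
				return true
			}
		}
	}
	return false
}

func main() {
	// Fake resolver with one record, taken from the startup log in this thread.
	records := map[string][]string{
		"i-02bf013b1a48913bd.confucius-dev.ps.idps.a.intuit.com": {"10.5.35.233"},
	}
	lookup := func(host string) ([]string, error) {
		if addrs, ok := records[host]; ok {
			return addrs, nil
		}
		return nil, fmt.Errorf("lookup %s: no such host", host)
	}

	wildcardNames := []string{"*.confucius-dev.ps.idps.a.intuit.com", "confucius-dev.ps.idps.a.intuit.com"}
	exactNames := []string{"i-02bf013b1a48913bd.confucius-dev.ps.idps.a.intuit.com"}

	fmt.Println(peerMatchesDNSNames("10.5.35.233", wildcardNames, lookup))
	fmt.Println(peerMatchesDNSNames("10.5.35.233", exactNames, lookup))
}
```

Under this assumption, neither name in the wildcard certificate resolves to the peer's address, so the peer cannot be matched, while a certificate listing the member's exact host name can be.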

/cc @gyuho

hexfusion commented 6 years ago

The 3.2 changelog is a reference as well. It appears that if you were to use 3.1, this would only be a warning. I am not recommending this, but you could test it as a "workaround". My guess is that you might see the same result.

gyuho commented 6 years ago

@YeruchamB As @hexfusion mentioned, please use IP address for listen URLs.