etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

3.2 Rc1 - Unable to connect using latest etcdctl #8008

Status: Closed (davissp14 closed this issue 7 years ago)

davissp14 commented 7 years ago

etcdctl version: 3.2.0-rc.1+git

We are running a 3-member setup fronted by HAProxy. We use Let's Encrypt for SSL, which terminates at HAProxy. I built the binaries from master for testing and noticed I could no longer connect.

When specifying --debug, I get: grpc: transport: http2Client.notifyError got notified that the client transport was broken EOF.

I have a suspicion that it may be due to this PR. https://github.com/coreos/etcd/pull/7687/files

I would note that I can connect just fine using earlier versions of etcdctl.

Thoughts?

gyuho commented 7 years ago

> I can connect just fine using earlier versions of etcdctl

What earlier versions were you using?

davissp14 commented 7 years ago

etcdctl version: 3.1.5

gyuho commented 7 years ago

> I have a suspicion that it may be due to this PR.

Could you provide reproducible steps?

davissp14 commented 7 years ago

I was able to track down the commit that introduced the issue.

~/Programming/compose/etcd(master)$ git rev-parse head
2951e7f6e4473b0c9877100c09d4abf90649075d

~/Programming/compose/etcd(master)$ ETCDCTL_API=3 /Users/shaun/Programming/compose/etcd/bin/etcdctl --endpoints=https://portal850-24.scale-testing2.davis.composedb.com:19736,https://portal873-25.scale-testing2.davis.composedb.com:19736 --user=root:$PASSWORD member list -w table
Error:  context deadline exceeded

~/Programming/compose/etcd(master)$ git reset --hard head^
HEAD is now at cfbc5e5c Merge pull request #7706 from gyuho/wait-apply-conf-change
~/Programming/compose/etcd(master)$ ./build
~/Programming/compose/etcd(master)$ ETCDCTL_API=3 /Users/shaun/Programming/compose/etcd/bin/etcdctl --endpoints=https://portal850-24.scale-testing2.davis.composedb.com:19736,https://portal873-25.scale-testing2.davis.composedb.com:19736 --user=root:$PASSWORD member list -w table
+------------------+---------+--------------------------------------------+--------------------------+--------------------------+
|        ID        | STATUS  |                    NAME                    |        PEER ADDRS        |       CLIENT ADDRS       |
+------------------+---------+--------------------------------------------+--------------------------+--------------------------+
| 2dd9251e62f7ea38 | started | etcd528.aws-us-east-1-memory.4.dblayer.com | http://10.107.101.3:2380 | http://10.107.101.3:2379 |
| 3fa88055e7c33125 | started | etcd509.aws-us-east-1-memory.5.dblayer.com | http://10.107.101.4:2380 | http://10.107.101.4:2379 |
| 5d8d9cc3ee1d3b92 | started | etcd61.aws-us-east-1-memory.21.dblayer.com | http://10.107.101.2:2380 | http://10.107.101.2:2379 |
+------------------+---------+--------------------------------------------+--------------------------+--------------------------+

Merge that introduced the issue: https://github.com/coreos/etcd/commit/2951e7f6e4473b0c9877100c09d4abf90649075d

The important thing to note here is that we use Let's Encrypt for SSL, which is terminated at HAProxy. There is also no need to specify certs in this setup.

davissp14 commented 7 years ago

I'm not exactly sure what's causing the breakage, but running git revert --strategy resolve 2951e7f6e4473b0c9877100c09d4abf90649075d -m 1 on the latest master and rebuilding resolves my issue.
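The -m 1 in the revert above matters because the commit being reverted is a merge: git needs to know which parent counts as the mainline, and parent 1 is the branch that was merged into. A minimal sketch in a throwaway repository (the branch name feature and the file names are hypothetical, and the --strategy resolve option from the comment is left out for brevity):

```shell
#!/bin/sh
set -eu

# Throwaway repository with one merge commit (hypothetical layout).
repo=$(mktemp -d)
cd "$repo"
git init -q 2>/dev/null
git config user.email "demo@example.com"
git config user.name "demo"
git config commit.gpgsign false

echo base > base.txt
git add base.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

# Side branch that adds one file, merged back with an explicit merge commit.
git checkout -q -b feature
echo change > feature.txt
git add feature.txt
git commit -q -m "feature work"
git checkout -q "$main"
git merge -q --no-ff -m "Merge feature" feature
merge=$(git rev-parse HEAD)

# Revert the merge relative to its first parent (the mainline side).
# Without -m, git refuses to revert a merge commit.
git revert --no-edit -m 1 "$merge" >/dev/null

# The file introduced by the side branch should be gone again.
if [ -e feature.txt ]; then state=present; else state=reverted; fi
echo "feature.txt is $state"
```

The revert adds a new commit on top of the history rather than rewriting it, which is why rebuilding from the reverted tree was enough to test the theory.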

heyitsanthony commented 7 years ago

It's related to the way gRPC is setting up TLS. I can reproduce locally; tests and patch forthcoming.