juju-solutions / interface-etcd


Error: 501: All the given peers are not reachable (Tried to connect to each peer twice and failed) [0] #4

Open mbruzek opened 8 years ago

mbruzek commented 8 years ago

While working with the observable-kubernetes bundle, I encountered a problem with the etcd interface.

The etcd peers appear to be unreachable. Here is the relevant section of the juju log:

2016-06-01 15:42:18 INFO etcd-relation-changed ++ config-get iface
2016-06-01 15:42:18 INFO etcd-relation-changed + interface=eth0
2016-06-01 15:42:18 INFO etcd-relation-changed ++ config-get cidr
2016-06-01 15:42:18 INFO etcd-relation-changed + cidr=10.1.0.0/16
2016-06-01 15:42:18 INFO etcd-relation-changed + connection_string=http://172.31.11.210:4001,http://172.31.61.112:4001,http://172.31.18.164:4001
2016-06-01 15:42:18 INFO etcd-relation-changed + '[' '!' -f /var/run/docker-bootstrap.pid ']'
2016-06-01 15:42:18 INFO etcd-relation-changed + sleep 1
2016-06-01 15:42:19 INFO etcd-relation-changed + docker -H unix:///var/run/docker-bootstrap.sock run --net=host --rm gcr.io/google_containers/etcd:2.0.12 etcdctl -C http://172.31.11.210:4001,http://172.31.61.112:4001,http://172.31.18.164:4001 set /coreos.com/network/config '{ "Network": "10.1.0.0/16", "Backend": {"Type": "vxlan"}}'
2016-06-01 15:42:26 INFO etcd-relation-changed Error:  501: All the given peers are not reachable (Tried to connect to each peer twice and failed) [0]
2016-06-01 15:42:27 INFO etcd-relation-changed Traceback (most recent call last):
2016-06-01 15:42:27 INFO etcd-relation-changed   File "/var/lib/juju/agents/unit-kubernetes-0/charm/hooks/etcd-relation-changed", line 19, in <module>
2016-06-01 15:42:27 INFO etcd-relation-changed     main()
2016-06-01 15:42:27 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/__init__.py", line 73, in main
2016-06-01 15:42:27 INFO etcd-relation-changed     bus.dispatch()
2016-06-01 15:42:27 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/bus.py", line 421, in dispatch
2016-06-01 15:42:27 INFO etcd-relation-changed     _invoke(other_handlers)
2016-06-01 15:42:27 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/bus.py", line 404, in _invoke
2016-06-01 15:42:27 INFO etcd-relation-changed     handler.invoke()
2016-06-01 15:42:27 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/bus.py", line 280, in invoke
2016-06-01 15:42:27 INFO etcd-relation-changed     self._action(*args)
2016-06-01 15:42:27 INFO etcd-relation-changed   File "/var/lib/juju/agents/unit-kubernetes-0/charm/reactive/flannel.py", line 43, in run_bootstrap_daemons
2016-06-01 15:42:27 INFO etcd-relation-changed     check_call(split(cmd))
2016-06-01 15:42:27 INFO etcd-relation-changed   File "/usr/lib/python3.4/subprocess.py", line 561, in check_call
2016-06-01 15:42:27 INFO etcd-relation-changed     raise CalledProcessError(retcode, cmd)
2016-06-01 15:42:27 INFO etcd-relation-changed subprocess.CalledProcessError: Command '['scripts/bootstrap_docker.sh', 'http://172.31.11.210:4001,http://172.31.61.112:4001,http://172.31.18.164:4001']' returned non-zero exit status 4
2016-06-01 15:42:27 ERROR juju.worker.uniter.operation runhook.go:107 hook "etcd-relation-changed" failed: exit status 1
2016-06-01 15:47:27 INFO juju-log etcd:3: Reactive main running for hook etcd-relation-changed
2016-06-01 15:47:27 INFO juju-log etcd:3: Initializing Leadership Layer (is leader)
2016-06-01 15:47:27 INFO juju-log etcd:3: Initializing Apt Layer
2016-06-01 15:47:27 INFO juju-log etcd:3: Invoking reactive handler: hooks/relations/etcd/requires.py:22:changed
2016-06-01 15:47:27 INFO juju-log etcd:3: Invoking reactive handler: reactive/apt.py:49:ensure_package_status
2016-06-01 15:47:27 INFO juju-log etcd:3: Invoking reactive handler: reactive/k8s.py:27:i_am_leader
2016-06-01 15:47:28 INFO juju-log etcd:3: Invoking reactive handler: reactive/tls.py:61:check_ca_status
2016-06-01 15:47:28 INFO juju-log etcd:3: Invoking reactive handler: reactive/flannel.py:35:run_bootstrap_daemons
2016-06-01 15:47:28 INFO etcd-relation-changed ++ config-get iface
2016-06-01 15:47:28 INFO etcd-relation-changed + interface=eth0
2016-06-01 15:47:28 INFO etcd-relation-changed ++ config-get cidr
2016-06-01 15:47:28 INFO etcd-relation-changed + cidr=10.1.0.0/16
2016-06-01 15:47:28 INFO etcd-relation-changed + connection_string=http://172.31.18.164:4001,http://172.31.11.210:4001,http://172.31.61.112:4001
2016-06-01 15:47:28 INFO etcd-relation-changed + '[' '!' -f /var/run/docker-bootstrap.pid ']'
2016-06-01 15:47:28 INFO etcd-relation-changed + sleep 1
2016-06-01 15:47:29 INFO etcd-relation-changed + docker -H unix:///var/run/docker-bootstrap.sock run --net=host --rm gcr.io/google_containers/etcd:2.0.12 etcdctl -C http://172.31.18.164:4001,http://172.31.11.210:4001,http://172.31.61.112:4001 set /coreos.com/network/config '{ "Network": "10.1.0.0/16", "Backend": {"Type": "vxlan"}}'
2016-06-01 15:47:36 INFO etcd-relation-changed Error:  501: All the given peers are not reachable (Tried to connect to each peer twice and failed) [0]
2016-06-01 15:47:37 INFO etcd-relation-changed Traceback (most recent call last):
2016-06-01 15:47:37 INFO etcd-relation-changed   File "/var/lib/juju/agents/unit-kubernetes-0/charm/hooks/etcd-relation-changed", line 19, in <module>
2016-06-01 15:47:37 INFO etcd-relation-changed     main()
2016-06-01 15:47:37 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/__init__.py", line 73, in main
2016-06-01 15:47:37 INFO etcd-relation-changed     bus.dispatch()
2016-06-01 15:47:37 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/bus.py", line 421, in dispatch
2016-06-01 15:47:37 INFO etcd-relation-changed     _invoke(other_handlers)
2016-06-01 15:47:37 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/bus.py", line 404, in _invoke
2016-06-01 15:47:37 INFO etcd-relation-changed     handler.invoke()
2016-06-01 15:47:37 INFO etcd-relation-changed   File "/usr/local/lib/python3.4/dist-packages/charms/reactive/bus.py", line 280, in invoke
2016-06-01 15:47:37 INFO etcd-relation-changed     self._action(*args)
2016-06-01 15:47:37 INFO etcd-relation-changed   File "/var/lib/juju/agents/unit-kubernetes-0/charm/reactive/flannel.py", line 43, in run_bootstrap_daemons
2016-06-01 15:47:37 INFO etcd-relation-changed     check_call(split(cmd))
2016-06-01 15:47:37 INFO etcd-relation-changed   File "/usr/lib/python3.4/subprocess.py", line 561, in check_call
2016-06-01 15:47:37 INFO etcd-relation-changed     raise CalledProcessError(retcode, cmd)
2016-06-01 15:47:37 INFO etcd-relation-changed subprocess.CalledProcessError: Command '['scripts/bootstrap_docker.sh', 'http://172.31.18.164:4001,http://172.31.11.210:4001,http://172.31.61.112:4001']' returned non-zero exit status 4
2016-06-01 15:47:37 ERROR juju.worker.uniter.operation runhook.go:107 hook "etcd-relation-changed" failed: exit status 1
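
To narrow down which of the peers in the connection string is failing, each endpoint can be probed on its own with a plain TCP connect. The following is a minimal sketch, not part of the charm or of `bootstrap_docker.sh`; the function names `parse_endpoints` and `unreachable_endpoints` are hypothetical, and it assumes the connection string has the comma-separated `http://host:port` form shown in the log above.

```python
import socket
from urllib.parse import urlsplit


def parse_endpoints(connection_string):
    """Split a comma-separated etcd connection string into (host, port) pairs."""
    endpoints = []
    for url in connection_string.split(","):
        parts = urlsplit(url.strip())
        if parts.hostname is None or parts.port is None:
            raise ValueError("expected scheme://host:port, got %r" % url)
        endpoints.append((parts.hostname, parts.port))
    return endpoints


def unreachable_endpoints(connection_string, timeout=2.0):
    """Return the (host, port) pairs that refuse or time out on a TCP connect."""
    failed = []
    for host, port in parse_endpoints(connection_string):
        try:
            # Only checks that the port accepts connections; it does not
            # verify that etcd itself is healthy behind it.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failed.append((host, port))
    return failed
```

Running `unreachable_endpoints(...)` from the kubernetes unit with the connection string from the log would show whether the failure is at the network level or inside etcd itself.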

mbruzek commented 8 years ago

I went to the etcd leader unit and ran cluster-health:

$ etcdctl cluster-health
member 81a538a33a77dd97 is unreachable: no available published client urls
member 90b40b8e4dac23c4 is unhealthy: got unhealthy result from http://172.31.61.112:4001
cluster is unhealthy
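
For anyone scripting this check, the output above can be turned into a per-member status map. This is a rough sketch that assumes only the `member <id> is <status>: <detail>` line shape visible in the output above; `parse_cluster_health` is a hypothetical helper, not something provided by etcdctl or the charm.

```python
import re

# Matches lines such as:
#   member 81a538a33a77dd97 is unreachable: no available published client urls
MEMBER_RE = re.compile(r"^member (\S+) is (\w+)(?:: (.*))?$")


def parse_cluster_health(output):
    """Map each member ID to a (status, detail) tuple; other lines are ignored."""
    members = {}
    for line in output.splitlines():
        match = MEMBER_RE.match(line.strip())
        if match:
            members[match.group(1)] = (match.group(2), match.group(3) or "")
    return members
```

Feeding it the text captured from `etcdctl cluster-health` gives a dictionary that can be filtered for any member whose status is not `healthy`.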

I see no errors in the juju log, but /var/log/upstart/etcd.log on that unit does contain errors. The log is attached:

etcd.zip

lazypower commented 8 years ago

@mbruzek Which version of the etcd charm was this using? I assume cs:~containers/trusty/etcd-2?

mbruzek commented 8 years ago

@chuckbutler Yes, you are correct: cs:~containers/trusty/etcd-2 from the observable-kubernetes bundle.

lazypower commented 8 years ago

This seems related to https://github.com/juju-solutions/layer-etcd/issues/16