flannel-io / flannel

flannel is a network fabric for containers, designed for Kubernetes
Apache License 2.0

flannel keeps timing out on startup #377

Closed esecules closed 8 years ago

esecules commented 8 years ago

CoreOS 675.0.0, Flannel 0.4.0

I'm trying to get Kubernetes set up on CoreOS. The master and two workers have come up fully, but the third worker fails to start flanneld. This is also the first node I have set up as an etcd proxy.

flanneld.service holdoff time over, scheduling restart.
Nov 25 22:17:04 kube1-node3 systemd[1]: Starting Network fabric for containers...
Nov 25 22:18:34 kube1-node3 systemd[1]: flanneld.service start operation timed out. Terminating.
Nov 25 22:18:34 kube1-node3 systemd[1]: flanneld.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 25 22:18:34 kube1-node3 systemd[1]: Failed to start Network fabric for containers.
Nov 25 22:18:34 kube1-node3 systemd[1]: Unit flanneld.service entered failed state.
Nov 25 22:18:34 kube1-node3 systemd[1]: flanneld.service failed.

When I run the ExecStart command manually, I get this output:

$ /usr/libexec/sdnotify-proxy /run/flannel/sd.sock /usr/bin/docker run --net=host --privileged=true --rm --volume=/run/flannel:/run/flannel --env=NOTIFY_SOCKET=/run/flannel/sd.sock --env=AWS_ACCESS_KEY_ID= --env=AWS_SECRET_ACCESS_KEY= --env-file=/run/flannel/options.env --volume=/usr/share/ca-certificates:/etc/ssl/certs:ro --volume=/etc/ssl/etcd:/etc/ssl/etcd:ro quay.io/coreos/flannel:0.4.0 /opt/bin/flanneld --ip-masq=true
NOTIFY_SOCKET environment variable not set

early-docker also seems to be wedged somewhere, but I can't figure out why. When I run docker ps to check whether the flannel container is running or the image has been pulled, the command never returns. The early-docker logs, however, suggest that it starts the flannel container each time the service restarts.

-- Logs begin at Wed 2015-11-25 20:52:53 UTC. --
Nov 25 22:15:28 kube1-node3 systemd[1]: Started Early Docker Application Container Engine.
Nov 25 22:15:28 kube1-node3 systemd[1]: Starting Early Docker Application Container Engine...
Nov 25 22:15:28 kube1-node3 dockerd[789]: time="2015-11-25T22:15:28Z" level=info msg="+job serveapi(fd://)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:28Z" level=info msg="Listening for HTTP on fd ()"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="Loading containers: start."
Nov 25 22:15:29 kube1-node3 dockerd[789]: ......
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="Loading containers: done."
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="docker daemon: 1.6.1-rc2 17f157d-dirty; execdriver: native-0.2; graphdriver: overlay"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="+job acceptconnections()"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="-job acceptconnections() = OK (0)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="Daemon has completed initialization"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="POST /v1.18/containers/create"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="+job create()"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="+job log(create, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="-job log(create, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0) = OK (0)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="-job create() = OK (0)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="POST /v1.18/containers/a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb/attach?stderr=1&stdout=1&stream=1"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="+job container_inspect(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="-job container_inspect(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb) = OK (0)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="+job attach(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="POST /v1.18/containers/a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb/start"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="+job start(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="+job log(start, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="-job log(start, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0) = OK (0)"
Nov 25 22:15:29 kube1-node3 dockerd[789]: time="2015-11-25T22:15:29Z" level=info msg="-job start(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="POST /v1.18/containers/a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb/kill?signal=TERM"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="+job kill(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, TERM)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job kill(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, TERM) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="POST /v1.18/containers/a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb/kill?signal=TERM"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="+job kill(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, TERM)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job kill(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, TERM) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="+job log(die, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job log(die, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="POST /v1.18/containers/a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb/wait"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="+job wait(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job attach(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job wait(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="GET /v1.18/containers/a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb/json"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="+job container_inspect(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job container_inspect(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="DELETE /v1.18/containers/a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb?v=1"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="+job rm(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="+job log(destroy, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job log(destroy, a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb, quay.io/coreos/flannel:0.4.0) = OK (0)"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="Volume /run/flannel is being used and cannot be removed: used by containers [4bd6332de8bb7862fe3ae699b8f27efc66d828df912c3aceae003aec2b6b4899 cd4d8150f7ff59bb7eaa07d6fb470c47f2f7d0b9f47e85b821e573ae77fbb07b 687d3bda4a6ca8d88373bb5b3c6d26c14d23d4d6afea0b2d363f35fa0b3638c0]"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="Volume /usr/share/ca-certificates is being used and cannot be removed: used by containers [4bd6332de8bb7862fe3ae699b8f27efc66d828df912c3aceae003aec2b6b4899 cd4d8150f7ff59bb7eaa07d6fb470c47f2f7d0b9f47e85b821e573ae77fbb07b 687d3bda4a6ca8d88373bb5b3c6d26c14d23d4d6afea0b2d363f35fa0b3638c0]"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="Volume /etc/ssl/etcd is being used and cannot be removed: used by containers [4bd6332de8bb7862fe3ae699b8f27efc66d828df912c3aceae003aec2b6b4899 cd4d8150f7ff59bb7eaa07d6fb470c47f2f7d0b9f47e85b821e573ae77fbb07b 687d3bda4a6ca8d88373bb5b3c6d26c14d23d4d6afea0b2d363f35fa0b3638c0]"
Nov 25 22:16:59 kube1-node3 dockerd[789]: time="2015-11-25T22:16:59Z" level=info msg="-job rm(a3df26dfe79ea6cd7b19516a974e8f1c44335e9b0c8a602ffbda22e43e08d3cb) = OK (0)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="POST /v1.18/containers/create"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="+job create()"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="+job log(create, 3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5, quay.io/coreos/flannel:0.4.0)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="-job log(create, 3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5, quay.io/coreos/flannel:0.4.0) = OK (0)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="-job create() = OK (0)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="POST /v1.18/containers/3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5/attach?stderr=1&stdout=1&stream=1"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="+job container_inspect(3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="-job container_inspect(3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5) = OK (0)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="+job attach(3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="POST /v1.18/containers/3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5/start"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="+job start(3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="+job log(start, 3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5, quay.io/coreos/flannel:0.4.0)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="-job log(start, 3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5, quay.io/coreos/flannel:0.4.0) = OK (0)"
Nov 25 22:17:04 kube1-node3 dockerd[789]: time="2015-11-25T22:17:04Z" level=info msg="-job start(3101c9b1af4d9ec58893836fb8398bf1ada58a01b0b93b0e89423f4e8247a0c5) = OK (0)"
esecules commented 8 years ago

SOLVED: I had flannel's FLANNELD_ETCD_ENDPOINTS variable set to the wrong endpoints.
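For anyone hitting the same timeout, here is a rough sketch of how the endpoints could be checked before restarting flanneld. It assumes the variable lives in /run/flannel/options.env (the env file passed to docker run above) as a comma-separated list, and that each endpoint answers etcd's GET /version over plain HTTP. The function names are made up for illustration; with TLS-only etcd, curl would also need the matching certificate options.

```shell
#!/bin/sh
# Hypothetical helper: print each endpoint from FLANNELD_ETCD_ENDPOINTS
# in the given env file, one per line.
list_etcd_endpoints() {
  env_file="$1"
  # Keep only the value of the last matching assignment, then split on commas.
  sed -n 's/^FLANNELD_ETCD_ENDPOINTS=//p' "$env_file" | tail -n 1 | tr ',' '\n'
}

# Hypothetical helper: probe every listed endpoint and report which respond.
# Assumes plain HTTP and that the endpoint serves etcd's /version route.
check_etcd_endpoints() {
  list_etcd_endpoints "$1" | while read -r endpoint; do
    # Skip empty entries such as those left by a trailing comma.
    [ -n "$endpoint" ] || continue
    if curl --silent --fail --max-time 5 "$endpoint/version" >/dev/null; then
      echo "ok   $endpoint"
    else
      echo "FAIL $endpoint"
    fi
  done
}
```

Running `check_etcd_endpoints /run/flannel/options.env` on the affected node should print one line per endpoint. A FAIL line points at an address flanneld cannot reach, which would be consistent with the start operation timing out as in the logs above.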