dogopupper opened this issue 6 years ago
Is this only on one node or on several?
That path should be mapped (by the DaemonSet) from /var/lib/weave - could you run lsof on the host and see if something else has the file open?
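For example, a minimal check on the host (the filename and host path are assumptions based on the default DaemonSet mapping and the lsof output later in this thread):

```
sudo lsof /var/lib/weave/weave-netdata.db
# or, if the path differs on your nodes, search by name:
sudo lsof | grep weave-netdata.db
```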
several nodes:
weave-net-4cr6k 1/2 CrashLoopBackOff 822 3d
weave-net-98wv4 2/2 Running 45 3d
weave-net-9mj8d 1/2 Running 40 3d
weave-net-cdjzb 1/2 Running 18 3d
weave-net-f6tzq 2/2 Running 1 3d
weave-net-lkzn7 1/2 CrashLoopBackOff 823 3d
weave-net-rtnwm 2/2 Running 2 3d
weave-net-tdpfm 1/2 CrashLoopBackOff 57 3d
weave-net-v9fcx 2/2 Running 3 3d
weave-net-x9cjn 2/2 Running 4 3d
weave-net-xrjgb 1/2 CrashLoopBackOff 821 3d
only the ones with 2/2 work, about 40% of them
It appears to be bound to weave-kube (the weave container named weave-kube) as well:
Volumes:
weavedb:
Type: HostPath (bare host directory volume)
Path: /var/lib/weave-kube
lsof - node with healthy weave: multiple entries of
weaver 19727 19892 root mem-W REG 8,9 1058458 /weavedb/weave-netdata.db (stat: No such file or directory)
weaver 19727 19892 root 11uW REG 8,9 65536 1058458 /weavedb/weave-netdata.db
lsof - node with unhealthy weave: same as above but additional entries of:
weaveutil 23488 root 3u REG 8,9 65536 534136 /weavedb/weave-netdata.db
weaveutil 23488 976 root 3u REG 8,9 65536 534136 /weavedb/weave-netdata.db
weaveutil 23488 2558 root 3u REG 8,9 65536 534136 /weavedb/weave-netdata.db
weaveutil 23488 3413 root 3u REG 8,9 65536 534136 /weavedb/weave-netdata.db
weaveutil 23488 23489 root 3u REG 8,9 65536 534136 /weavedb/weave-netdata.db
what is weaveutil? seems like the culprit here
weaveutil is where we put various bits of logic we moved from shell script to Go. In every case it should run briefly and then exit. It does open weave-netdata.db to fetch an ID, but again that should be very brief.
Also I would not expect more than one of them at a time; could be an artefact of Kubernetes crash-loop restart?
Since it's written in Go, sending it a SIGQUIT should make it dump a stack trace to stderr then exit. That stderr should then be in the weave container logs. Please try this and see what you get. Probably the oldest weaveutil on a host is the most interesting one.
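A minimal sketch of those steps (pod name copied from the listing above; the kube-system namespace is an assumption):

```
# pgrep -o picks the oldest matching process; SIGQUIT makes the Go runtime
# dump all goroutine stacks to stderr before exiting
sudo kill -QUIT "$(pgrep -o weaveutil)"
# the dump should then show up in the weave container's logs:
kubectl logs -n kube-system weave-net-4cr6k -c weave
```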
kubectl logs -c weave and docker logs?
It would be in the log of the container where weaveutil was running. Maybe something else is going on - could you be running another instance of weaver (the daemon itself) on that machine?
there doesn't seem to be any weaver daemon running in the nodes...
You're grepping the binary - the control plane for the Weave Network. The binary contains all possible error messages.
there doesn't seem to be any weaver daemon running in the nodes...
You showed weaver in the output of lsof above. It would be visible with a ps -eaf or similar run on the host.
only 1 weaver running:
$ ps -eaf | grep weaver root 8532 1 6 Feb22 ? 05:39:56 /home/weave/weaver --port=6783 --datapath=datapath --name=46:d9:39:96:93:ef --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=172.31.0.0/17 --nickname=devtoolskubernetes-kubernetes-cr1-4-1519310384 --ipalloc-init consensus=12 --conn-limit=100 --expect-npc 10.118.10.92 10.118.10.93 10.118.10.94 10.118.10.88 10.118.10.108 10.118.10.109 10.118.10.110 10.118.10.111 10.118.10.104 10.118.10.105 10.118.10.107 10.118.10.106
non-working weaver:
$ sudo cat /proc/8532/stack
[
working weaver:
[
The stack trace of interest is in weaveutil - it's holding the file open when it ought to exit very quickly. weaver has printed an error message so we know what it thinks. (Although I would also expect weaver to exit after that error.)
Next, the trouble with a raw stacktrace from the process is that Go multiplexes different parts of the program ("goroutines") onto the same OS thread (LWP). I guess it's possible if you printed out stacktraces from all the LWPs in the process one of them might show something interesting.
there doesn't seem to be a weaveutil running now in either the working or nonworking nodes:
core@devtoolskubernetes-kubernetes-cr2-4-1519307879 ~ $ ps -eaf | grep weave root 5858 5841 0 Feb22 ? 00:00:00 /home/weave/runsvinit root 9443 9425 0 Feb22 ? 00:09:24 /usr/bin/weave-npc root 10954 10935 0 Feb22 ? 00:00:00 /bin/sh /home/weave/launch.sh root 11015 10954 4 Feb22 ? 04:44:55 /home/weave/weaver --port=6783 --datapath=datapath --name=a6:06:dd:ed:09:9c --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=172.31.0.0/17 --nickname=devtoolskubernetes-kubernetes-cr2-4-1519307879 --ipalloc-init consensus=10 --conn-limit=100 --expect-npc 10.118.10.92 10.118.10.93 10.118.10.94 10.118.10.88 10.118.10.89 10.118.10.90 10.118.10.91 10.118.10.104 10.118.10.105 10.118.10.106 core 13654 11847 0 13:25 pts/0 00:00:00 grep --colour=auto weave
i'm now getting the following entries with lsof:
core@devtoolskubernetes-kubernetes-cr1-3-1519310384 ~ $ sudo lsof | grep weavedb
lsof: no pwd entry for UID 100
I guess that's printed in place of the weaveutil entries?
Also, how to print stacktraces from all LWPs?
lsof: no pwd entry for UID 100
That's a new one on me.
how to print stacktraces from all LWPs?
LWPs are pretty much processes that share the same address space. You can list them with ps -eLf for instance, and then cat /proc/pid/stack as before.
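For instance (the PID is hypothetical - substitute the stuck weaveutil's PID from lsof):

```
pid=23488
ps -eLf | awk -v p="$pid" '$2 == p'   # one row per LWP of that process
for t in /proc/"$pid"/task/*; do      # each LWP has its own kernel stack
  echo "== LWP ${t##*/}"
  sudo cat "$t/stack"
done
```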
Could you try rebooting one of the non-working nodes? If it's a transient thing that ought to shake it loose, and if it fails again that would also be interesting.
rebooted the non-working nodes, now weave is running on all of them and the cluster is back to a healthy state.
If it shows again I'll paste the stacktraces here. Thanks for the help!
Happens again in a few nodes:
$ ps -elF | grep weave
4 S root 324 32700 0 80 0 - 110460 - 9468 6 12:49 ? 00:00:03 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
$ sudo cat /proc/324/stack
[<ffffffff890f6d27>] futex_wait_queue_me+0xc7/0x120
[<ffffffff890f7546>] futex_wait+0xf6/0x250
[<ffffffff890f978f>] do_futex+0x10f/0xb10
[<ffffffff890fa211>] SyS_futex+0x81/0x190
[<ffffffff89003949>] do_syscall_64+0x59/0x1c0
[<ffffffff89800115>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
Looks like a deadlock?
Most other weave-net processes have the same stacktrace, a few have this variation:
$ sudo cat /proc/2896/stack
[<ffffffff89262f13>] ep_poll+0x2f3/0x3b0
[<ffffffff892647f9>] SyS_epoll_wait+0xb9/0xd0
[<ffffffff89003949>] do_syscall_64+0x59/0x1c0
[<ffffffff89800115>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
the rest:
4 S root 2896 6225 0 80 0 - 64229 - 10312 6 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
0 S core 15347 15015 0 80 0 - 1684 - 908 7 13:22 pts/0 00:00:00 grep --colour=auto weave
4 S root 16580 6225 0 80 0 - 30028 - 8360 0 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16586 6225 0 80 0 - 29348 - 8360 2 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16591 6225 0 80 0 - 29412 - 8240 0 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16597 6225 0 80 0 - 45796 - 8244 5 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16601 6225 0 80 0 - 27299 - 8196 5 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16603 6225 0 80 0 - 46412 - 8228 1 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16608 6225 0 80 0 - 64845 - 10400 3 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16617 6225 0 80 0 - 29348 - 8300 4 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16621 6225 0 80 0 - 64229 - 8256 2 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16742 6225 0 80 0 - 64229 - 10340 0 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16749 6225 0 80 0 - 47781 - 10336 5 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16759 6225 0 80 0 - 64229 - 10284 2 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 16953 6225 0 80 0 - 47845 - 8196 7 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 17014 6225 0 80 0 - 64581 - 8244 7 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 17125 17103 0 80 0 - 208862 - 59820 1 Feb26 ? 00:01:47 /usr/bin/weave-npc
4 S root 19040 6225 0 80 0 - 64229 - 8296 0 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 19047 6225 0 80 0 - 27299 - 8236 4 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 19311 6225 0 80 0 - 64493 - 8240 0 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 19659 6225 0 80 0 - 47845 - 8192 7 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 19712 6225 0 80 0 - 48133 - 8228 7 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 D root 22097 1 0 80 0 - 239040 - 44460 7 Feb26 ? 00:00:18 /home/weave/weaver --port=6783 --datapath=datapath --name=9a:60:a6:d9:ce:a8 --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=172.31.0.0/17 --nickname=devtoolskubernetes-kubernetes-cr0-1-1519311725 --ipalloc-init consensus=11 --conn-limit=100 --expect-npc 10.118.10.112 10.118.10.113 10.118.10.114 10.118.10.108 10.118.10.109 10.118.10.110 10.118.10.111 10.118.10.104 10.118.10.105 10.118.10.107 10.118.10.106
4 S root 24234 6225 0 80 0 - 64229 - 10344 7 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 30085 6225 0 80 0 - 47845 - 8288 1 Feb26 ? 00:00:00 /opt/weave-net/bin/weave-net
4 S root 32700 32681 0 80 0 - 386 - 892 7 12:49 ? 00:00:00 /bin/sh /home/weave/launch.sh
/usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
That's useful - I had thought it was a different execution of weaveutil.
I'm not clear what that "the rest" list shows.
To make progress I really need to know how we get two processes on the same machine that are both trying to access weave-netdata.db at the same time. What are those processes and where did they come from? (Are they both in the same container? If not, who started the different containers?)
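For reference, a quick way to map a host PID back to the container it runs in (the PID is hypothetical, taken from the earlier lsof output):

```
cat /proc/23488/cgroup   # the docker container ID appears in the cgroup paths
docker ps --no-trunc     # match the long ID against this listing
```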
Looks like a deadlock?
No, that's just a Go program waiting for something to happen. Stack traces from the OS' perspective do not show what a Go program is actually doing, because it switches different "green threads" around onto the OS threads.
that's the rest of ps -elF output.
weave runs as a daemonset; a kubectl get pods -o wide | grep weave shows distinct nodes.
what else could be trying to access weavedb at initialization within the container?
from what I understand the process tree is weaver -> weave-npc -> weaveutil? weave-npc seems to operate normally in all pods.
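(One way to confirm the parentage rather than guessing, assuming pstree is available on the host:)

```
pstree -ap | grep -i -B2 -A2 weave   # shows actual parent/child PIDs with arguments
```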
weaveutil also spawns threads, not sure if this helps:
$ ps -eLf | grep weavedb
root 6196 6096 6196 0 11 13:17 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 6197 0 11 13:17 ? 00:00:01 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 6198 0 11 13:17 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 6199 0 11 13:17 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 6200 0 11 13:17 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 17913 0 11 13:42 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 17914 0 11 13:42 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 17915 0 11 13:42 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 17916 0 11 13:42 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 18918 0 11 13:44 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
root 6196 6096 18919 0 11 13:44 ? 00:00:00 /usr/bin/weaveutil set-db-flag /weavedb/weave-net peer-reclaim ok
Could you give us Go stack traces of the stuck weaveutil process? You can get the traces by:
1. lslocks to get a PID of a process holding a lock (flock(2)) for weave-netdata.db. Faster than with lsof.
2. strace -e write -s100000 -fp $PID 2> /tmp/weaveutil-strace.
3. kill -SIGQUIT $PID.
4. Inspect /tmp/weaveutil-strace.
Thanks for the steps, couldn't find how to produce that.
I found that there was a hacky init container in place that reset the weavedb before starting weave - it was a workaround for a bug that was recently fixed. I've now rebooted the nodes and removed the init container, so I won't have a stuck weaveutil until the bug is hit again - maybe that init container was the problem.
I'll update with the stacktraces if it's hit again.
Hit the bug again, got the strace this time ^^
let me know if it's sufficient.
core@devtoolskubernetes-kubernetes-cr0-1-1519311725 ~ $ sudo strace -e write -s100000 -fp 522 2> weaveutil-strace
-- ran kill SIGQUIT from another pty --
core@devtoolskubernetes-kubernetes-cr0-1-1519311725 ~ $ cat weaveutil-strace
Process 522 attached with 5 threads
[pid 522] --- SIGQUIT {si_signo=SIGQUIT, si_code=SI_USER, si_pid=4257, si_uid=0} ---
[pid 522] write(2, "SIGQUIT: quit", 13) = 13
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "PC=", 3) = 3
[pid 522] write(2, "0x457233", 8) = 8
[pid 522] write(2, " m=", 3) = 3
[pid 522] write(2, "0", 1) = 1
[pid 522] write(2, " sigcode=", 9) = 9
[pid 522] write(2, "0", 1) = 1
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "goroutine ", 10) = 10
[pid 522] write(2, "6", 1) = 1
[pid 522] write(2, " [", 2) = 2
[pid 522] write(2, "syscall", 7) = 7
[pid 522] write(2, "]:\n", 3) = 3
[pid 522] write(2, "runtime.notetsleepg", 19) = 19
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, "0xc74958", 8) = 8
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x2faeea8", 9) = 9
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x16", 4) = 4
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/runtime/lock_futex.go", 39) = 39
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "205", 3) = 3
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x42", 4) = 4
[pid 522] write(2, " fp=", 4) = 4
[pid 522] write(2, "0xc42002c760", 12) = 12
[pid 522] write(2, " sp=", 4) = 4
[pid 522] write(2, "0xc42002c730", 12) = 12
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "runtime.timerproc", 17) = 17
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/runtime/time.go", 33) = 33
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "209", 3) = 3
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x327", 5) = 5
[pid 522] write(2, " fp=", 4) = 4
[pid 522] write(2, "0xc42002c7e0", 12) = 12
[pid 522] write(2, " sp=", 4) = 4
[pid 522] write(2, "0xc42002c760", 12) = 12
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "runtime.goexit", 14) = 14
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/runtime/asm_amd64.s", 37) = 37
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "2197", 4) = 4
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x1", 3) = 3
[pid 522] write(2, " fp=", 4) = 4
[pid 522] write(2, "0xc42002c7e8", 12) = 12
[pid 522] write(2, " sp=", 4) = 4
[pid 522] write(2, "0xc42002c7e0", 12) = 12
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "created by ", 11) = 11
[pid 522] write(2, "runtime.addtimerLocked", 22) = 22
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/runtime/time.go", 33) = 33
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "116", 3) = 3
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0xed", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "goroutine ", 10) = 10
[pid 522] write(2, "1", 1) = 1
[pid 522] write(2, " [", 2) = 2
[pid 522] write(2, "sleep", 5) = 5
[pid 522] write(2, "]:\n", 3) = 3
[pid 522] write(2, "time.Sleep", 10) = 10
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, "0x2faf080", 9) = 9
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/runtime/time.go", 33) = 33
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "59", 2) = 2
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0xf9", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "github.com/weaveworks/weave/vendor/github.com/boltdb/bolt.flock", 63) = 63
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, "0xc4200f8a80", 12) = 12
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x1000001b0", 11) = 11
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x1b0", 5) = 5
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0xc42000e470", 12) = 12
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/go/src/github.com/weaveworks/weave/vendor/github.com/boltdb/bolt/bolt_unix.go", 78) = 78
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "38", 2) = 2
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0xdd", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "github.com/weaveworks/weave/vendor/github.com/boltdb/bolt.Open", 62) = 62
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, "0xc42028a3e0", 12) = 12
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x19", 4) = 4
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x1b0", 5) = 5
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0xc97b20", 8) = 8
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x7", 3) = 3
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0xc42028a3e0", 12) = 12
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x19", 4) = 4
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/go/src/github.com/weaveworks/weave/vendor/github.com/boltdb/bolt/db.go", 71) = 71
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "181", 3) = 3
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x173", 5) = 5
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "github.com/weaveworks/weave/db.NewBoltDB", 40) = 40
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, "0x7fff58a88aef", 14) = 14
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x12", 4) = 4
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x2", 3) = 3
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x2", 3) = 3
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0xc4201a9e80", 12) = 12
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/go/src/github.com/weaveworks/weave/db/boltdb.go", 48) = 48
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "38", 2) = 2
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x9e", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "main.setDBFlag", 14) = 14
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, "0xc420010110", 12) = 12
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x3", 3) = 3
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x3", 3) = 3
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, ", ", 2) = 2
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/go/src/github.com/weaveworks/weave/prog/weaveutil/db_flag.go", 61) = 61
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "44", 2) = 2
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0xb7", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "main.main", 9) = 9
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/go/src/github.com/weaveworks/weave/prog/weaveutil/main.go", 58) = 58
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "86", 2) = 2
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x27b", 5) = 5
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "goroutine ", 10) = 10
[pid 522] write(2, "17", 2) = 2
[pid 522] write(2, " [", 2) = 2
[pid 522] write(2, "syscall", 7) = 7
[pid 522] write(2, ", locked to thread", 18) = 18
[pid 522] write(2, "]:\n", 3) = 3
[pid 522] write(2, "runtime.goexit", 14) = 14
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/runtime/asm_amd64.s", 37) = 37
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "2197", 4) = 4
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x1", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "goroutine ", 10) = 10
[pid 522] write(2, "5", 1) = 1
[pid 522] write(2, " [", 2) = 2
[pid 522] write(2, "syscall", 7) = 7
[pid 522] write(2, "]:\n", 3) = 3
[pid 522] write(2, "os/signal.signal_recv", 21) = 21
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/runtime/sigqueue.go", 37) = 37
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "116", 3) = 3
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x104", 5) = 5
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "os/signal.loop", 14) = 14
[pid 522] write(2, "(", 1) = 1
[pid 522] write(2, ")\n", 2) = 2
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/os/signal/signal_unix.go", 42) = 42
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "22", 2) = 2
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x22", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "created by ", 11) = 11
[pid 522] write(2, "os/signal.init.1", 16) = 16
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "\t", 1) = 1
[pid 522] write(2, "/usr/local/go/src/os/signal/signal_unix.go", 42) = 42
[pid 522] write(2, ":", 1) = 1
[pid 522] write(2, "28", 2) = 2
[pid 522] write(2, " +", 2) = 2
[pid 522] write(2, "0x41", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rax ", 7) = 7
[pid 522] write(2, "0xfffffffffffffffc", 18) = 18
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rbx ", 7) = 7
[pid 522] write(2, "0x2faeea8", 9) = 9
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rcx ", 7) = 7
[pid 522] write(2, "0x457233", 8) = 8
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rdx ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rdi ", 7) = 7
[pid 522] write(2, "0xc74958", 8) = 8
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rsi ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rbp ", 7) = 7
[pid 522] write(2, "0xc42002c6e8", 12) = 12
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rsp ", 7) = 7
[pid 522] write(2, "0xc42002c6a0", 12) = 12
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r8 ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r9 ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r10 ", 7) = 7
[pid 522] write(2, "0xc42002c6d8", 12) = 12
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r11 ", 7) = 7
[pid 522] write(2, "0x202", 5) = 5
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r12 ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r13 ", 7) = 7
[pid 522] write(2, "0x8", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r14 ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "r15 ", 7) = 7
[pid 522] write(2, "0xf3", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rip ", 7) = 7
[pid 522] write(2, "0x457233", 8) = 8
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "rflags ", 7) = 7
[pid 522] write(2, "0x202", 5) = 5
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "cs ", 7) = 7
[pid 522] write(2, "0x33", 4) = 4
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "fs ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 522] write(2, "gs ", 7) = 7
[pid 522] write(2, "0x0", 3) = 3
[pid 522] write(2, "\n", 1) = 1
[pid 524] <... futex resumed> ) = ? <unavailable>
[pid 526] <... futex resumed> ) = ? <unavailable>
[pid 524] +++ exited with 2 +++
[pid 526] +++ exited with 2 +++
[pid 523] +++ exited with 2 +++
[pid 525] +++ exited with 2 +++
+++ exited with 2 +++
lslocks didn't have any reference to weaveutil or weavedb:
core@devtoolskubernetes-kubernetes-cr0-1-1519311725 ~ $ lslocks
COMMAND PID TYPE SIZE MODE M START END PATH
weaver 14741 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/b8205c3e-6170-4ba5-b472-867bf20f13e4/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/hosts...
etcd 6053 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/b8205c3e-6170-4ba5-b472-867bf20f13e4/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/hosts...
(unknown) -1 OFDLCK READ 0 0 0
kubelet 19446 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/b8205c3e-6170-4ba5-b472-867bf20f13e4/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/hosts...
(unknown) -1 OFDLCK READ 0 0 0 /var/lib/rkt/pods/run/b8205c3e-6170-4ba5-b472-867bf20f13e4/stage1/rootfs/opt/stage2/hyperkube/rootfs/dev...
locksmithd 4924 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/b8205c3e-6170-4ba5-b472-867bf20f13e4/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/resolv.conf...
kubelet 19446 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/b8205c3e-6170-4ba5-b472-867bf20f13e4/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/resolv.conf...
dockerd 21960 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/b8205c3e-6170-4ba5-b472-867bf20f13e4/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/hosts...
Thanks for the strace. I see that the weaveutil process is sleeping after an unsuccessful attempt to acquire the lock: https://github.com/boltdb/bolt/blob/c6ba97b89e0454fec9aa92e1d33a4e2c5fc1f631/bolt_unix.go#L38
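Roughly, that code retries a non-blocking flock(2) every few milliseconds until bolt's open timeout expires, which then surfaces as the "timeout" error in this issue's title. A rough shell analogue (path as seen inside the container):

```
until sudo flock --nonblock /weavedb/weave-netdata.db -c true; do
  sleep 0.05   # weaveutil sits in a loop like this while something else holds the lock
done
```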
From the lslocks output it looks like you run everything in rkt containers. Is that correct?
Yes, everything is in rkt, the k8s control plane itself also - it's self-hosted / masterless.
k8s pods run with docker.
Is there any chance docker/containerd might be the culprit here?
My suspicion is that flock(2) is not working on your setup, as neither weaver nor weaveutil is able to grab the lock.
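A quick sanity check of flock(2) on the affected filesystem (the scratch filename is made up; run the two flock commands in separate shells):

```
sudo touch /var/lib/weave/flock-test
# shell 1: take and hold an exclusive lock for a minute
sudo flock /var/lib/weave/flock-test -c 'sleep 60'
# shell 2: a second exclusive lock must fail immediately with --nonblock
sudo flock --nonblock /var/lib/weave/flock-test -c true \
  && echo 'BUG: both locks succeeded' || echo 'ok: flock is enforced'
```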
Yes, everything is in rkt
Does that mean Docker is running on top of rkt? Also, is weaver running inside a Docker container or a rkt container?
hmm, kubelet runs within rkt, and all pods in docker - weaver is running in docker. hyperkube images also run in docker managed by k8s, and those run the rest of the k8s control plane - scheduler, etc.
on the coreOS VM:
$ rkt list
UUID APP IMAGE NAME STATE CREATED STARTED NETWORKS
49125d7a hyperkube quay.io/coreos/hyperkube:v1.8.7_coreos.0 running 4 days ago 4 days ago
5d584892 etcd quay.docker.tech.lastmile.com/coreos/etcd:v2.3.8 running 4 days ago 4 days ago
-- etcd here is a proxy that speaks to the actual etcd nodes, these ones are separate --
$ docker ps | grep weave-net
d16e9dab2e03 hub.docker.tech.lastmile.com/weaveworks/weave-kube@sha256:2e37d2ea4fea7fa624bf46c42facf4afe4303a27ab1a1a84838e08a66898e610
$ docker exec -it d16e9dab2e03 "/bin/sh"
/home/weave # ls
kube-peers launch.sh restart.sentinel weave weaver
/home/weave # lslocks
COMMAND PID TYPE SIZE MODE M START END PATH
kubelet 6564 FLOCK 0B WRITE 0 0 0 /var/lib/rkt/pods/run/49125d7a-4595-4598-aa3f-e4fe4224427b/stage1/rootfs/opt/stage2/hyperkube/rootfs/run/lock/kubelet.lock
nrsysmond 16589 POSIX 0B WRITE 0 0 0 /sys/fs/cgroup/cpuset
(unknown) -1 OFDLCK 0B READ 0 0 0
kubelet 6564 FLOCK 4K WRITE 0 0 0 /var/lib/rkt/pods/run/49125d7a-4595-4598-aa3f-e4fe4224427b
dockerd 911 FLOCK 32K WRITE 0 0 0 /var/lib/docker/volumes/metadata.db
weaver 11718 FLOCK 64K WRITE 0 0 0 /weavedb/weave-netdata.db
runsv 8774 FLOCK 0B WRITE 0 0 0 /sys/fs/cgroup/cpuset
locksmithd 5778 FLOCK 45B WRITE 0 0 0 /run/update-engine/coordinator.conf
etcd 6381 FLOCK 4K WRITE 0 0 0 /var/lib/rkt/pods/run/5d584892-6b2d-4ba1-a497-453e8083b32c
(unknown) -1 OFDLCK 0B READ 0 0 0
(unknown) -1 OFDLCK 0B READ 0 0 0
runsv 8773 FLOCK 0B WRITE 0 0 0 /sys/fs/cgroup/cpuset
that's the lslocks output from within the weave docker container - it has locks pointing into rkt kubelet land, like
/var/lib/rkt/pods/run/49125d7a-4595-4598-aa3f-e4fe4224427b/stage1/rootfs/opt/stage2/hyperkube/rootfs/run/lock/kubelet.lock
Perhaps something's going wrong between all these layers - it doesn't look like the simplest setup overall. I didn't create this, so I don't know much about the whys and hows.
We are running into the same issue as well after updating our cluster a few days back. In our case, only one node is the problem; all other nodes work just fine.
Our setup is:
Kubernetes: 1.8.5 on AWS (deployed using kops 1.8.1)
Docker version: 1.13.1
Ubuntu: 16.04.3 LTS
Weave: 2.2.0
output from strace command:
strace -e write -s100000 -fp 8393
strace: Process 8393 attached with 17 threads
[pid 8608] write(2, "WARN: 2018/03/02 04:30:33.558934 [allocator]: Delete: no addresses for e3cc08d8afd4c957e7c26ade74f8204955633df4de256360f8555e6004d66936\n", 136
@rajatjindal please open a new issue and post the full logs.
@dogopupper - could you run ps -eLf > /tmp/foo && lslocks >> /tmp/foo && lsof >> /tmp/foo and upload /tmp/foo?
docker uses overlay (1), rkt is vanilla/unconfigured as-is with coreos; the coreos site says it defaults to 'overlayfs'.
overlay2 should be better I suppose?
i'll post the outputs when they're available.
Issue came back, here's the output from the command you requested on one unhealthy node. There are 6/11 crashing at the mo. output.txt
Thanks! Could you run lsof as root?
I'm going to leave this to @dogopupper, he's more up to speed on this than I am.
Got all outputs this time.
from lslocks:
COMMAND PID TYPE SIZE MODE M START END PATH
kubelet 3741 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/resolv.conf...
dockerd 14880 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/docker/overlay...
(unknown) -1 OFDLCK READ 0 0 0
kubelet 3741 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/docker/overlay...
(unknown) -1 OFDLCK READ 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/dev...
locksmithd 5528 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/resolv.conf...
etcd 6160 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/docker/overlay...
flock 15886 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/etc/resolv.conf...
weaver 28520 FLOCK WRITE 0 0 0 /var/lib/rkt/pods/run/9ddf2c83-0331-4cb3-a7da-a37385903322/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/docker/overlay...
(unknown) -1 OFDLCK READ 0 0 0
the locks on docker/overlay and etc/resolv.conf within the rkt hyperkube (kubelet) container are interesting, and they appear on all nodes where weave is malfunctioning and getting into a crashloop (error syncing pod, probes failing, etc.) - the logs of the weave container show the same symptom as the title.
that whole rkt-land path and the docker overlay within rkt look interesting - the rkt container itself does not have a docker cli, however by manually grepping the contents of /var/lib/docker/containers from within the rkt kubelet container, the contents were indeed the filesystems of the containers that run in kubernetes pods. as in, weaver lives in /var/lib/rkt/pods/run/*hyperkube-kubelet-container*/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/docker/overlay/*podnumber*/home/weave/weaver. the docker service itself runs on the node.
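(As an aside, /proc/<pid>/root is usually a shortcut for this kind of spelunking - it resolves to the process's own root filesystem, overlay layers and all. Using the weaver PID from the earlier ps output:)

```
sudo readlink /proc/11015/exe         # the binary path as the process sees it
sudo ls /proc/11015/root/home/weave/  # the container filesystem, no overlay path-chasing
```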
observing the behaviour today, the weaveutil lock comes and goes but the crashloop remains - it doesn't seem so stuck on weaveutil. there was previously massive i/o wait on the etcd nodes, which might have contributed in general, I think. I patched etcd by increasing heartbeat intervals and upgrading to the latest 3.3.1; however, there is indeed slow I/O on this cluster.
here's the full output I got a few hours ago (lsof from root, weaveutil present) outputsfull.txt
In our case it turned out prometheus was consuming all the resources available on that node (potentially because we configured a remote write on prometheus which was down, and thus prometheus was queuing up).
Once I disabled remote write, everything has been working.
Edited:
found this issue in Prometheus issue list: https://github.com/prometheus/prometheus/issues/3809
@rajatjindal that's a great insight - we sometimes see Prometheus chewing up all memory on other clusters, but we don't really understand the mechanism since it's supposed to drop data when the queue gets over a certain size.
Did you log an issue with Prometheus?
resource utilization is very low in this specific cluster - about 2-20% cpu and 5-30% memory. there's an occasional 2-15% i/o wait on the k8s nodes every 30 secs or so.
I'll post some systemd units that might be related, from the node configuration:
**- name: kubelet.service**
enabled: true
contents: |
[Service]
EnvironmentFile=-/run/metadata/coreos
EnvironmentFile=/etc/docker-environment
Environment="RKT_RUN_ARGS= \
--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=etc-cni,kind=host,source=/etc/cni,readOnly=true \
--mount volume=etc-cni,target=/etc/cni \
--volume=opt-cni,kind=host,source=/opt/cni,readOnly=true \
--mount volume=opt-cni,target=/opt/weave-net \
--volume=resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
--mount volume=resolv,target=/etc/resolv.conf \
--volume=var-log,kind=host,source=/var/log,readOnly=false \
--mount volume=var-log,target=/var/log \
"
Environment=IPTABLES_LOCK_FILE=/run/xtables.lock
Environment=KUBELET_ACI=quay.io/coreos/hyperkube
Environment=KUBELET_VERSION={{kubernetes_version}}
# iptables lock file must be created prior to containers trying to mount it
ExecStartPre=/usr/bin/touch ${IPTABLES_LOCK_FILE}
ExecStartPre=/usr/bin/chmod 0600 ${IPTABLES_LOCK_FILE}
ExecStartPre=/usr/bin/chown root:root ${IPTABLES_LOCK_FILE}
ExecStartPre=/bin/mkdir -p /etc/cni
ExecStartPre=/bin/mkdir -p /opt/cni
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /srv/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
**ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid**
# Download a copy of the aci from our local mirror, due to really slow http_proxy
{%- if cloud_provider == 'openstack' %}
ExecStartPre=/bin/rkt fetch --insecure-options=image P_S3_ENDPOINT_URL/kubernetes-bootstrap/acis/hyperkube/{{kubernetes_version}}.aci
{% endif %}
ExecStart=/bin/bash -c '/usr/lib/coreos/kubelet-wrapper \
--kubeconfig=/etc/kubernetes/kubeconfig \
--require-kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--exit-on-lock-contention \
--pod-manifest-path=/etc/kubernetes/manifests \
--allow-privileged \
{%- if cloud_provider == 'openstack' %}
--cloud-provider=openstack \
{%- else %}
--cloud-provider=gce \
{%- endif %}
--cloud-config=/etc/kubernetes/cloud_config \
--node-labels=master=true \
--minimum-container-ttl-duration=6m0s \
--cluster_dns=172.31.128.2 \
--cluster_domain=cluster.local \
--network-plugin=cni \
--pod-infra-container-image=${GCR_DOCKER_MIRROR}/google_containers/pause-amd64:3.0 \
--enforce-node-allocatable=pods \
**--kube-reserved=cpu=250m,memory=1Gi,ephemeral-storage=20Gi \
--system-reserved=cpu=100m,memory=600Mi,ephemeral-storage=2Gi \
--eviction-hard=memory.available\\<500Mi,nodefs.available\\<10% \
'
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid**
Restart=always
RestartSec=10
Slice=podruntime.slice
**- name: podruntime.slice**
  contents: |
    [Unit]
    Description=Pod Runtime Slice
    Documentation=https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md#recommended-cgroups-setup
**- name: containerd.service**
  dropins:
    - name: 50-containerd-opts.conf
      contents: |
        [Service]
        Slice=podruntime.slice
**- name: docker.service**
  dropins:
    - name: 50-docker-opts.conf
      contents: |
        [Unit]
        After=update-ca-certificates.service
        Requires=update-ca-certificates.service
        [Service]
        {%- if cloud_provider == 'openstack' %}
        Environment=http_proxy=http://proxy.ocado.com:8080
        Environment=https_proxy=http://proxy.ocado.com:8080
        Environment=no_proxy=.ocado.com,.lastmile.com
        {%- endif %}
        # disabling selinux for now due to moby/moby#32892
        Environment="DOCKER_OPTS=--log-opt max-size={{log_size}} --log-opt max-file=2 --bip=172.31.255.1/24 --storage-driver=overlay2 --selinux-enabled=false"
        Slice=podruntime.slice
- filesystem: "root"
**path: /var/lib/docker-healthcheck**
mode: 0755
contents:
inline: |
#!/bin/bash
# Taken from https://github.com/kubernetes/kops/blob/2b723b2a649402b00f45565158011384674449f4/vendor/k8s.io/kubernetes/cluster/saltbase/salt/docker/docker-healthcheck
# This script is intended to be run periodically, to check the health
# of docker. If it detects a failure, it will restart docker using systemctl.
set -x
return_code=1
if timeout 10 docker ps > /dev/null; then
echo "docker healthcheck script started - docker ok."
return_code=0
else
echo "docker failed"
echo "Giving docker 30 seconds grace before restarting"
sleep 30
if timeout 10 docker ps > /dev/null; then
echo "docker recovered"
return_code=0
else
echo "docker still down; triggering docker restart"
timeout 60 systemctl stop docker
timeout 20 systemctl stop -f docker
timeout 60 systemctl stop containerd
timeout 20 systemctl stop -f containerd
timeout 60 systemctl start containerd
timeout 60 systemctl start docker
echo "Waiting 60 seconds to give docker time to start"
sleep 60
if timeout 10 docker ps > /dev/null; then
echo "docker recovered"
return_code=0
else
echo "docker still failing"
fi
fi
fi
echo "docker healthcheck complete (return code: $return_code)"
exit $return_code
@dogopupper I think another issue here might be that if at some point the system was under load and impacted weave, weave might have failed to recover from that state?
devtoolskubernetes-kubernetes-cr0-1-1519311725.txt
There's some interesting features here (excerpts from the above log)
A weaver process that's been reparented to init:
root 16424 1 16424 0 26 Mar07 ? 00:00:00 /home/weave/weaver --port=6783 --datapath=datapath --name=9a:60:a6:d9:ce:a8 --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=172.31.0.0/17 --nickname=devtoolskubernetes-kubernetes-cr0-1-1519311725 --ipalloc-init consensus=11 --conn-limit=100 --expect-npc 10.118.10.112 10.118.10.113 10.118.10.114 10.118.10.108 10.118.10.109 10.118.10.110 10.118.10.111 10.118.10.104 10.118.10.105 10.118.10.107 10.118.10.106
A lock, held by the above weaver:
weaver 16424 FLOCK 64K WRITE 0 0 0 /weavedb/weave-netdata.db
I wonder how that weaver has managed to survive the termination of its associated docker container.
I tried obtaining the current strace of the weaver process (using the above SIGQUIT); however, strace hung (it wouldn't respond to control+c).
The weaver process actually refuses to die:
devtoolskubernetes-kubernetes-cr0-1-1519311725 core # ps -ef | grep -i weaver
root 16270 7109 0 17:58 pts/0 00:00:00 grep --colour=auto -i weaver
root 16424 1 2 Mar07 ? 00:35:22 /home/weave/weaver --port=6783 --datapath=datapath --name=9a:60:a6:d9:ce:a8 --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=172.31.0.0/17 --nickname=devtoolskubernetes-kubernetes-cr0-1-1519311725 --ipalloc-init consensus=11 --conn-limit=100 --expect-npc 10.118.10.112 10.118.10.113 10.118.10.114 10.118.10.108 10.118.10.109 10.118.10.110 10.118.10.111 10.118.10.104 10.118.10.105 10.118.10.107 10.118.10.106
devtoolskubernetes-kubernetes-cr0-1-1519311725 core # kill -9 16424
devtoolskubernetes-kubernetes-cr0-1-1519311725 core # ps -ef | grep -i weaver
root 16296 7109 0 17:58 pts/0 00:00:00 grep --colour=auto -i weaver
root 16424 1 2 Mar07 ? 00:35:22 /home/weave/weaver --port=6783 --datapath=datapath --name=9a:60:a6:d9:ce:a8 --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=172.31.0.0/17 --nickname=devtoolskubernetes-kubernetes-cr0-1-1519311725 --ipalloc-init consensus=11 --conn-limit=100 --expect-npc 10.118.10.112 10.118.10.113 10.118.10.114 10.118.10.108 10.118.10.109 10.118.10.110 10.118.10.111 10.118.10.104 10.118.10.105 10.118.10.107 10.118.10.106
Indeed, it's in uninterruptible sleep:
16424 ? Dl 35:22 /home/weave/weaver --port=6783 --datapath=datapath --name=9a:60:a6:d9:ce:a8 --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=172.31.0.0/17 --nickname=devtoolskubernetes-kubernetes-cr0-1-1519311725 --ipalloc-init consensus=11 --conn-limit=100 --expect-npc 10.118.10.112 10.118.10.113 10.118.10.114 10.118.10.108 10.118.10.109 10.118.10.110 10.118.10.111 10.118.10.104 10.118.10.105 10.118.10.107 10.118.10.106
Kernel stack:
devtoolskubernetes-kubernetes-cr0-1-1519311725 fd # cat /proc/16424/stack
[<ffffffff920e1923>] __refrigerator+0x73/0x160
[<ffffffff92084c76>] get_signal+0x5c6/0x5d0
[<ffffffff9202aa36>] do_signal+0x36/0x610
[<ffffffff92003011>] exit_to_usermode_loop+0x71/0xb0
[<ffffffff920039d9>] do_syscall_64+0xe9/0x1c0
[<ffffffff92800115>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
cgroup frozen status:
devtoolskubernetes-kubernetes-cr0-1-1519311725 16424 # cat /sys/fs/cgroup/freezer/kubepods/burstable/podfe44ab7d-1bcb-11e8-8c21-fa163e8e4409/35f7326fc1bee03c4f669569eb91d9241d7659060be582e5740bdaab9911544d/freezer.state
FROZEN
devtoolskubernetes-kubernetes-cr0-1-1519311725 16424 # echo THAWED > /sys/fs/cgroup/freezer/kubepods/burstable/podfe44ab7d-1bcb-11e8-8c21-fa163e8e4409/35f7326fc1bee03c4f669569eb91d9241d7659060be582e5740bdaab9911544d/freezer.state
# And it finally dies!
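For anyone hitting the same thing, the sequence above condenses to the following (cgroup v1 layout, as on these CoreOS nodes; PID taken from the ps output above):

```
pid=16424
cg=$(awk -F: '/freezer/{print $3}' /proc/$pid/cgroup)   # the process's freezer cgroup
cat /sys/fs/cgroup/freezer$cg/freezer.state             # prints FROZEN
echo THAWED | sudo tee /sys/fs/cgroup/freezer$cg/freezer.state
# pending signals (including the earlier kill -9) are delivered once it thaws
```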
@rajatjindal that's unlikely - there are (aggressive) resource limits everywhere and the cluster doesn't run anything resource-intensive; e.g. the most resource-hungry thing running is Weave Scope.
@mikebryant Great findings! Any idea how the weaver process got frozen (maybe there are some entries in journald or dmesg logs)?
Is this still an issue?
We (@mikebryant and I) have experienced this again today, currently just restarting nodes to get past it. All nodes were in one openstack az, so could be some disk slowness issue triggering it?
@stuart-warren Do you have any logs from the event?
@brb I wasn't able to capture anything new that isn't already mentioned here
I found this thread when searching for "[boltDB] Unable to open /weavedb/weave-netdata.db: timeout", but in my case @rajatjindal seemed to be on the right track.
After load testing our environment (kops-based aws, 12 nodes, not much else of note), we'll often have 1 or 2 nodes in this state where weave is in a crash loop with the above error. We just kill the node when it happens, but it'd be nice to have something more automated.
What you expected to happen?
Weave to be working 🌮
What happened?
The cluster's docker version was downgraded from 17.06 to 1.12.6 a few days ago but seemed to be working normally; it seems weave then broke over the weekend.
logs of the weave container consist solely of:
[boltDB] Unable to open /weavedb/weave-netdata.db: timeout
How to reproduce it?
no idea
Anything else we need to know?
k8s 1.8.7 on openstack
docker 1.12.6
weave 2.2.0
coreOS stable 1632.2.1
Versions:
see above ^^
Logs:
weave logs:
[boltDB] Unable to open /weavedb/weave-netdata.db: timeout
kubelet logs:
bash[4519]: weave-cni: unable to release IP address: Delete http://127.0.0.1:6784/ip/499c90a7945d984024acf1fbc87fde11d9b0ee496c4ffe7d98c7509a7af2c001: dial tcp 127.0.0.1:6784: i/o timeout
Network:
Taken from a k8s node:
$ ip route
$ ip -4 -o addr
$ sudo iptables-save