Closed balopat closed 5 years ago
The sample project that I'm deploying is https://github.com/balopat/contribwall and it has two services: 1.) a Go-based one that happily runs in runsc, and 2.) a reactjs frontend that seems to have issues when run in runsc. For example, even though the app comes up and reports that it's listening on port 3000, it can't accept connections for some reason; even from within the container, curl localhost:3000 doesn't work. Maybe there is an unhandled syscall in the reactjs stack somewhere?
There are a few problems with inotify that this code is hitting:
goroutine 106 [semacquire, 3 minutes]:
sync.runtime_SemacquireMutex(0xc0001b000c, 0xc0007da900)
third_party/go/gc/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc0001b0008)
third_party/go/gc/src/sync/mutex.go:134 +0xff
google3/third_party/golang/gvisor/pkg/sentry/fs/fs.(*Inotify).Readiness(0xc0001b0000, 0x400019, 0xc000a90000)
third_party/golang/gvisor/pkg/sentry/fs/inotify.go:106 +0x41
google3/third_party/golang/gvisor/pkg/sentry/fs/fs.(*File).Readiness(0xc0005d8f30, 0x19, 0x0)
third_party/golang/gvisor/pkg/sentry/fs/file.go:180 +0x3e
google3/third_party/golang/gvisor/pkg/sentry/kernel/epoll/epoll.(*EventPoll).ReadEvents(0xc00076adc0, 0x400, 0xc0006cb050, 0x0, 0x7fc3bafe0780)
third_party/golang/gvisor/pkg/sentry/kernel/epoll/epoll.go:241 +0x14a
google3/third_party/golang/gvisor/pkg/sentry/syscalls/syscalls.WaitEpoll(0xc00070aa80, 0x3, 0x400, 0x2f, 0x0, 0x0, 0x0, 0x0, 0x0)
third_party/golang/gvisor/pkg/sentry/syscalls/epoll.go:139 +0xce
google3/third_party/golang/gvisor/pkg/sentry/syscalls/linux/linux.EpollWait(0xc00070aa80, 0x3, 0x7fc3bafdd750, 0x400, 0x2f, 0x0, 0x7fc3bafe0780, 0x0, 0x7fc3bafe0780, 0xc000000003, ...)
third_party/golang/gvisor/pkg/sentry/syscalls/linux/sys_epoll.go:140 +0x68
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).executeSyscall(0xc00070aa80, 0xe8, 0x3, 0x7fc3bafdd750, 0x400, 0x2f, 0x0, 0x7fc3bafe0780, 0x0, 0xc3ae00, ...)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:165 +0x30a
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).doSyscallInvoke(0xc00070aa80, 0xe8, 0x3, 0x7fc3bafdd750, 0x400, 0x2f, 0x0, 0x7fc3bafe0780, 0x0, 0x7fc3bafe0780)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:283 +0x69
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).doSyscallEnter(0xc00070aa80, 0xe8, 0x3, 0x7fc3bafdd750, 0x400, 0x2f, 0x0, 0x7fc3bafe0780, 0xd1ee60, 0xc00060fe58)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:244 +0x99
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).doSyscall(0xc00070aa80, 0x2, 0xc00057c000)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:219 +0x142
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*runApp).execute(0x0, 0xc00070aa80, 0xd1ee60, 0x0)
third_party/golang/gvisor/pkg/sentry/kernel/task_run.go:215 +0xfda
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).run(0xc00070aa80, 0x12)
third_party/golang/gvisor/pkg/sentry/kernel/task_run.go:91 +0x149
created by google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).Start
third_party/golang/gvisor/pkg/sentry/kernel/task_start.go:279 +0xfe
goroutine 3085 [semacquire, 3 minutes]:
sync.runtime_SemacquireMutex(0xc00076ae0c, 0x408900)
third_party/go/gc/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc00076ae08)
third_party/go/gc/src/sync/mutex.go:134 +0xff
google3/third_party/golang/gvisor/pkg/sentry/kernel/epoll/epoll.(*readyCallback).Callback(0x136da70, 0xc000549220)
third_party/golang/gvisor/pkg/sentry/kernel/epoll/epoll.go:290 +0x62
google3/third_party/golang/netstack/waiter/waiter.(*Queue).Notify(0xc0001b0010, 0xd20001)
third_party/golang/netstack/waiter/waiter.go:192 +0xa1
google3/third_party/golang/gvisor/pkg/sentry/fs/fs.(*Inotify).queueEvent(0xc0001b0000, 0xc0004e6550)
third_party/golang/gvisor/pkg/sentry/fs/inotify.go:229 +0xeb
google3/third_party/golang/gvisor/pkg/sentry/fs/fs.(*Watch).Notify(0xc000b173e0, 0x0, 0x0, 0x40000001)
third_party/golang/gvisor/pkg/sentry/fs/inotify_watch.go:88 +0x8b
google3/third_party/golang/gvisor/pkg/sentry/fs/fs.(*Watches).Notify(0xc000480ff0, 0x0, 0x0, 0x40000001)
third_party/golang/gvisor/pkg/sentry/fs/inode_inotify.go:131 +0x104
google3/third_party/golang/gvisor/pkg/sentry/fs/fs.(*Dirent).InotifyEvent(0xc000d1f200, 0x40000001)
third_party/golang/gvisor/pkg/sentry/fs/dirent.go:1358 +0xaf
google3/third_party/golang/gvisor/pkg/sentry/syscalls/linux/linux.getdents(0xc000902000, 0xc00000000e, 0x7fc1100008f0, 0x8000, 0xc66df0, 0x0, 0x0, 0x0)
third_party/golang/gvisor/pkg/sentry/syscalls/linux/sys_getdents.go:86 +0x2e0
google3/third_party/golang/gvisor/pkg/sentry/syscalls/linux/linux.Getdents(0xc000902000, 0xe, 0x7fc1100008f0, 0x8000, 0x0, 0x3, 0x7fc110000070, 0x3, 0x7fc110000070, 0xc000000003, ...)
third_party/golang/gvisor/pkg/sentry/syscalls/linux/sys_getdents.go:43 +0xe3
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).executeSyscall(0xc000902000, 0x4e, 0xe, 0x7fc1100008f0, 0x8000, 0x0, 0x3, 0x7fc110000070, 0xe, 0xc3ae00, ...)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:165 +0x30a
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).doSyscallInvoke(0xc000902000, 0x4e, 0xe, 0x7fc1100008f0, 0x8000, 0x0, 0x3, 0x7fc110000070, 0x3, 0x7fc110000070)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:283 +0x69
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).doSyscallEnter(0xc000902000, 0x4e, 0xe, 0x7fc1100008f0, 0x8000, 0x0, 0x3, 0x7fc110000070, 0xd1ee60, 0xc0014f1e58)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:244 +0x99
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).doSyscall(0xc000902000, 0x2, 0xc00057c000)
third_party/golang/gvisor/pkg/sentry/kernel/task_syscall.go:219 +0x142
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*runApp).execute(0x0, 0xc000902000, 0xd1ee60, 0x0)
third_party/golang/gvisor/pkg/sentry/kernel/task_run.go:215 +0xfda
google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).run(0xc000902000, 0x19)
third_party/golang/gvisor/pkg/sentry/kernel/task_run.go:91 +0x149
created by google3/third_party/golang/gvisor/pkg/sentry/kernel/kernel.(*Task).Start
third_party/golang/gvisor/pkg/sentry/kernel/task_start.go:279 +0xfe
BTW, the problem of the pod hanging in "Terminating" also affects runc, with gVisor disabled. containerd has fixed some issues with stuck termination in more recent builds, so it's likely fixed. If I'm able to repro again, I'll report it to containerd.
It should be working now. I'll close once the deadlock fix is submitted.
Same setup as in https://github.com/google/gvisor/issues/120 (minikube + gvisor + containerd), except this is a larger nodejs app instead of nginx.
My pod is stuck "Terminating"
kubelet logs:
containerd logs:
the runsc process is still up and running:
crictl inspect pod hangs too:
if I kill the processes manually, the pod goes away (inside minikube):
the json logs don't contain anything interesting: