Open bo0ts opened 5 months ago
This time I could actually capture a stack trace of the problem:
Feb 14 07:33:13 myhost.intern kernel: sysrq: Show Blocked State
Feb 14 07:33:13 myhost.intern kernel: task:postgres state:D stack:0 pid:2581549 ppid:2581545 flags:0x00000006
Feb 14 07:33:13 myhost.intern kernel: Call Trace:
Feb 14 07:33:13 myhost.intern kernel: <TASK>
Feb 14 07:33:13 myhost.intern kernel: __schedule+0x35f/0x1360
Feb 14 07:33:13 myhost.intern kernel: ? avc_has_perm_noaudit+0x8b/0xf0
Feb 14 07:33:13 myhost.intern kernel: ? preempt_count_add+0x6a/0xa0
Feb 14 07:33:13 myhost.intern kernel: ? nfs_do_lookup_revalidate+0x260/0x260 [nfs]
Feb 14 07:33:13 myhost.intern kernel: schedule+0x5d/0xe0
Feb 14 07:33:13 myhost.intern kernel: __nfs_lookup_revalidate+0xdd/0x120 [nfs]
Feb 14 07:33:13 myhost.intern kernel: ? psi_poll_worker+0x4f0/0x4f0
Feb 14 07:33:13 myhost.intern kernel: lookup_fast+0x74/0xe0
Feb 14 07:33:13 myhost.intern kernel: path_openat+0xf6/0x1230
Feb 14 07:33:13 myhost.intern kernel: do_filp_open+0x9e/0x130
Feb 14 07:33:13 myhost.intern kernel: do_sys_openat2+0x96/0x150
Feb 14 07:33:13 myhost.intern kernel: __x64_sys_openat+0x5c/0x80
Feb 14 07:33:13 myhost.intern kernel: do_syscall_64+0x5b/0x80
Feb 14 07:33:13 myhost.intern kernel: ? fpregs_restore_userregs+0x56/0xe0
Feb 14 07:33:13 myhost.intern kernel: ? exit_to_user_mode_prepare+0x18f/0x1f0
Feb 14 07:33:13 myhost.intern kernel: ? syscall_exit_to_user_mode+0x17/0x40
Feb 14 07:33:13 myhost.intern kernel: ? do_syscall_64+0x67/0x80
Feb 14 07:33:13 myhost.intern kernel: ? syscall_exit_to_user_mode_prepare+0x18e/0x1c0
Feb 14 07:33:13 myhost.intern kernel: ? syscall_exit_to_user_mode+0x17/0x40
Feb 14 07:33:13 myhost.intern kernel: ? do_syscall_64+0x67/0x80
Feb 14 07:33:13 myhost.intern kernel: ? do_syscall_64+0x67/0x80
Feb 14 07:33:13 myhost.intern kernel: ? do_syscall_64+0x67/0x80
Feb 14 07:33:13 myhost.intern kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
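The "Show Blocked State" header above indicates the dump was produced with SysRq-w, which requires root. The same uninterruptible (D-state) tasks can also be spotted from userspace without SysRq; a minimal sketch using standard procps tools (the exact `ps` output specifiers are an assumption about a typical Linux userland):

```shell
#!/bin/sh
# List tasks in uninterruptible sleep (state "D") together with the kernel
# function they are blocked in (WCHAN). Prints only the header when nothing
# is stuck.
ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /^D/'
```

In the situation described here, the stuck postgres backend would show up with a WCHAN inside the NFS lookup path.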
It could get into such a state in a couple of places.
Looking at the stack trace, I can see a call to nfs_do_lookup_revalidate, which makes me think you are dealing with the second case. In other words: you keep $PGDATA on NFS and it becomes unresponsive.
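One quick way to confirm that an NFS mount itself has gone unresponsive (rather than the process misbehaving) is to stat a path on it with a bounded wait, since stat() on a dead hard-mounted NFS share blocks in D state. A hypothetical probe, with the path standing in for the actual $PGDATA mount:

```shell
#!/bin/sh
# Probe a mount with a bounded wait: wrap stat(1) in timeout(1) so the probe
# itself cannot get stuck. The mount path below is illustrative.
probe_mount() {
  if timeout 5 stat "$1" >/dev/null 2>&1; then
    echo "responsive"
  else
    echo "unresponsive (or missing)"
  fi
}

probe_mount /path/on/nfs
```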
Yes, we hit the second case. The odd thing is that no other application has trouble with the NFS mount during that time, and it only happens during pod termination. There are no visible issues while the container is running.
> it only happens during pod termination
It looks like a race condition. That is, K8s for some reason decides to unmount the NFS volume before the Pod/Container is fully stopped.
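If the unmount really does race the container shutdown, one common workaround is to hold the container open until Postgres has actually exited, e.g. from a preStop hook. A hypothetical sketch, not part of the operator's configuration; the process name and timeout are assumptions:

```shell
#!/bin/sh
# Wait (bounded) until no matching process remains, so the kubelet does not
# proceed to tear down volumes while backends still hold files on the mount.
wait_for_exit() {
  name=$1 timeout=$2
  while [ "$timeout" -gt 0 ] && pgrep -x "$name" >/dev/null 2>&1; do
    sleep 1
    timeout=$((timeout - 1))
  done
  # succeeds once no matching process is left, fails on timeout
  ! pgrep -x "$name" >/dev/null 2>&1
}

wait_for_exit postgres 5 && echo "postgres stopped" || echo "timed out"
```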
We are looking at an issue where some of our Postgres instances running with the Zalando Operator (version 1.10.1, PVCs on NFS, Spilo image spilo-15:3.0-p1) sometimes get stuck during restart. The instances cannot be killed with pg_ctl -m immediate stop, and when using pg_ctl kill KILL, usually a single process stuck in uninterruptible sleep remains.
Unfortunately I missed capturing the stack of that process before finally rebooting the machine (I will certainly do it next time), but I would like to start a discussion here about whether there are any known issues around this, or whether you have any suspicions about what could cause it.
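For capturing that stack before rebooting next time: the kernel exposes a per-task kernel stack under /proc, so no SysRq is strictly needed. A minimal sketch (the PID is just a placeholder for the stuck backend; reading the stack file typically requires root):

```shell
#!/bin/sh
# Inspect a stuck process: the "State:" line (e.g. "D (disk sleep)") is
# world-readable, while the kernel stack usually requires root.
PID=${1:-$$}                          # placeholder; pass the stuck backend's PID
grep '^State:' "/proc/$PID/status"    # confirm it is really in D state
cat "/proc/$PID/stack" 2>/dev/null \
  || echo "reading /proc/$PID/stack needs root"
```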