linux-audit / audit-kernel

GitHub mirror of the Linux Kernel's audit repository
https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git
Other
137 stars 36 forks source link

Q/BUG: process stuck in audit_backlog_wait for a longtime #144

Open tuxoko opened 1 year ago

tuxoko commented 1 year ago

System: centos 7 Kernel: Upstream 5.10.168 with some custom change unrelated to audit auditd: 2.8.5-4.el7.x86_64

Hi, we had an issue where we were trying to kill a process to be able to umount some filesystem. However the process seemed to stuck in audit_backlog_wait for more than 20 seconds. We have some code triggered a panic because it couldn't umount the filesystem.

Here's the stack trace

PID: 188152  TASK: ffff8f6e30ecdac0  CPU: 8   COMMAND: "rsync"
 #0 [ffffa6218ab6bc68] __schedule at ffffffff95a00af6
 #1 [ffffa6218ab6bcf8] schedule at ffffffff95a00f3f
 #2 [ffffa6218ab6bd18] schedule_timeout at ffffffff95a049bf
 #3 [ffffa6218ab6bd98] audit_log_start at ffffffff9516e338
 #4 [ffffa6218ab6be20] audit_log_start at ffffffff9516e5af
 #5 [ffffa6218ab6be48] audit_log_name.constprop.0 at ffffffff95174b00
 #6 [ffffa6218ab6be98] audit_log_exit at ffffffff951751db
 #7 [ffffa6218ab6bee0] __audit_syscall_exit at ffffffff95176ba0
 #8 [ffffa6218ab6bf18] syscall_exit_to_user_mode at ffffffff959f7c62
 #9 [ffffa6218ab6bf38] do_syscall_64 at ffffffff959f3ae5
#10 [ffffa6218ab6bf50] entry_SYSCALL_64_after_hwframe at ffffffff95c000a9
crash> ps -m | grep rsync
[0 00:00:21.270] [UN]  PID: 188152  TASK: ffff8f6e30ecdac0  CPU: 8   COMMAND: "rsync"

The audit_queue seems quite empty at the time of panic

crash> p audit_queue.qlen
$5 = 4
crash> p audit_backlog_limit
audit_backlog_limit = $6 = 640

I think there were some issue with fairness at play here. When processes enter audit_log_start and audit_queue.qlen is large, it will then decides to wait. Then while kauditd is consuming the audit_queue, other threads entering audit_log_start might see audit_queue.qlen small and bypass the wait. So there's no guarantee when the process in the wait will be able to queue.

Another part of this issue is that kauditd will only wake up one process in each iteration when it process the whole queue. The comment says wake everyone but it uses wake_up not wake_up_all even though waiter uses add_wait_queue_exclusive. If the intention is wake everyone then should we change it to wake_up_all? I think if it is wake_up_all then the chances of our process stuck for 20 seconds would probably be lower.

static int kauditd_thread(void *dummy)
{
...
        /* we have processed all the queues so wake everyone */
        wake_up(&audit_backlog_wait);
rgbriggs commented 1 year ago

On 2023-06-15 14:59, Chunwei Chen wrote:

System: centos 7 Kernel: Upstream 5.10.168 with some custom change unrelated to audit auditd: 2.8.5-4.el7.x86_64

That's pretty old...

Hi, we had an issue where we were trying to kill a process to be able to umount some filesystem.

"umount some filesystem"... Would that happen to be a user or network filesystem? https://github.com/linux-audit/audit-kernel/issues/100

However the process seemed to stuck in audit_backlog_wait for more than 20 seconds. We have some code triggered a panic because it couldn't umount the filesystem.

tuxoko commented 1 year ago

@rgbriggs It's a local filesystem and it's not fuse. We know the the process stuck in audit is holding fd in the filesystem because it's shown in the vmcore dump.

pcmoore commented 1 year ago

Unfortunately it's hard to draw any conclusions using a stacktrace from a custom kernel. Are you able to reproduce the problem on a modern kernel, or at the very least one without any custom patches?

tuxoko commented 1 year ago

@pcmoore I can try that, but there's really not much difference and nothing related to audit. And I have the vmcore so if you think there's something to look there I can look it up. Also I'm curious about the intention of the wake_up in kauditd_thread was it to wake everyone or only one at a time.

pcmoore commented 1 year ago

@pcmoore I can try that, but there's really not much difference and nothing related to audit.

Sometimes unrelated changes can have a surprising impact :) For that reason, as a general rule we don't provide any upstream support for Linux Kernels with custom patches.

Beyond the custom kernel changes, I'm also curious if a modern kernel resolves your problem as there have been some changes related to queue management since v5.10 was released. As Richard already mentioned, that kernel is quite old.

And I have the vmcore so if you think there's something to look there I can look it up.

At the very least the line numbers of the stacktrace would be helpful.

Also I'm curious about the intention of the wake_up in kauditd_thread was it to wake everyone or only one at a time.

There was some discussion about waking blocked processes recently on the audit mailing list:

tuxoko commented 1 year ago
crash> bt -l 188152
PID: 188152  TASK: ffff8f6e30ecdac0  CPU: 8   COMMAND: "rsync"
 #0 [ffffa6218ab6bc68] __schedule at ffffffff95a00af6
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/sched/core.c: 3797
 #1 [ffffa6218ab6bcf8] schedule at ffffffff95a00f3f
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/arch/x86/include/asm/bitops.h: 206
 #2 [ffffa6218ab6bd18] schedule_timeout at ffffffff95a049bf
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/time/timer.c: 1878
 #3 [ffffa6218ab6bd98] audit_log_start at ffffffff9516e338
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/audit.c: 1886
 #4 [ffffa6218ab6be20] audit_log_start at ffffffff9516e5af
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/audit.c: 1856
 #5 [ffffa6218ab6be48] audit_log_name.constprop.0 at ffffffff95174b00
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/auditsc.c: 1404
 #6 [ffffa6218ab6be98] audit_log_exit at ffffffff951751db
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/auditsc.c: 1644
 #7 [ffffa6218ab6bee0] __audit_syscall_exit at ffffffff95176ba0
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/auditsc.c: 1794
 #8 [ffffa6218ab6bf18] syscall_exit_to_user_mode at ffffffff959f7c62
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/include/linux/audit.h: 340
 #9 [ffffa6218ab6bf38] do_syscall_64 at ffffffff959f3ae5
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/arch/x86/entry/common.c: 56
#10 [ffffa6218ab6bf50] entry_SYSCALL_64_after_hwframe at ffffffff95c000a9
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/arch/x86/entry/entry_64.S: 132
pcmoore commented 1 year ago

Ah, I see this is a Nutanix kernel build. Do you work for Nutanix or are you a customer/user? If the former, you should definitely read the mail archive link I posted above.

tuxoko commented 1 year ago

Yeah, I noticed just now. I work for different team than Eiichi and we use different build, but it does seem very similar.