mmap_sem contention issues

bxatnarf commented 5 years ago

This seems to be related to/the root cause of https://github.com/ssrg-vt/popcorn-kernel/issues/63 Branch: next Platforms: Two x86 instances Binary: popcorn-kernel-lib's mt

mt does not seem to ever exit. This may be due to incorrect usage its task struct's mm.mmap_sem. After the process hangs (in this case pid 671), this semaphore looks as follows:

(gdb) print $lx_task_by_pid(671).mm.mmap_sem
$4 = {count = {counter = -8589934591}, wait_list = {next = 0xffffc90000263d90, prev = 0xffffc900003bbc50}, wait_lock = {raw_lock = {{val = {counter = 0}, {locked = 0 '\000', 
          pending = 0 '\000'}, {locked_pending = 0, tail = 0}}}}, osq = {tail = {counter = 0}}, owner = 0xffff880137e0b140}

bxatnarf commented 5 years ago

After enabling semaphore/lock debugging in the kernel, here is part of its log:

[  845.357480] ============================================
[  845.357614] WARNING: possible recursive locking detected
[  845.357803] 4.19.0-rc5-popcorn+ #2 Not tainted
[  845.357939] --------------------------------------------
[  845.358071] mt/694 is trying to acquire lock:
[  845.358209] 00000000f8a5f27c (&mm->mmap_sem){++++}, at: vm_munmap+0x5c/0xc0
[  845.358568] 
[  845.358568] but task is already holding lock:
[  845.358723] 00000000f8a5f27c (&mm->mmap_sem){++++}, at: vm_munmap+0xa5/0xc0
[  845.358932] 
[  845.358932] other info that might help us debug this:
[  845.359111]  Possible unsafe locking scenario:
[  845.359111] 
[  845.359251]        CPU0
[  845.359319]        ----
[  845.359386]   lock(&mm->mmap_sem);
[  845.359396]   lock(&mm->mmap_sem);
[  845.359396] 
[  845.359396]  *** DEADLOCK ***
[  845.359396] 
[  845.359396]  May be due to missing lock nesting notation
[  845.359396] 
[  845.359396] 1 lock held by mt/694:
[  845.359396]  #0: 00000000f8a5f27c (&mm->mmap_sem){++++}, at: vm_munmap+0xa5/0xc0
[  845.359396] 
[  845.359396] stack backtrace:
[  845.359396] CPU: 0 PID: 694 Comm: mt Not tainted 4.19.0-rc5-popcorn+ #2
[  845.359396] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20181104
[  845.359396] Call Trace:
[  845.359396]  dump_stack+0x67/0x90
[  845.359396]  __lock_acquire.cold.43+0x1d1/0x2f0
[  845.359396]  ? sched_clock_local+0x12/0x80
[  845.359396]  ? wait_for_common_io.constprop.2+0x12f/0x160
[  845.359396]  lock_acquire+0xb8/0x180
[  845.359396]  ? vm_munmap+0x5c/0xc0
[  845.359396]  down_write_killable+0x4d/0xd0
[  845.359396]  ? vm_munmap+0x5c/0xc0
[  845.359396]  vm_munmap+0x5c/0xc0
[  845.359396]  vma_server_munmap_origin+0x9b/0x110
[  845.359396]  __x64_sys_munmap+0x6f/0x80
[  845.359396]  do_syscall_64+0x69/0x400
[  845.359396]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[  845.359396]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  845.359396] RIP: 0033:0x44408b
[  845.359396] Code: c7 c2 b8 ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff e9 77 ff ff ff8
[  845.359396] RSP: 002b:00007fffffffe1b8 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[  845.359396] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044408b
[  845.359396] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[  845.359396] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[  845.359396] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000004c20f0
[  845.359396] R13: 0000000002800000 R14: 0000000000000000 R15: 0000000000000000

bxatnarf commented 5 years ago

Here is more information regarding the lock contention identified in the above comment:

The addresses of the instructions in question:

(gdb) print/x vm_munmap+0x5c
$6 = 0xffffffff811b63fc
(gdb) print/x vm_munmap+0xa5
$7 = 0xffffffff811b6445
(gdb)

This corresponds to the following lines of code:

 > addr2line -e  vmlinux 0xffffffff811b63fc
/home/user/popcorn/linux-x86/mm/mmap.c:2835
> addr2line -e  vmlinux 0xffffffff811b6445
/home/user/popcorn/linux-x86/mm/mmap.c:2830

Which in my build is:

#ifdef CONFIG_POPCORN
    if (distributed_process(current)) {
        while (!down_write_trylock(&mm->mmap_sem))
            schedule();
    } 
#endif

    if (down_write_killable(&mm->mmap_sem))
        return -EINTR;

Looks like some down_writes got lost in the merge. (Argh.)

bxatnarf commented 5 years ago

Looks like part of this issue arose from changes made in upstream Linux's commit https://github.com/torvalds/linux/commit/dc0ef0df7b6a90892ec41933212ac701152a254c where instances of down_write were replaced by down_write_killable. I am not sure if this is the correct corse of action in our case, but I'll use down_write_killable.

bxatnarf commented 5 years ago

Fixed by https://github.com/ssrg-vt/popcorn-kernel/commit/071f622ec709327f28cae92487bedd833e17bf72

ssrg-vt / popcorn-kernel

mmap_sem contention issues #68