ssrg-vt / popcorn-kernel

Popcorn Linux kernel for distributed thread execution
Other
156 stars 22 forks source link

popcorn-kernel-lib's mt example does not sucessfully execute #63

Closed bxatnarf closed 5 years ago

bxatnarf commented 5 years ago

mt will run for a couple of minutes then fail (get killed) when executing on x86.

The space below is notes I've taken to help me pinpoint the issue.

The source code for mt can be found here, it looks like:

void *child(void *arg)
{
    struct child_param *param = (struct child_param *)arg;
    int tid = popcorn_gettid();
    int i;

    for (i = 0; i < LOOPS; i++) {
        printf("Entering %d %d\n", param->tid, tid);
        pthread_barrier_wait(&barrier_start);
        printf("Entered %d %d\n", param->tid, tid);
        migrate(1, NULL, NULL);
        loop();
        migrate(0, NULL, NULL);
        pthread_barrier_wait(&barrier_end);
        printf("Exited %d %d\n", param->tid, tid);
        pthread_barrier_wait(&barrier_loop);
    }

    return 0;
}

int main(int argc, char *argv[])
{
    struct child_param threads[THREADS];
    int i;

    pthread_barrier_init(&barrier_start, NULL, THREADS + 1);
    pthread_barrier_init(&barrier_end, NULL, THREADS + 1);

    for (i = 0; i < THREADS; i++) {
        threads[i].tid = i;
        threads[i].ret = 0;
        threads[i].stop = 0;
        pthread_create(
                &threads[i].thread_info, NULL, &child, threads + i);
    }

    for (i = 0; i < LOOPS; i++) {
        pthread_barrier_wait(&barrier_start);
        pthread_barrier_init(&barrier_loop, NULL, THREADS + 1);
        pthread_barrier_wait(&barrier_end);

        pthread_barrier_init(&barrier_start, NULL, THREADS + 1);
        pthread_barrier_wait(&barrier_loop);
        pthread_barrier_init(&barrier_end, NULL, THREADS + 1);
    }

    for (i = 0; i < THREADS; i++) {
        pthread_join(threads[i].thread_info, (void **)&(threads[i].ret));
        printf("Exited %d %d with %d\n", i, threads[i].tid, threads[i].ret);
    }
    return 0;
}

This is the log on the origin. It is possible that it doesn't capture the earliest failure.

[  329.616776] ## PAGEFAULT [640] 7ffff77f8000 W 445318 55 8000000000000065
[  329.617134]  =[640] 7ffff77f8000 replicated mine 00000000c8298bcb
[  329.617390]   [640] revoke 7ffff77f8000 [638/1]
[  329.621182] >>[640] 7ffff77f8000 00000000c8298bcb
[  329.623611] EXITED [662] local / 0x0
[  329.624680] EXITED [659] local / 0x0
[  329.625122] EXITED [668] local / 0x0
[  329.625462] EXITED [660] local / 0x0
[  329.625796] EXITED [650] local / 0x0
[  329.626098] EXITED [665] local / 0x0
[  329.626379] EXITED [651] local / 0x0
[  329.626680] EXITED [652] local / 0x0
[  329.626953] EXITED [654] local / 0x0
[  329.627234] EXITED [646] local / 0x0
[  329.627525] EXITED [641] local / 0x0
[  329.628486] EXITED [657] local / 0x0
[  329.629037] >>[638] 7ffff7ff9000 00000000c8298bcb
[  329.629302] EXITED [664] local / 0x0
[  329.629630] EXITED [663] local / 0x0
[  329.629924] EXITED [666] local / 0x0
[  329.630197] EXITED [655] local / 0x0
[  329.630469] EXITED [656] local / 0x0
[  329.630742] EXITED [658] local / 0x0
[  329.631298] EXITED [642] local / 0x0
[  329.632372] EXITED [649] local / 0x0
[  329.634274] EXITED [653] local / 0x0
[  329.634329] EXITED [648] local / 0x0
[  329.636278] EXITED [667] local / 0x0
[  329.637746] EXITED [643] local / 0x0
[  329.638311] EXITED [669] local / 0x0
[  329.638776] EXITED [670] local / 0x0
[  329.639290] EXITED [640] local / 0x0
[  329.639698] EXITED [661] local / 0x0
[  329.642131] EXITED [644] local / 0x0
[  329.642788] EXITED [647] local / 0x0
[  329.643333] EXITED [639] local / 0x0
[  329.645145] EXITED [645] local / 0x0
[  329.656558]   [638] ->munmap [638/1] 7ffff77f9000+801000
[  485.294100] INFO: task mt:638 blocked for more than 120 seconds.
[  485.294495]       Not tainted 4.19.0-rc5-popcorn+ #217
[  485.294698] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  485.295102] mt              D    0   638    610 0x00000000
[  485.295494] Call Trace:
[  485.296472]  ? __schedule+0x1e7/0x790
[  485.296657]  ? __switch_to_asm+0x40/0x70
[  485.296848]  schedule+0x28/0x80
[  485.297032]  schedule_timeout+0x1c2/0x460
[  485.297227]  ? collect_expired_timers+0xa0/0xa0
[  485.297520]  io_schedule_timeout+0x19/0x40
[  485.297693]  wait_for_common_io.constprop.2+0xa0/0x130
[  485.297880]  ? wake_up_q+0x70/0x70
[  485.297880]  wait_at_station+0x39/0x60
[  485.297931]  vma_server_munmap_origin+0x11a/0x130
[  485.298125]  __x64_sys_munmap+0x6f/0x80
[  485.298283]  do_syscall_64+0x64/0x310
[  485.298470]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  485.298938] RIP: 0033:0x44407b
[  485.299186] Code: Bad RIP value.
[  485.299373] RSP: 002b:00007fffffffe168 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
[  485.299732] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044407b
[  485.300040] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[  485.300361] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[  485.300662] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000004c20f0
[  485.300949] R13: 0000000002800000 R14: 0000000000000000 R15: 00000000004c9dc0
[  606.133603] INFO: task mt:638 blocked for more than 120 seconds.
[  606.133886]       Not tainted 4.19.0-rc5-popcorn+ #217
[  606.134105] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  606.134428] mt              D    0   638    610 0x00000000
[  606.134688] Call Trace:
[  606.134843]  ? __schedule+0x1e7/0x790
[  606.135020]  ? __switch_to_asm+0x40/0x70
[  606.135197]  schedule+0x28/0x80
[  606.135340]  schedule_timeout+0x1c2/0x460
[  606.135560]  ? collect_expired_timers+0xa0/0xa0
[  606.135770]  io_schedule_timeout+0x19/0x40
[  606.135948]  wait_for_common_io.constprop.2+0xa0/0x130
[  606.136151]  ? wake_up_q+0x70/0x70
[  606.136301]  wait_at_station+0x39/0x60
[  606.136468]  vma_server_munmap_origin+0x11a/0x130
[  606.136663]  __x64_sys_munmap+0x6f/0x80
[  606.136835]  do_syscall_64+0x64/0x310
[  606.136987]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  606.137309] RIP: 0033:0x44407b
[  606.137500] Code: Bad RIP value.
[  606.137651] RSP: 002b:00007fffffffe168 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
[  606.137971] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044407b
[  606.138249] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[  606.138550] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[  606.138826] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000004c20f0
[  606.139118] R13: 0000000002800000 R14: 0000000000000000 R15: 00000000004c9dc0
[  645.048627] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  645.049292] PGD 13afef067 P4D 13afef067 PUD 1381d7067 PMD 0 
[  645.049819] Oops: 0000 [#1] SMP NOPTI
[  645.050149] CPU: 0 PID: 638 Comm: mt Not tainted 4.19.0-rc5-popcorn+ #217
[  645.050641] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20181104
[  645.051302] RIP: 0010:___cache_free+0x19/0x1b0
[  645.051562] Code: 0f 94 c0 c3 31 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 413
[  645.051864] RSP: 0018:ffffc900005b7e08 EFLAGS: 00000092
[  645.051864] RAX: ffffea0001bfffc8 RBX: 0000000000000286 RCX: ffffea0001bfffcf
[  645.051864] RDX: ffffffff811429c2 RSI: ffffffffffffff92 RDI: 0000000000000000
[  645.051864] RBP: ffffffffffffff92 R08: 000000962fd99000 R09: 0000000000000000
[  645.051864] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff811429c2
[  645.051864] R13: ffff88013824bc00 R14: 0000000000000001 R15: 0000000000000000
[  645.051864] FS:  00000000004cb880(0000) GS:ffff88013fc00000(0000) knlGS:000000000000000
[  645.051864] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  645.051864] CR2: 0000000000000000 CR3: 000000013a4e6000 CR4: 00000000000006f0
[  645.051864] Call Trace:
[  645.051864]  ? wait_for_common_io.constprop.2+0xab/0x130
[  645.051864]  ? vma_server_munmap_origin+0x122/0x130
[  645.051864]  kfree+0x7a/0xd0
[  645.051864]  vma_server_munmap_origin+0x122/0x130
[  645.051864]  __x64_sys_munmap+0x6f/0x80
[  645.051864]  do_syscall_64+0x64/0x310
[  645.051864]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  645.051864] RIP: 0033:0x44407b
[  645.051864] Code: c7 c2 b8 ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff e9 77 ff ff ff8
[  645.051864] RSP: 002b:00007fffffffe168 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
[  645.051864] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044407b
[  645.051864] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[  645.051864] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[  645.051864] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000004c20f0
[  645.051864] R13: 0000000002800000 R14: 0000000000000000 R15: 00000000004c9dc0
[  645.051864] Modules linked in: msg_socket
[  645.051864] CR2: 0000000000000000
[  645.051864] ---[ end trace 4c532a25e50a6f64 ]---
[  645.051864] RIP: 0010:___cache_free+0x19/0x1b0
[  645.051864] Code: 0f 94 c0 c3 31 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 413
[  645.051864] RSP: 0018:ffffc900005b7e08 EFLAGS: 00000092
[  645.051864] RAX: ffffea0001bfffc8 RBX: 0000000000000286 RCX: ffffea0001bfffcf
[  645.051864] RDX: ffffffff811429c2 RSI: ffffffffffffff92 RDI: 0000000000000000
[  645.051864] RBP: ffffffffffffff92 R08: 000000962fd99000 R09: 0000000000000000
[  645.051864] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff811429c2
[  645.051864] R13: ffff88013824bc00 R14: 0000000000000001 R15: 0000000000000000
[  645.051864] FS:  00000000004cb880(0000) GS:ffff88013fc00000(0000) knlGS:000000000000000
[  645.051864] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  645.051864] CR2: 0000000000000000 CR3: 000000013a4e6000 CR4: 00000000000006f0
[  645.057102] EXITED [638] local / 0x9
[  645.057402] TERMINATE [638/1] with 0x9

It is tricky to determine which parts of the target host's log correspond with the origin's as it doesn't indicate any errors, but here is an example of the target's log at about the same time until the end of execution.

  326.770397] ####### MIGRATE [3814] to 0
[  326.770574] EXITED [3814] remote / 0x40
[  326.773818] 
[  326.773818] INVALIDATE_PAGE [638] 7ffff7ff8000 [639/0]
[  326.774118]  =[638] 7ffff7ff8000
[  326.774289] >>[638] ->[639/0]
[  326.781416] 
[  326.781416] REMOTE_PAGE_REQUEST [638] 7fffec7e1000 R 401ffb from [662/0]
[  326.782385] >>[638] 7fffec7e1000 00000000ba9b1640
[  326.782594]   [638] ->[662/0] 0
[  326.783344] 
[  326.783344] INVALIDATE_PAGE [638] 7ffff77f7000 [640/0]
[  326.783648]  =[638] 7ffff77f7000
[  326.783796] >>[638] ->[640/0]
[  326.785339] 
[  326.785339] REMOTE_PAGE_REQUEST [638] 7ffff7ff9000 R 4020ee from [639/0]
[  326.787965] >>[638] 7ffff7ff9000 00000000ba9b1640
[  326.788176]   [638] ->[639/0] 0
[  326.789542] ####### MIGRATE [3822] to 0
[  326.789865] EXITED [3822] remote / 0x40
[  326.790242] ####### MIGRATE [3828] to 0
[  326.790431] EXITED [3828] remote / 0x40
[  326.794320] 
[  326.794320] REMOTE_PAGE_REQUEST [638] 7ffff57f3000 R 401ffb from [644/0]
[  326.794708] >>[638] 7ffff57f3000 00000000ba9b1640
[  326.794902]   [638] ->[644/0] 0
[  326.795074] 
[  326.795074] REMOTE_PAGE_REQUEST [638] 7ffff07e9000 R 401ffb from [654/0]
[  326.795407] >>[638] 7ffff07e9000 00000000ba9b1640
[  326.795595]   [638] ->[654/0] 0
[  326.795738] 
[  326.795738] REMOTE_PAGE_REQUEST [638] 7fffe2fde000 R 401ffb from [665/0]
[  326.796064] >>[638] 7fffe2fde000 00000000ba9b1640
[  326.796246]   [638] ->[665/0] 0
[  326.805389] 
[  326.805389] REMOTE_PAGE_REQUEST [638] 7ffff4ff2000 R 401ffb from [645/0]
[  326.805783] >>[638] 7ffff4ff2000 00000000ba9b1640
[  326.805999]   [638] ->[645/0] 0
[  326.806174] 
[  326.806174] REMOTE_PAGE_REQUEST [638] 7ffff3ff0000 R 401ffb from [647/0]
[  326.806558] >>[638] 7ffff3ff0000 00000000ba9b1640
[  326.806719]   [638] ->[647/0] 0
[  326.806860] 
[  326.806860] REMOTE_PAGE_REQUEST [638] 7fffedfe4000 R 401ffb from [659/0]
[  326.807206] >>[638] 7fffedfe4000 00000000ba9b1640
[  326.807419]   [638] ->[659/0] 0
[  326.807571] 
[  326.807571] REMOTE_PAGE_REQUEST [638] 7ffff5ff4000 R 401ffb from [643/0]
[  326.809246] >>[638] 7ffff5ff4000 00000000ba9b1640
[  326.812039] ####### MIGRATE [3817] to 0
[  326.812297] EXITED [3817] remote / 0x40
[  326.814228] ####### MIGRATE [3810] to 0
[  326.814473] EXITED [3810] remote / 0x40
[  326.814740] ####### MIGRATE [3836] to 0
[  326.814912] EXITED [3836] remote / 0x40
[  326.815131] ####### MIGRATE [3833] to 0
[  326.815297] EXITED [3833] remote / 0x40
[  326.815582]   [638] ->[643/0] 0
[  326.815785] 
[  326.815785] REMOTE_PAGE_REQUEST [638] 7fffea7dd000 R 401ffb from [669/0]
[  326.816161] >>[638] 7fffea7dd000 00000000ba9b1640
[  326.816364]   [638] ->[669/0] 0
[  326.816625] 
[  326.816625] REMOTE_PAGE_REQUEST [638] 7fffe27dd000 R 401ffb from [666/0]
[  326.817603] >>[638] 7fffe27dd000 00000000ba9b1640
[  326.817856] ####### MIGRATE [3838] to 0
[  326.818109] EXITED [3838] remote / 0x40
[  326.818421] ####### MIGRATE [3815] to 0
[  326.818598] EXITED [3815] remote / 0x40
[  326.818818] ####### MIGRATE [3826] to 0
[  326.819047] EXITED [3826] remote / 0x40
[  326.821323] ####### MIGRATE [3832] to 0
[  326.821366] ####### MIGRATE [3837] to 0
[  326.821920] EXITED [3832] remote / 0x40
[  326.823023]   [638] ->[666/0] 0
[  326.825518] 
[  326.825518] INVALIDATE_PAGE [638] 7fffec7e1000 [662/0]
[  326.825518] EXITED [3837] remote / 0x40
[  326.829314]  =[638] 7fffec7e1000
[  326.833318] ####### MIGRATE [3820] to 0
[  326.833626] EXITED [3820] remote / 0x40
[  326.833968] ####### MIGRATE [3834] to 0
[  326.834160] EXITED [3834] remote / 0x40
[  326.836570] >>[638] ->[662/0]
[  326.836823] 
[  326.836823] REMOTE_PAGE_REQUEST [638] 7ffff47f1000 R 401ffb from [646/0]
[  326.837294] >>[638] 7ffff47f1000 00000000ba9b1640
[  326.837344] ####### MIGRATE [3824] to 0
[  326.837824]   [638] ->[646/0] 0
[  326.838321] 
[  326.838321] REMOTE_PAGE_REQUEST [638] 7ffff6ff6000 R 401ffb from [641/0]
[  326.838965] EXITED [3824] remote / 0x40
[  326.839336] ####### MIGRATE [3816] to 0
[  326.839528] EXITED [3816] remote / 0x40
[  326.841268] >>[638] 7ffff6ff6000 00000000ba9b1640
[  326.843903] ####### MIGRATE [3831] to 0
[  326.844216] EXITED [3831] remote / 0x40
[  326.849427]   [638] ->[641/0] 0
[  326.849890] 
[  326.849890] INVALIDATE_PAGE [638] 7ffff57f3000 [644/0]
[  326.850335]  =[638] 7ffff57f3000
[  326.850471] >>[638] ->[644/0]
[  326.850615] 
[  326.850615] INVALIDATE_PAGE [638] 7ffff07e9000 [654/0]
[  326.850860]  =[638] 7ffff07e9000
[  326.850988] >>[638] ->[654/0]
[  326.851113] 
[  326.851113] INVALIDATE_PAGE [638] 7fffe2fde000 [665/0]
[  326.851355]  =[638] 7fffe2fde000
[  326.851479] >>[638] ->[665/0]
[  326.857863] 
[  326.857863] INVALIDATE_PAGE [638] 7ffff4ff2000 [645/0]
[  326.858095]  =[638] 7ffff4ff2000
[  326.858185] >>[638] ->[645/0]
[  326.858279] 
[  326.858279] INVALIDATE_PAGE [638] 7ffff3ff0000 [647/0]
[  326.858526]  =[638] 7ffff3ff0000
[  326.858668] >>[638] ->[647/0]
[  326.858795] 
[  326.858795] INVALIDATE_PAGE [638] 7fffedfe4000 [659/0]
[  326.859040]  =[638] 7fffedfe4000
[  326.859169] >>[638] ->[659/0]
[  326.859301] 
[  326.859301] REMOTE_PAGE_REQUEST [638] 7fffecfe2000 R 401ffb from [661/0]
[  326.859644] >>[638] 7fffecfe2000 00000000ba9b1640
[  326.859830]   [638] ->[661/0] 0
[  326.859962] 
[  326.859962] REMOTE_PAGE_REQUEST [638] 7ffff0fea000 R 401ffb from [653/0]
[  326.860277] >>[638] 7ffff0fea000 00000000ba9b1640
[  326.860453]   [638] ->[653/0] 0
[  326.860602] 
[  326.860602] REMOTE_PAGE_REQUEST [638] 7ffff1fec000 R 401ffb from [651/0]
[  326.860912] >>[638] 7ffff1fec000 00000000ba9b1640
[  326.861267]   [638] ->[651/0] 0
[  326.861432] 
[  326.861432] INVALIDATE_PAGE [638] 7ffff5ff4000 [643/0]
[  326.861838]  =[638] 7ffff5ff4000
[  326.861971] >>[638] ->[643/0]
[  326.862112] 
[  326.862112] INVALIDATE_PAGE [638] 7fffea7dd000 [669/0]
[  326.862297]  =[638] 7fffea7dd000
[  326.862387] >>[638] ->[669/0]
[  326.862480] 
[  326.862480] REMOTE_PAGE_REQUEST [638] 7fffee7e5000 R 401ffb from [658/0]
[  326.862788] >>[638] 7fffee7e5000 00000000ba9b1640
[  326.862976]   [638] ->[658/0] 0
[  326.863130] 
[  326.863130] REMOTE_PAGE_REQUEST [638] 7fffebfe0000 R 401ffb from [663/0]
[  326.863455] >>[638] 7fffebfe0000 00000000ba9b1640
[  326.863647]   [638] ->[663/0] 0
[  326.863787] 
[  326.863787] REMOTE_PAGE_REQUEST [638] 7ffff2fee000 R 401ffb from [649/0]
[  326.864123] >>[638] 7ffff2fee000 00000000ba9b1640
[  326.864310]   [638] ->[649/0] 0
[  326.864426] 
[  326.864426] INVALIDATE_PAGE [638] 7fffe27dd000 [666/0]
[  326.865226]  =[638] 7fffe27dd000
[  326.865367] >>[638] ->[666/0]
[  326.865542] 
[  326.865542] REMOTE_PAGE_REQUEST [638] 7ffff77f8000 R 4020ee from [640/0]
[  326.865872] >>[638] 7ffff77f8000 00000000ba9b1640
[  326.866067]   [638] ->[640/0] 0
[  326.873429] 
[  326.873429] REMOTE_PAGE_REQUEST [638] 7fffec7e2000 R 4020ee from [662/0]
[  326.877268] >>[638] 7fffec7e2000 00000000ba9b1640
[  326.877446] ####### MIGRATE [3830] to 0
[  326.877891]   [638] ->[662/0] 0
[  326.877945] 
[  326.877945] REMOTE_PAGE_REQUEST [638] 7ffff37ef000 R 401ffb from [648/0]
[  326.878456] EXITED [3830] remote / 0x40
[  326.878786] ####### MIGRATE [3829] to 0
[  326.878964] EXITED [3829] remote / 0x40
[  326.880309] >>[638] 7ffff37ef000 00000000ba9b1640
[  326.880528]   [638] ->[648/0] 0
[  326.880682] 
[  326.880682] REMOTE_PAGE_REQUEST [638] 7ffff67f5000 R 401ffb from [642/0]
[  326.881246] >>[638] 7ffff67f5000 00000000ba9b1640
[  326.882578]   [638] ->[642/0] 0
[  326.882798] 
[  326.882798] INVALIDATE_PAGE [638] 7ffff6ff6000 [641/0]
[  326.883074]  =[638] 7ffff6ff6000
[  326.883218] >>[638] ->[641/0]
[  326.883321] 
[  326.883321] REMOTE_PAGE_REQUEST [638] 7fffe37df000 R 401ffb from [664/0]
[  326.883584] >>[638] 7fffe37df000 00000000ba9b1640
[  326.883782]   [638] ->[664/0] 0
[  326.883922] 
[  326.883922] INVALIDATE_PAGE [638] 7ffff47f1000 [646/0]
[  326.884204]  =[638] 7ffff47f1000
[  326.884341] >>[638] ->[646/0]
[  326.886139] 
[  326.886139] REMOTE_PAGE_REQUEST [638] 7fffeffe8000 R 401ffb from [655/0]
[  326.886441] >>[638] 7fffeffe8000 00000000ba9b1640
[  326.886629]   [638] ->[655/0] 0
[  326.886735] 
[  326.886735] REMOTE_PAGE_REQUEST [638] 7fffeefe6000 R 401ffb from [657/0]
[  326.887064] >>[638] 7fffeefe6000 00000000ba9b1640
[  326.887248]   [638] ->[657/0] 0
[  326.887380] 
[  326.887380] REMOTE_PAGE_REQUEST [638] 7fffeafde000 R 401ffb from [668/0]
[  326.887689] >>[638] 7fffeafde000 00000000ba9b1640
[  326.887864]   [638] ->[668/0] 0
[  326.901575] 
[  326.901575] REMOTE_PAGE_REQUEST [638] 7ffff4ff3000 R 4020ee from [645/0]
[  326.901710] >>[638] 7ffff4ff3000 00000000ba9b1640
[  326.901832]   [638] ->[645/0] 0
[  326.917489] 
[  326.917489] REMOTE_PAGE_REQUEST [638] 7fffe2fdf000 R 4020ee from [665/0]
[  326.917753] >>[638] 7fffe2fdf000 00000000ba9b1640
[  326.917871]   [638] ->[665/0] 0
[  326.917989] 
[  326.917989] REMOTE_PAGE_REQUEST [638] 7ffff3ff1000 R 4020ee from [647/0]
[  326.918186] >>[638] 7ffff3ff1000 00000000ba9b1640
[  326.918294]   [638] ->[647/0] 0
[  326.918402] 
[  326.918402] REMOTE_PAGE_REQUEST [638] 7fffed7e3000 R 401ffb from [660/0]
[  326.918598] >>[638] 7fffed7e3000 00000000ba9b1640
[  326.918706]   [638] ->[660/0] 0
[  326.918793] 
[  326.918793] REMOTE_PAGE_REQUEST [638] 7fffedfe5000 R 4020ee from [659/0]
[  326.918983] >>[638] 7fffedfe5000 00000000ba9b1640
[  326.919090]   [638] ->[659/0] 0
[  326.919178] 
[  326.919178] REMOTE_PAGE_REQUEST [638] 7ffff57f4000 R 4020ee from [644/0]
[  326.919365] >>[638] 7ffff57f4000 00000000ba9b1640
[  326.919489]   [638] ->[644/0] 0
[  326.919571] 
[  326.919571] REMOTE_PAGE_REQUEST [638] 7ffff27ed000 R 401ffb from [650/0]
[  326.919756] >>[638] 7ffff27ed000 00000000ba9b1640
[  326.919861]   [638] ->[650/0] 0
[  326.919941] 
[  326.919941] REMOTE_PAGE_REQUEST [638] 7ffff07ea000 R 4020ee from [654/0]
[  326.921298] >>[638] 7ffff07ea000 00000000ba9b1640
[  326.925917]   [638] ->[654/0] 0
[  326.925335] ####### MIGRATE [3835] to 0
[  326.925704] EXITED [3835] remote / 0x40
[  326.941539] 
[  326.941539] INVALIDATE_PAGE [638] 7fffee7e5000 [658/0]
[  326.942078]  =[638] 7fffee7e5000
[  326.942501] >>[638] ->[658/0]
[  326.957728] 
[  326.957728] INVALIDATE_PAGE [638] 7fffebfe0000 [663/0]
[  326.958137]  =[638] 7fffebfe0000
[  326.958323] >>[638] ->[663/0]
[  326.958485] 
[  326.958485] INVALIDATE_PAGE [638] 7ffff2fee000 [649/0]
[  326.958757]  =[638] 7ffff2fee000
[  326.958929] >>[638] ->[649/0]
[  326.959086] 
[  326.959086] REMOTE_PAGE_REQUEST [638] 7ffff5ff5000 R 4020ee from [643/0]
[  326.959868] >>[638] 7ffff5ff5000 00000000ba9b1640
[  326.960257]   [638] ->[643/0] 0
[  326.960428] 
[  326.960428] INVALIDATE_PAGE [638] 7ffff0fea000 [653/0]
[  326.960729]  =[638] 7ffff0fea000
[  326.965875] >>[638] ->[653/0]
[  326.966592] 
[  326.966592] INVALIDATE_PAGE [638] 7ffff1fec000 [651/0]
[  326.966985]  =[638] 7ffff1fec000
[  326.967395] >>[638] ->[651/0]
[  326.967553] 
[  326.967553] REMOTE_PAGE_REQUEST [638] 7fffea7de000 R 4020ee from [669/0]
[  326.967994] >>[638] 7fffea7de000 00000000ba9b1640
[  326.968308]   [638] ->[669/0] 0
[  326.968466] 
[  326.968466] REMOTE_PAGE_REQUEST [638] 7ffff17eb000 R 401ffb from [652/0]
[  326.968830] >>[638] 7ffff17eb000 00000000ba9b1640
[  326.969035]   [638] ->[652/0] 0
[  326.969296] 
[  326.969296] REMOTE_PAGE_REQUEST [638] 7fffeb7df000 R 401ffb from [667/0]
[  326.969697] >>[638] 7fffeb7df000 00000000ba9b1640
[  326.969914]   [638] ->[667/0] 0
[  326.970025] 
[  326.970025] REMOTE_PAGE_REQUEST [638] 7fffe27de000 R 4020ee from [666/0]
[  326.970463] >>[638] 7fffe27de000 00000000ba9b1640
[  326.970668]   [638] ->[666/0] 0
[  326.970836] 
[  326.970836] INVALIDATE_PAGE [638] 7fffed7e3000 [660/0]
[  326.971189]  =[638] 7fffed7e3000
[  326.971378] >>[638] ->[660/0]
[  326.977611] 
[  326.977611] REMOTE_PAGE_REQUEST [638] 7fffee7e6000 R 4020ee from [658/0]
[  326.977631] >>[638] 7fffee7e6000 00000000ba9b1640
[  326.977839]   [638] ->[658/0] 0
[  326.977996] 
[  326.977996] REMOTE_PAGE_REQUEST [638] 7fffebfe1000 R 4020ee from [663/0]
[  326.977996] >>[638] 7fffebfe1000 00000000ba9b1640
[  326.981254]   [638] ->[663/0] 0
[  326.981394] 
[  326.981394] INVALIDATE_PAGE [638] 7ffff37ef000 [648/0]
[  326.981773]  =[638] 7ffff37ef000
[  326.981909] >>[638] ->[648/0]
[  326.982031] 
[  326.982031] REMOTE_PAGE_REQUEST [638] 7ffff2fef000 R 4020ee from [649/0]
[  326.982244] >>[638] 7ffff2fef000 00000000ba9b1640
[  326.982410]   [638] ->[649/0] 0
[  326.982598] 
[  326.982598] REMOTE_PAGE_REQUEST [638] 7fffe9fdc000 R 401ffb from [670/0]
[  326.983054] >>[638] 7fffe9fdc000 00000000ba9b1640
[  326.983246]   [638] ->[670/0] 0
[  326.989708] 
[  326.989708] INVALIDATE_PAGE [638] 7ffff67f5000 [642/0]
[  326.990008]  =[638] 7ffff67f5000
[  326.990119] >>[638] ->[642/0]
[  326.994129] 
[  326.994129] INVALIDATE_PAGE [638] 7fffe37df000 [664/0]
[  326.994494]  =[638] 7fffe37df000
[  326.994636] >>[638] ->[664/0]
[  326.997542] 
[  326.997542] INVALIDATE_PAGE [638] 7fffeffe8000 [655/0]
[  326.997756]  =[638] 7fffeffe8000
[  326.997899] >>[638] ->[655/0]
[  327.002600] 
[  327.002600] INVALIDATE_PAGE [638] 7fffeefe6000 [657/0]
[  327.002600]  =[638] 7fffeefe6000
[  327.002600] >>[638] ->[657/0]
[  327.003616] 
[  327.003616] INVALIDATE_PAGE [638] 7fffeafde000 [668/0]
[  327.003616]  =[638] 7fffeafde000
[  327.005770] >>[638] ->[668/0]
[  327.006697] 
[  327.006697] INVALIDATE_PAGE [638] 7ffff27ed000 [650/0]
[  327.006697]  =[638] 7ffff27ed000
[  327.006809] >>[638] ->[650/0]
[  327.017380] 
[  327.017380] REMOTE_PAGE_REQUEST [638] 7ffff47f2000 R 4020ee from [646/0]
[  327.017860] >>[638] 7ffff47f2000 00000000ba9b1640
[  327.017994]   [638] ->[646/0] 0
[  327.018113] 
[  327.018113] REMOTE_PAGE_REQUEST [638] 7ffff6ff7000 R 4020ee from [641/0]
[  327.018322] >>[638] 7ffff6ff7000 00000000ba9b1640
[  327.018441]   [638] ->[641/0] 0
[  327.018534] 
[  327.018534] REMOTE_PAGE_REQUEST [638] 7fffe37e0000 R 4020ee from [664/0]
[  327.018800] >>[638] 7fffe37e0000 00000000ba9b1640
[  327.018973]   [638] ->[664/0] 0
[  327.019098] 
[  327.019098] REMOTE_PAGE_REQUEST [638] 7ffff0feb000 R 4020ee from [653/0]
[  327.019400] >>[638] 7ffff0feb000 00000000ba9b1640
[  327.019572]   [638] ->[653/0] 0
[  327.019729] 
[  327.019729] REMOTE_PAGE_REQUEST [638] 7fffed7e4000 R 4020ee from [660/0]
[  327.020053] >>[638] 7fffed7e4000 00000000ba9b1640
[  327.020220]   [638] ->[660/0] 0
[  327.020365] 
[  327.020365] INVALIDATE_PAGE [638] 7ffff17eb000 [652/0]
[  327.020612]  =[638] 7ffff17eb000
[  327.020737] >>[638] ->[652/0]
[  327.025360] 
[  327.025360] REMOTE_PAGE_REQUEST [638] 7ffff27ee000 R 4020ee from [650/0]
[  327.026270] >>[638] 7ffff27ee000 00000000ba9b1640
[  327.026460]   [638] ->[650/0] 0
[  327.029346] 
[  327.029346] REMOTE_PAGE_REQUEST [638] 7fffeafdf000 R 4020ee from [668/0]
[  327.030272] >>[638] 7fffeafdf000 00000000ba9b1640
[  327.030859]   [638] ->[668/0] 0
[  327.034974] 
[  327.034974] REMOTE_PAGE_REQUEST [638] 7ffff37f0000 R 4020ee from [648/0]
[  327.037286] >>[638] 7ffff37f0000 00000000ba9b1640
[  327.037414]   [638] ->[648/0] 0
[  327.037697] 
[  327.037697] INVALIDATE_PAGE [638] 7fffeb7df000 [667/0]
[  327.038898]  =[638] 7fffeb7df000
[  327.039043] >>[638] ->[667/0]
[  327.039219] 
[  327.039219] REMOTE_PAGE_REQUEST [638] 7ffff1fed000 R 4020ee from [651/0]
[  327.039567] >>[638] 7ffff1fed000 00000000ba9b1640
[  327.039752]   [638] ->[651/0] 0
[  327.039880] 
[  327.039880] REMOTE_PAGE_REQUEST [638] 7ffff67f6000 R 4020ee from [642/0]
[  327.040208] >>[638] 7ffff67f6000 00000000ba9b1640
[  327.040378]   [638] ->[642/0] 0
[  327.045419] 
[  327.045419] REMOTE_PAGE_REQUEST [638] 7fffeefe7000 R 4020ee from [657/0]
[  327.046742] >>[638] 7fffeefe7000 00000000ba9b1640
[  327.047701]   [638] ->[657/0] 0
[  327.050360] 
[  327.050360] REMOTE_PAGE_REQUEST [638] 7fffeffe9000 R 4020ee from [655/0]
[  327.050360] >>[638] 7fffeffe9000 00000000ba9b1640
[  327.050360]   [638] ->[655/0] 0
[  327.051292] 
[  327.051292] INVALIDATE_PAGE [638] 7fffe9fdc000 [670/0]
[  327.051292]  =[638] 7fffe9fdc000
[  327.051292] >>[638] ->[670/0]
[  327.053273] 
[  327.053273] REMOTE_PAGE_REQUEST [638] 7ffff17ec000 R 4020ee from [652/0]
[  327.056507] >>[638] 7ffff17ec000 00000000ba9b1640
[  327.058880]   [638] ->[652/0] 0
[  327.059086] 
[  327.059086] INVALIDATE_PAGE [638] 7fffecfe2000 [661/0]
[  327.059342]  =[638] 7fffecfe2000
[  327.059493] >>[638] ->[661/0]
[  327.059625] 
[  327.059625] REMOTE_PAGE_REQUEST [638] 7fffeb7e0000 R 4020ee from [667/0]
[  327.059948] >>[638] 7fffeb7e0000 00000000ba9b1640
[  327.060130]   [638] ->[667/0] 0
[  327.065467] 
[  327.065467] REMOTE_PAGE_REQUEST [638] 7fffe9fdd000 R 4020ee from [670/0]
[  327.066019] >>[638] 7fffe9fdd000 00000000ba9b1640
[  327.066198]   [638] ->[670/0] 0
[  327.069361] 
[  327.069361] REMOTE_PAGE_REQUEST [638] 7fffecfe3000 R 4020ee from [661/0]
[  327.071270] >>[638] 7fffecfe3000 00000000ba9b1640
[  327.071842]   [638] ->[661/0] 0
[  327.093392] 
[  327.093392] INVALIDATE_PAGE [638] 4c2000 [647/0]
[  327.093653]  =[638] 4c2000
[  327.093790] >>[638] ->[647/0]
[  327.109455] 
[  327.109455] INVALIDATE_PAGE [638] 7ffff3ff1000 [647/0]
[  327.109718]  =[638] 7ffff3ff1000
[  327.109838] >>[638] ->[647/0]
[  327.137503] 
[  327.137503] INVALIDATE_PAGE [638] 7ffff7ff9000 [639/0]
[  327.137783]  =[638] 7ffff7ff9000
[  327.137951] >>[638] ->[639/0]
[  327.145524] 
[  327.145524] INVALIDATE_PAGE [638] 7ffff4ff3000 [645/0]
[  327.145822]  =[638] 7ffff4ff3000
[  327.145970] >>[638] ->[645/0]
[  327.150885] 
[  327.150885] INVALIDATE_PAGE [638] 7fffef7e8000 [656/0]
[  327.150885]  =[638] 7fffef7e8000
[  327.150885] >>[638] ->[656/0]
[  327.154990] 
[  327.154990] INVALIDATE_PAGE [638] 7fffe2fdf000 [665/0]
[  327.157277]  =[638] 7fffe2fdf000
[  327.157526] >>[638] ->[665/0]
[  327.162635] 
[  327.162635] INVALIDATE_PAGE [638] 7fffec7e2000 [662/0]
[  327.162635]  =[638] 7fffec7e2000
[  327.162635] >>[638] ->[662/0]
[  327.169451] 
[  327.169451] INVALIDATE_PAGE [638] 7fffedfe5000 [659/0]
[  327.169765]  =[638] 7fffedfe5000
[  327.169866] >>[638] ->[659/0]
[  327.174068] 
[  327.174068] INVALIDATE_PAGE [638] 7ffff57f4000 [644/0]
[  327.174099]  =[638] 7ffff57f4000
[  327.174240] >>[638] ->[644/0]
[  327.181422] 
[  327.181422] INVALIDATE_PAGE [638] 7ffff07ea000 [654/0]
[  327.181736]  =[638] 7ffff07ea000
[  327.181877] >>[638] ->[654/0]
[  327.189586] 
[  327.189586] INVALIDATE_PAGE [638] 7ffff5ff5000 [643/0]
[  327.189830]  =[638] 7ffff5ff5000
[  327.189943] >>[638] ->[643/0]
[  327.197500] 
[  327.197500] INVALIDATE_PAGE [638] 7ffff6ff7000 [641/0]
[  327.197736]  =[638] 7ffff6ff7000
[  327.197876] >>[638] ->[641/0]
[  327.205422] 
[  327.205422] INVALIDATE_PAGE [638] 7fffea7de000 [669/0]
[  327.205687]  =[638] 7fffea7de000
[  327.205831] >>[638] ->[669/0]
[  327.213498] 
[  327.213498] INVALIDATE_PAGE [638] 7ffff47f2000 [646/0]
[  327.213784]  =[638] 7ffff47f2000
[  327.214054] >>[638] ->[646/0]
[  327.221475] 
[  327.221475] INVALIDATE_PAGE [638] 7fffebfe1000 [663/0]
[  327.221785]  =[638] 7fffebfe1000
[  327.221918] >>[638] ->[663/0]
[  327.229528] 
[  327.229528] INVALIDATE_PAGE [638] 7ffff2fef000 [649/0]
[  327.229883]  =[638] 7ffff2fef000
[  327.230038] >>[638] ->[649/0]
[  327.241486] 
[  327.241486] INVALIDATE_PAGE [638] 7fffed7e4000 [660/0]
[  327.241774]  =[638] 7fffed7e4000
[  327.241965] >>[638] ->[660/0]
[  327.249578] 
[  327.249578] INVALIDATE_PAGE [638] 7fffe37e0000 [664/0]
[  327.249932]  =[638] 7fffe37e0000
[  327.250085] >>[638] ->[664/0]
[  327.255211] 
[  327.255211] INVALIDATE_PAGE [638] 7ffff27ee000 [650/0]
[  327.255211]  =[638] 7ffff27ee000
[  327.255211] >>[638] ->[650/0]
[  327.261498] 
[  327.261498] INVALIDATE_PAGE [638] 7fffeafdf000 [668/0]
[  327.261753]  =[638] 7fffeafdf000
[  327.261866] >>[638] ->[668/0]
[  327.269426] 
[  327.269426] INVALIDATE_PAGE [638] 7ffff0feb000 [653/0]
[  327.269759]  =[638] 7ffff0feb000
[  327.273262] >>[638] ->[653/0]
[  327.277520] 
[  327.277520] INVALIDATE_PAGE [638] 7fffe27de000 [666/0]
[  327.277849]  =[638] 7fffe27de000
[  327.278001] >>[638] ->[666/0]
[  327.285442] 
[  327.285442] INVALIDATE_PAGE [638] 7fffee7e6000 [658/0]
[  327.285742]  =[638] 7fffee7e6000
[  327.285890] >>[638] ->[658/0]
[  327.293430] 
[  327.293430] INVALIDATE_PAGE [638] 7ffff37f0000 [648/0]
[  327.293725]  =[638] 7ffff37f0000
[  327.293867] >>[638] ->[648/0]
[  327.305504] 
[  327.305504] INVALIDATE_PAGE [638] 7ffff67f6000 [642/0]
[  327.305881]  =[638] 7ffff67f6000
[  327.306026] >>[638] ->[642/0]
[  327.313484] 
[  327.313484] INVALIDATE_PAGE [638] 7ffff1fed000 [651/0]
[  327.313833]  =[638] 7ffff1fed000
[  327.314014] >>[638] ->[651/0]
[  327.321580] 
[  327.321580] INVALIDATE_PAGE [638] 7fffeffe9000 [655/0]
[  327.321922]  =[638] 7fffeffe9000
[  327.322084] >>[638] ->[655/0]
[  327.329541] 
[  327.329541] INVALIDATE_PAGE [638] 7ffff17ec000 [652/0]
[  327.329794]  =[638] 7ffff17ec000
[  327.329893] >>[638] ->[652/0]
[  327.337458] 
[  327.337458] INVALIDATE_PAGE [638] 7fffeefe7000 [657/0]
[  327.337677]  =[638] 7fffeefe7000
[  327.337766] >>[638] ->[657/0]
[  327.345613] 
[  327.345613] INVALIDATE_PAGE [638] 7fffeb7e0000 [667/0]
[  327.345983]  =[638] 7fffeb7e0000
[  327.346160] >>[638] ->[667/0]
[  327.353438] 
[  327.353438] INVALIDATE_PAGE [638] 7fffe9fdd000 [670/0]
[  327.353737]  =[638] 7fffe9fdd000
[  327.353881] >>[638] ->[670/0]
[  327.361480] 
[  327.361480] INVALIDATE_PAGE [638] 7fffecfe3000 [661/0]
[  327.361800]  =[638] 7fffecfe3000
[  327.361945] >>[638] ->[661/0]
[  327.369433] 
[  327.369433] INVALIDATE_PAGE [638] 7ffff77f8000 [640/0]
[  327.369746]  =[638] 7ffff77f8000
[  327.369892] >>[638] ->[640/0]
[  327.409514] 
[  327.409514] VMA_OP_REQUEST [638] munmap 7ffff77f9000 801000

mt spawns a number of threads and migrates each back and forth 100 times. However, the bug it seems to consistently appear after the last migration after migrating 100 times (i.e., the end of mt's execution). This bug appears even when the program only loops once, although the kernel output is different in that only the NULL pointer dereference error appears.

I.e., here is example output from the origin kernel of mt only looping once

[12348.446032] EXITED [945] local / 0x0
[12348.447904] EXITED [955] local / 0x0
[12348.448514] EXITED [938] local / 0x0
[12348.448881] EXITED [953] local / 0x0
[12348.449236] EXITED [957] local / 0x0
[12348.449593] EXITED [936] local / 0x0
[12348.452095]   [931] ->munmap [10317/1] 7ffff77f9000+801000
[12348.452920] EXITED [943] local / 0x0
[12348.453716] EXITED [960] local / 0x0
[12655.261782] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[12655.262371] PGD 1382e3067 P4D 1382e3067 PUD 138262067 PMD 0 
[12655.262759] Oops: 0000 [#7] SMP NOPTI
[12655.263009] CPU: 0 PID: 931 Comm: mt Tainted: G      D           4.19.0-rc5-popcorn+ #7
[12655.263503] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20181104
[12655.264148] RIP: 0010:___cache_free+0x19/0x1b0
[12655.264438] Code: 0f 94 c0 c3 31 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 413
[12655.265580] RSP: 0018:ffffc900009bbe08 EFLAGS: 00000092
[12655.265590] RAX: ffffea0001bfffc8 RBX: 0000000000000286 RCX: ffffea0001bfffcf
[12655.265590] RDX: ffffffff811429c2 RSI: ffffffffffffff92 RDI: 0000000000000000
[12655.265590] RBP: ffffffffffffff92 R08: 00000b82885f6c00 R09: 0000000000000000
[12655.265590] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff811429c2
[12655.265590] R13: ffff880138232400 R14: 0000000000000001 R15: 0000000000000000
[12655.265590] FS:  00000000004cb880(0000) GS:ffff88013fc00000(0000) knlGS:000000000000000
[12655.265590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12655.265590] CR2: 0000000000000000 CR3: 000000013828e000 CR4: 00000000000006f0
[12655.265590] Call Trace:
[12655.265590]  ? wait_for_common_io.constprop.2+0xab/0x130
[12655.265590]  ? vma_server_munmap_origin+0x122/0x130
[12655.265590]  kfree+0x7a/0xd0
[12655.265590]  vma_server_munmap_origin+0x122/0x130
[12655.265590]  __x64_sys_munmap+0x6f/0x80
[12655.265590]  do_syscall_64+0x64/0x310
[12655.265590]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[12655.265590] RIP: 0033:0x44408b
[12655.265590] Code: c7 c2 b8 ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff e9 77 ff ff ff8
[12655.265590] RSP: 002b:00007fffffffe1b8 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[12655.265590] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044408b
[12655.265590] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[12655.265590] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[12655.265590] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000004c20f0
[12655.265590] R13: 0000000002800000 R14: 0000000000000000 R15: 0000000000000000
[12655.265590] Modules linked in: msg_socket
[12655.265590] CR2: 0000000000000000
[12655.265590] ---[ end trace 27318a798b6d1dba ]---
[12655.265590] RIP: 0010:___cache_free+0x19/0x1b0
[12655.265590] Code: 0f 94 c0 c3 31 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 413
[12655.265590] RSP: 0018:ffffc90000617e08 EFLAGS: 00000092
[12655.265590] RAX: ffffea0001bfffc8 RBX: 0000000000000286 RCX: ffffea0001bfffcf
[12655.265590] RDX: ffffffff811429c2 RSI: ffffffffffffff92 RDI: 0000000000000000
[12655.265590] RBP: ffffffffffffff92 R08: 0000008abd736400 R09: 0000000000000000
[12655.265590] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff811429c2
[12655.265590] R13: ffff88013a68bc00 R14: 0000000000000001 R15: 0000000000000000
[12655.265590] FS:  00000000004cb880(0000) GS:ffff88013fc00000(0000) knlGS:000000000000000
[12655.265590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12655.265590] CR2: 0000000000000000 CR3: 000000013828e000 CR4: 00000000000006f0
[12655.269828] EXITED [931] local / 0x9
[12655.270074] TERMINATE [10317/1] with 0x9
[12655.275913]   [931] zap 400000
[12655.276095]   [931] zap 401000
[12655.276171]   [931] zap 402000
[12655.276247]   [931] zap 403000
[12655.276323]   [931] zap 404000
[12655.276467]   [931] zap 405000
[12655.276575]   [931] zap 406000
[12655.276688]   [931] zap 407000
[12655.276785]   [931] zap 408000

Here is the log on the origin after only running loop once

[  337.766748] EXITED [735] local / 0x0
[  337.767063] EXITED [743] local / 0x0
[  337.767605] >>[718] 7ffff77f8000 000000002ee79620
[  337.767847] EXITED [726] local / 0x0
[  337.768577] EXITED [728] local / 0x0
[  337.770187] EXITED [749] local / 0x0
[  337.770635] EXITED [720] local / 0x0
[  337.771511]   [718] ->munmap [650/1] 7ffff77f9000+801000
[  337.772156] EXITED [748] local / 0x0
[  337.773772] EXITED [740] local / 0x0
[  485.342701] INFO: task mt:718 blocked for more than 120 seconds.
[  485.345017]       Not tainted 4.19.0-rc5-popcorn+ #217
[  485.346533] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  485.352218] mt              D    0   718    622 0x00000000
[  485.356474] Call Trace:
[  485.359279]  ? __schedule+0x1e7/0x790
[  485.360186]  ? __switch_to_asm+0x40/0x70
[  485.361582]  schedule+0x28/0x80
[  485.364601]  schedule_timeout+0x1c2/0x460
[  485.365670]  ? collect_expired_timers+0xa0/0xa0
[  485.366523]  io_schedule_timeout+0x19/0x40
[  485.367875]  wait_for_common_io.constprop.2+0xa0/0x130
[  485.370794]  ? wake_up_q+0x70/0x70
[  485.371497]  wait_at_station+0x39/0x60
[  485.372547]  vma_server_munmap_origin+0x11a/0x130
[  485.374364]  __x64_sys_munmap+0x6f/0x80
[  485.378599]  do_syscall_64+0x64/0x310
[  485.379627]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  485.381270] RIP: 0033:0x44408b
[  485.382600] Code: Bad RIP value.
[  485.383525] RSP: 002b:00007fffffffe1b8 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[  485.386518] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044408b
[  485.388518] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[  485.390492] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[  485.394596] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000004c20f0
[  485.396492] R13: 0000000002800000 R14: 0000000000000000 R15: 0000000000000000
[  606.182191] INFO: task mt:718 blocked for more than 120 seconds.
[  606.182725]       Not tainted 4.19.0-rc5-popcorn+ #217
[  606.183112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  606.183498] mt              D    0   718    622 0x00000000
[  606.183773] Call Trace:
[  606.183964]  ? __schedule+0x1e7/0x790
[  606.184193]  ? __switch_to_asm+0x40/0x70
[  606.184450]  schedule+0x28/0x80
[  606.184633]  schedule_timeout+0x1c2/0x460
[  606.184865]  ? collect_expired_timers+0xa0/0xa0
[  606.185131]  io_schedule_timeout+0x19/0x40
[  606.185480]  wait_for_common_io.constprop.2+0xa0/0x130
[  606.186091]  ? wake_up_q+0x70/0x70
[  606.186322]  wait_at_station+0x39/0x60
[  606.186515]  vma_server_munmap_origin+0x11a/0x130
[  606.186889]  __x64_sys_munmap+0x6f/0x80
[  606.187111]  do_syscall_64+0x64/0x310
[  606.187295]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  606.187582] RIP: 0033:0x44408b
[  606.187798] Code: Bad RIP value.
[  606.187955] RSP: 002b:00007fffffffe1b8 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[  606.188294] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044408b
[  606.188643] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[  606.188958] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[  606.189252] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000004c20f0
[  606.190066] R13: 0000000002800000 R14: 0000000000000000 R15: 0000000000000000
[  645.097203] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  645.097773] PGD 13a10e067 P4D 13a10e067 PUD 13a10f067 PMD 0 
[  645.098402] Oops: 0000 [#1] SMP NOPTI
[  645.098758] CPU: 0 PID: 718 Comm: mt Not tainted 4.19.0-rc5-popcorn+ #217
[  645.099273] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20181104
[  645.100072] RIP: 0010:___cache_free+0x19/0x1b0
[  645.100433] Code: 0f 94 c0 c3 31 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 413
[  645.100473] RSP: 0018:ffffc900001efe08 EFLAGS: 00000092
[  645.100473] RAX: ffffea0001bfffc8 RBX: 0000000000000286 RCX: ffffea0001bfffcf
[  645.100473] RDX: ffffffff811429c2 RSI: ffffffffffffff92 RDI: 0000000000000000
[  645.100473] RBP: ffffffffffffff92 R08: 0000009632bf4c00 R09: 0000000000000000
[  645.100473] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff811429c2
[  645.100473] R13: ffff88013a3c4800 R14: 0000000000000001 R15: 0000000000000000
[  645.100473] FS:  00000000004cb880(0000) GS:ffff88013fc00000(0000) knlGS:000000000000000
[  645.100473] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  645.100473] CR2: 0000000000000000 CR3: 000000013a18a000 CR4: 00000000000006f0
[  645.100473] Call Trace:
[  645.100473]  ? wait_for_common_io.constprop.2+0xab/0x130
[  645.100473]  ? vma_server_munmap_origin+0x122/0x130
[  645.100473]  kfree+0x7a/0xd0
[  645.100473]  vma_server_munmap_origin+0x122/0x130
[  645.100473]  __x64_sys_munmap+0x6f/0x80
[  645.100473]  do_syscall_64+0x64/0x310
[  645.100473]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  645.100473] RIP: 0033:0x44408b
[  645.100473] Code: c7 c2 b8 ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff e9 77 ff ff ff8
[  645.100473] RSP: 002b:00007fffffffe1b8 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[  645.100473] RAX: ffffffffffffffda RBX: 00007ffff7ff99c0 RCX: 000000000044408b
[  645.100473] RDX: 0000000000000000 RSI: 0000000000801000 RDI: 00007ffff77f9000
[  645.100473] RBP: 00007ffff77f89c0 R08: 0000000000000001 R09: 0000000000000000
[  645.100473] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000004c20f0
[  645.100473] R13: 0000000002800000 R14: 0000000000000000 R15: 0000000000000000
[  645.100473] Modules linked in: msg_socket
[  645.100473] CR2: 0000000000000000
[  645.100473] ---[ end trace 53d25fd39f776c31 ]---
[  645.100473] RIP: 0010:___cache_free+0x19/0x1b0
[  645.100473] Code: 0f 94 c0 c3 31 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 413
[  645.100473] RSP: 0018:ffffc900001efe08 EFLAGS: 00000092
[  645.100473] RAX: ffffea0001bfffc8 RBX: 0000000000000286 RCX: ffffea0001bfffcf
[  645.100473] RDX: ffffffff811429c2 RSI: ffffffffffffff92 RDI: 0000000000000000
[  645.100473] RBP: ffffffffffffff92 R08: 0000009632bf4c00 R09: 0000000000000000
[  645.100473] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff811429c2
[  645.100473] R13: ffff88013a3c4800 R14: 0000000000000001 R15: 0000000000000000
[  645.100473] FS:  00000000004cb880(0000) GS:ffff88013fc00000(0000) knlGS:000000000000000
[  645.100473] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  645.100473] CR2: 0000000000000000 CR3: 000000013a18a000 CR4: 00000000000006f0
[  645.100571] EXITED [718] local / 0x9
[  645.100781] TERMINATE [650/1] with 0x9
[  645.101371]   [718] zap 400000
[  645.101592]   [718] zap 401000
[  645.104545]   [718] zap 402000
[  645.104677]   [718] zap 403000
[  645.104791]   [718] zap 404000
[  645.104928]   [718] zap 405000
[  645.105063]   [718] zap 406000
[  645.105185]   [718] zap 407000
[  645.105326]   [718] zap 408000
[  645.105456]   [718] zap 409000
[  645.105602]   [718] zap 40a000
[  645.105732]   [718] zap 40b000
[  645.105887]   [718] zap 40c000
[  645.106032]   [718] zap 40d000
[  645.106164]   [718] zap 40e000
[  645.106268]   [718] zap 40f000
[  645.106411]   [718] zap 410000
[  645.106568]   [718] zap 411000
[  645.106714]   [718] zap 412000
[  645.106840]   [718] zap 413000

With LOOP=1,

bxatnarf commented 5 years ago

There seem to be two bugs that cause this behavior:

  1. When pthread_join is invoked, it eventually calls munmap. This causes popcorn to send a PCN_KMSG_TYPE_VMA_OP_REQUEST message to the other host and block until it receives a response.
  2. When it finally times out, pcn_kmsg_done is called with the return value instead of a message pointer. pcn_kmsg_done assumes its argument is a struct pcn_kmsg_hdr pointer and subsequently tries to kfree() it, resulting in a crash
bxatnarf commented 5 years ago
2. When it finally times out, `pcn_kmsg_done` is called with the return value instead of a message pointer. `pcn_kmsg_done` assumes its argument is a `struct pcn_kmsg_hdr` pointer and subsequently tries to `kfree()` it, resulting in a crash

There are other instances of improper use of pcn_kmsg_done, we should hunt these all down and fix them - see issue https://github.com/ssrg-vt/popcorn-kernel/issues/66

bxatnarf commented 5 years ago

Fixed by https://github.com/ssrg-vt/popcorn-kernel/commit/071f622ec709327f28cae92487bedd833e17bf72