bxatnarf closed this issue 4 years ago
I sometimes see the following errors while loading the popcorn msg layer. When this happens, insmod does not return on the affected host, although it does return successfully on the unaffected host.
[ 1378.105543] BUG: unable to handle page fault for address: ffff8881399ae000
[ 1378.105973] #PF: supervisor write access in kernel mode
[ 1378.106139] #PF: error_code(0x000b) - reserved bit violation
[ 1378.106348] PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 139a90063 PTE 800ffffec6651063
[ 1378.106777] Oops: 000b [#1] SMP NOPTI
[ 1378.106968] CPU: 0 PID: 639 Comm: sudo Not tainted 5.2.0-rc4-popcorn+ #1
[ 1378.107157] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181124
[ 1378.107532] RIP: 0010:__tlb_remove_page_size+0x68/0x90
[ 1378.107713] Code: e0 5b 41 5c c3 83 7f 1c 13 74 33 31 f6 bf 00 28 00 00 e8 bb f7 00 00c
[ 1378.108188] RSP: 0018:ffffc9000070bc70 EFLAGS: 00000202
[ 1378.108371] RAX: ffff8881399ae000 RBX: ffffc9000070bda0 RCX: 000001fe00000000
[ 1378.108371] RDX: ffff888000000000 RSI: 00000000ffffffff RDI: 0000000000000246
[ 1378.108371] RBP: ffffea00043c5640 R08: 0000000000000000 R09: 0000000000000001
[ 1378.108371] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1378.108371] R13: 000055555555c000 R14: ffffc9000070bda0 R15: ffff888139a1ea10
[ 1378.108371] FS: 0000000000000000(0000) GS:ffff88813b600000(0000) knlGS:000000000000000
[ 1378.108371] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1378.108371] CR2: ffff8881399ae000 CR3: 0000000139b40000 CR4: 00000000000006f0
[ 1378.108371] Call Trace:
[ 1378.108371] unmap_page_range+0x4c0/0x800
[ 1378.108371] unmap_vmas+0x32/0x50
[ 1378.108371] exit_mmap+0x8e/0x160
[ 1378.108371] mmput+0x41/0xf0
[ 1378.108371] do_exit+0x2bb/0xba0
[ 1378.108371] ? sched_clock_local+0x12/0x80
[ 1378.108371] do_group_exit+0x39/0xb0
[ 1378.108371] __x64_sys_exit_group+0x14/0x20
[ 1378.108371] do_syscall_64+0x69/0x440
[ 1378.108371] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1378.108371] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1378.108371] RIP: 0033:0x7ffff72992e9
[ 1378.108371] Code: 00 41 b8 3c 00 00 00 eb 19 0f 1f 84 00 00 00 00 00 48 89 d7 44 89 c0e
[ 1378.108371] RSP: 002b:00007fffffffe458 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 1378.108371] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff72992e9
[ 1378.108371] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 1378.108371] RBP: 00007ffff7580860 R08: 000000000000003c R09: 00000000000000e7
[ 1378.108371] R10: fffffffffffffeb0 R11: 0000000000000246 R12: 00007ffff7580860
[ 1378.108371] R13: 00007ffff7585c60 R14: 00005555557845d0 R15: 00005555557845d0
[ 1378.108371] Modules linked in: msg_socket
[ 1378.108371] CR2: ffff8881399ae000
[ 1378.108371] ---[ end trace 11f90f2492eb93bb ]---
[ 1378.108371] RIP: 0010:__tlb_remove_page_size+0x68/0x90
[ 1378.108371] Code: e0 5b 41 5c c3 83 7f 1c 13 74 33 31 f6 bf 00 28 00 00 e8 bb f7 00 00c
[ 1378.108371] RSP: 0018:ffffc9000070bc70 EFLAGS: 00000202
[ 1378.108371] RAX: ffff8881399ae000 RBX: ffffc9000070bda0 RCX: 000001fe00000000
[ 1378.108371] RDX: ffff888000000000 RSI: 00000000ffffffff RDI: 0000000000000246
[ 1378.108371] RBP: ffffea00043c5640 R08: 0000000000000000 R09: 0000000000000001
[ 1378.108371] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1378.108371] R13: 000055555555c000 R14: ffffc9000070bda0 R15: ffff888139a1ea10
[ 1378.108371] FS: 0000000000000000(0000) GS:ffff88813b600000(0000) knlGS:000000000000000
[ 1378.108371] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1378.108371] CR2: ffff8881399ae000 CR3: 0000000139b40000 CR4: 00000000000006f0
[ 1378.108371] Fixing recursive fault but reboot is needed!
[ 1378.108371] BUG: scheduling while atomic: sudo/639/0x00000002
[ 1378.108371] INFO: lockdep is turned off.
[ 1378.108371] Modules linked in: msg_socket
[ 1378.108371] irq event stamp: 13366
[ 1378.108371] hardirqs last enabled at (13365): [<ffffffff811d4dec>] get_page_from_free0
[ 1378.108371] hardirqs last disabled at (13366): [<ffffffff81001a1c>] trace_hardirqs_offc
[ 1378.108371] softirqs last enabled at (13344): [<ffffffff8180032e>] __do_softirq+0x32e9
[ 1378.108371] softirqs last disabled at (13331): [<ffffffff81068377>] irq_exit+0x97/0xd0
[ 1378.108371] CPU: 0 PID: 639 Comm: sudo Tainted: G D 5.2.0-rc4-popcorn+ 1
[ 1378.108371] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181124
[ 1378.108371] Call Trace:
[ 1378.108371] dump_stack+0x67/0x90
[ 1378.108371] __schedule_bug.cold+0x1a/0x27
[ 1378.108371] __schedule+0x5a2/0x830
[ 1378.108371] ? printk+0x58/0x6f
[ 1378.108371] schedule+0x3a/0xb0
[ 1378.108371] do_exit.cold+0x62/0x91
[ 1378.108371] rewind_stack_do_exit+0x17/0x20
[ 1387.253528] BUG: unable to handle page fault for address: ffff88813a2ef000
[ 1387.253849] #PF: supervisor write access in kernel mode
[ 1387.254105] #PF: error_code(0x000b) - reserved bit violation
[ 1387.254375] PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 13a3e5063 PTE 800ffffec5d10063
[ 1387.254716] Oops: 000b [#2] SMP NOPTI
[ 1387.254873] CPU: 0 PID: 1 Comm: systemd Tainted: G D W 5.2.0-rc4-popcorn+1
[ 1387.255223] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181124
[ 1387.255642] RIP: 0010:cache_alloc_refill+0x3db/0x6b0
[ 1387.255860] Code: 8b 57 24 31 db 85 d2 74 2d 49 8b 47 50 48 85 c0 74 11 89 df 41 0f afb
[ 1387.256771] RSP: 0018:ffffc9000005bcf8 EFLAGS: 00000246
[ 1387.256943] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888000000000
[ 1387.256943] RDX: ffff88813a2ef000 RSI: 0000000000000006 RDI: 0000000040002000
[ 1387.256943] RBP: 0000000000000400 R08: 00000000001fc1cf R09: 0000000000000000
[ 1387.256943] R10: 0000000000000001 R11: 00000000001fc1c8 R12: ffffea00044ba448
[ 1387.256943] R13: 0000000000000cc0 R14: 000000000000000c R15: ffff88813b0006c0
[ 1387.256943] FS: 00007f8aef9b8880(0000) GS:ffff88813b600000(0000) knlGS:000000000000000
[ 1387.256943] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1387.256943] CR2: ffff88813a2ef000 CR3: 000000013a2ae000 CR4: 00000000000006f0
[ 1387.256943] Call Trace:
[ 1387.256943] kmem_cache_alloc_trace+0x1f5/0x240
[ 1387.256943] proc_cgroup_show+0x30/0x2a0
[ 1387.256943] proc_single_show+0x51/0x90
[ 1387.256943] seq_read+0xd5/0x400
[ 1387.256943] vfs_read+0xb2/0x170
[ 1387.256943] ksys_read+0x68/0xe0
[ 1387.256943] do_syscall_64+0x69/0x440
[ 1387.256943] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1387.256943] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1387.256943] RIP: 0033:0x7f8aef2baba0
[ 1387.256943] Code: 0b 31 c0 48 83 c4 08 e9 be fe ff ff 48 8d 3d 3f f0 08 00 e8 e2 ce 014
[ 1387.256943] RSP: 002b:00007ffe0cff6be8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 1387.256943] RAX: ffffffffffffffda RBX: 00005591dfc7c270 RCX: 00007f8aef2baba0
[ 1387.256943] RDX: 0000000000000400 RSI: 00007f8aef9c3000 RDI: 000000000000000d
[ 1387.256943] RBP: 000000000000000a R08: 00000000ffffffff R09: 0000000000000000
[ 1387.256943] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000000
[ 1387.256943] R13: 0000000000000000 R14: 00005591dfc7c270 R15: 00000000000007ff
[ 1387.256943] Modules linked in: msg_socket
[ 1387.256943] CR2: ffff88813a2ef000
[ 1387.256943] ---[ end trace 11f90f2492eb93bc ]---
[ 1387.256943] RIP: 0010:__tlb_remove_page_size+0x68/0x90
[ 1387.256943] Code: e0 5b 41 5c c3 83 7f 1c 13 74 33 31 f6 bf 00 28 00 00 e8 bb f7 00 00c
[ 1387.256943] RSP: 0018:ffffc9000070bc70 EFLAGS: 00000202
[ 1387.256943] RAX: ffff8881399ae000 RBX: ffffc9000070bda0 RCX: 000001fe00000000
[ 1387.256943] RDX: ffff888000000000 RSI: 00000000ffffffff RDI: 0000000000000246
[ 1387.256943] RBP: ffffea00043c5640 R08: 0000000000000000 R09: 0000000000000001
[ 1387.256943] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1387.256943] R13: 000055555555c000 R14: ffffc9000070bda0 R15: ffff888139a1ea10
[ 1387.256943] FS: 00007f8aef9b8880(0000) GS:ffff88813b600000(0000) knlGS:000000000000000
[ 1387.256943] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1387.256943] CR2: ffff88813a2ef000 CR3: 000000013a2ae000 CR4: 00000000000006f0
[ 1387.256943] BUG: sleeping function called from invalid context at include/linux/percpu4
[ 1387.256943] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: systemd
[ 1387.256943] INFO: lockdep is turned off.
[ 1387.256943] irq event stamp: 488400
[ 1387.256943] hardirqs last enabled at (488399): [<ffffffff8154d189>] _raw_spin_unlock_0
[ 1387.256943] hardirqs last disabled at (488400): [<ffffffff81546757>] __schedule+0xb7/00
[ 1387.256943] softirqs last enabled at (488246): [<ffffffff814f37db>] unix_sock_destruc0
[ 1387.256943] softirqs last disabled at (488244): [<ffffffff814f37db>] unix_sock_destruc0
[ 1387.256943] CPU: 0 PID: 1 Comm: systemd Tainted: G D W 5.2.0-rc4-popcorn+1
[ 1387.256943] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181124
[ 1387.256943] Call Trace:
[ 1387.256943] dump_stack+0x67/0x90
[ 1387.256943] ___might_sleep.cold+0x9f/0xaf
[ 1387.256943] exit_signals+0x1c/0x200
[ 1387.256943] do_exit+0xb0/0xba0
[ 1387.256943] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1387.256943] rewind_stack_do_exit+0x17/0x20
[ 1387.257135] BUG: unable to handle page fault for address: ffff88813a2ee000
[ 1387.257435] #PF: supervisor write access in kernel mode
[ 1387.257589] #PF: error_code(0x000b) - reserved bit violation
[ 1387.257853] PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 13a3e5063 PTE 800ffffec5d11063
[ 1387.258199] Oops: 000b [#3] SMP NOPTI
[ 1387.258368] CPU: 0 PID: 1 Comm: systemd Tainted: G D W 5.2.0-rc4-popcorn+1
[ 1387.258714] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181124
[ 1387.259054] RIP: 0010:__tlb_remove_page_size+0x68/0x90
[ 1387.259197] Code: e0 5b 41 5c c3 83 7f 1c 13 74 33 31 f6 bf 00 28 00 00 e8 bb f7 00 00c
[ 1387.259968] RSP: 0018:ffffc9000005bd10 EFLAGS: 00000202
[ 1387.260129] RAX: ffff88813a2ee000 RBX: ffffc9000005be40 RCX: 000001fe00000000
[ 1387.260533] RDX: ffff888000000000 RSI: 0000000000000000 RDI: ffffffff811d4dec
[ 1387.260843] RBP: ffffea00045e6568 R08: 00000000001fc1c8 R09: 0000000000000000
[ 1387.260943] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 1387.260943] R13: 00005591de19f000 R14: ffffc9000005be40 R15: ffff88813a18cf18
[ 1387.260943] FS: 0000000000000000(0000) GS:ffff88813b600000(0000) knlGS:000000000000000
[ 1387.260943] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1387.260943] CR2: ffff88813a2ee000 CR3: 000000013a2ae000 CR4: 00000000000006f0
[ 1387.260943] Call Trace:
[ 1387.260943] unmap_page_range+0x4c0/0x800
[ 1387.260943] unmap_vmas+0x32/0x50
[ 1387.260943] exit_mmap+0x8e/0x160
[ 1387.260943] mmput+0x41/0xf0
[ 1387.260943] do_exit+0x2bb/0xba0
[ 1387.260943] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1387.260943] rewind_stack_do_exit+0x17/0x20
[ 1387.260943] Modules linked in: msg_socket
[ 1387.260943] CR2: ffff88813a2ee000
[ 1387.260943] ---[ end trace 11f90f2492eb93bd ]---
[ 1387.260943] RIP: 0010:__tlb_remove_page_size+0x68/0x90
[ 1387.260943] Code: e0 5b 41 5c c3 83 7f 1c 13 74 33 31 f6 bf 00 28 00 00 e8 bb f7 00 00c
[ 1387.260943] RSP: 0018:ffffc9000070bc70 EFLAGS: 00000202
[ 1387.260943] RAX: ffff8881399ae000 RBX: ffffc9000070bda0 RCX: 000001fe00000000
[ 1387.260943] RDX: ffff888000000000 RSI: 00000000ffffffff RDI: 0000000000000246
[ 1387.260943] RBP: ffffea00043c5640 R08: 0000000000000000 R09: 0000000000000001
[ 1387.260943] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1387.260943] R13: 000055555555c000 R14: ffffc9000070bda0 R15: ffff888139a1ea10
[ 1387.260943] FS: 0000000000000000(0000) GS:ffff88813b600000(0000) knlGS:000000000000000
[ 1387.260943] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1387.260943] CR2: ffff88813a2ee000 CR3: 000000013a2ae000 CR4: 00000000000006f0
[ 1387.260943] Fixing recursive fault but reboot is needed!
[ 1387.260943] BUG: scheduling while atomic: systemd/1/0x00000002
[ 1387.260943] INFO: lockdep is turned off.
[ 1387.260943] Modules linked in: msg_socket
[ 1387.260943] irq event stamp: 488400
[ 1387.260943] hardirqs last enabled at (488399): [<ffffffff8154d189>] _raw_spin_unlock_0
[ 1387.260943] hardirqs last disabled at (488400): [<ffffffff81546757>] __schedule+0xb7/00
[ 1387.260943] softirqs last enabled at (488246): [<ffffffff814f37db>] unix_sock_destruc0
[ 1387.260943] softirqs last disabled at (488244): [<ffffffff814f37db>] unix_sock_destruc0
[ 1387.260943] CPU: 0 PID: 1 Comm: systemd Tainted: G D W 5.2.0-rc4-popcorn+1
[ 1387.260943] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181124
[ 1387.260943] Call Trace:
[ 1387.260943] dump_stack+0x67/0x90
[ 1387.260943] __schedule_bug.cold+0x1a/0x27
[ 1387.260943] __schedule+0x5a2/0x830
[ 1387.260943] ? printk+0x58/0x6f
[ 1387.260943] schedule+0x3a/0xb0
[ 1387.260943] do_exit.cold+0x62/0x91
[ 1387.260943] rewind_stack_do_exit+0x17/0x20
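For anyone reading the oopses above, here is a small sketch that decodes the reported error_code(0x000b) and pulls the physical-address field out of the first PTE value. It assumes the standard x86-64 page-fault error-code bits and the 4 KiB PTE layout with 4-level paging; the function names are made up for this illustration and are not part of the kernel or of popcorn. Whether the extracted address actually exceeds the guest's physical address width cannot be told from the log alone.

```python
# Hypothetical helpers for reading the oops; not kernel or popcorn code.
# Each entry: (mask, meaning when the bit is set, meaning when it is clear).
PF_BITS = [
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor (kernel) mode"),
    (0x08, "reserved bit set in a paging-structure entry", None),
    (0x10, "instruction fetch", None),
]

def decode_pf_error_code(code):
    """Return a human-readable description of each relevant error-code bit."""
    out = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            out.append(if_set)
        elif if_clear is not None:
            out.append(if_clear)
    return out

def pte_phys_addr(pte):
    """Extract bits 51:12 (the page frame address) from a 4 KiB PTE."""
    return pte & 0x000FFFFFFFFFF000

# Values copied from the first oops in the log.
for meaning in decode_pf_error_code(0x000B):
    print(meaning)
print(hex(pte_phys_addr(0x800FFFFEC6651063)))
```

The decoded bits agree with the two "#PF:" lines in the log (supervisor write, reserved bit violation). If the extracted frame address is wider than the physical address width the guest CPU reports, that would be consistent with the reserved-bit fault, but that is an inference and not something the log states.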
Closing for now in relation to the referenced message in #91.