Closed cornejo closed 4 years ago
This must be the same kernel lockup that happens every few minutes when using LLDB. It's great you can reproduce this and get a backtrace! But it seems your backtrace is missing symbols for our LKM (the ??
s), is there any way you can get those? (I don't know enough about kernel debugging to suggest how, but symbol names & debuginfo seem to be present in darling-mach.ko
as-is).
I've downloaded kernel source and given the kernel module to gdb - it finds no debug symbols unfortunately, but I'm still able to get the function addresses.
Repeating the previous test I crashed with the following:
(gdb) add-symbol-file darling-mach.ko 0xffffffffc05ee000 -s .data 0xffffffffc0641020 -s .bss 0xffffffffc06417c0 add symbol table from file "darling-mach.ko" at .text_addr = 0xffffffffc05ee000 .data_addr = 0xffffffffc0641020 .bss_addr = 0xffffffffc06417c0 (y or n) y Reading symbols from darling-mach.ko...(no debugging symbols found)...done. (gdb) c Continuing.
In testing whether disk io was a major contributor to this issue I was able to replicate it with the following:
while true; do date; done
Four connections into my test vm was sufficient to have it hang in a short amount of time. Here's the kernel backtrace
(gdb) add-symbol-file darling-mach.ko 0xffffffffc0627000 -s .data 0xffffffffc067a020 -s .bss 0xffffffffc067a7c0 add symbol table from file "darling-mach.ko" at .text_addr = 0xffffffffc0627000 .data_addr = 0xffffffffc067a020 .bss_addr = 0xffffffffc067a7c0 (y or n) y Reading symbols from darling-mach.ko...(no debugging symbols found)...done. (gdb) c Continuing.
FWIW here's a dump of a failure with a debug build of the kernel module. The issue appears to be a deadlock regarding getting a lock on the current task.
(gdb) target remote :1234 Remote debugging using :1234 native_safe_halt () at /build/linux-uQJ2um/linux-4.15.0/arch/x86/include/asm/irqflags.h:55 55 } (gdb) add-symbol-file darling-mach.ko 0xffffffffc0624000 -s .data 0xffffffffc0677020 -s .bss 0xffffffffc0624000 add symbol table from file "darling-mach.ko" at .text_addr = 0xffffffffc0624000 .data_addr = 0xffffffffc0677020 .bss_addr = 0xffffffffc0624000 (y or n) y Reading symbols from darling-mach.ko...done. (gdb) c Continuing.
I recompiled the kernel module with printk messages whenever acquiring or releasing locks in task_registry.c. Unfortunately it doesn't seem to have revealed the cause
[ 63.63506333] Darling Mach: <1269> Attempting to acquire read lock
[ 63.63506746] Darling Mach: <1269> Released read lock
[ 63.63570438] not implemented: thread_poll_yield()
[ 63.63580171] not implemented: thread_poll_yield()
[ 63.63635383] not implemented: thread_poll_yield()
[ 63.63649963] not implemented: thread_poll_yield()
[ 63.63720524] not implemented: thread_poll_yield()
[ 63.63752840] not implemented: thread_poll_yield()
[ 63.63767189] Darling Mach: <1269> Attempting to acquire read lock
[ 63.63767772] Darling Mach: <1269> Released read lock
[ 63.63768821] Darling Mach: <1269> Attempting to acquire read lock
[ 63.63769213] Darling Mach: <1269> Released read lock
[ 63.65765180] Darling Mach: <5723> Attempting to acquire read lock
[ 63.65766462] Darling Mach: <5723> Released read lock
[ 63.65821582] Darling Mach: <6646> Attempting to acquire read lock
[ 63.65822188] Darling Mach: <6646> Released read lock
[ 63.65822747] Darling Mach: <6646> Attempting to acquire read lock
[ 63.65823210] Darling Mach: <6646> Released read lock
[ 63.65826107] Darling Mach: <6646> Attempting to acquire read lock
[ 63.65826646] Darling Mach: <6646> Released read lock
[ 63.65828627] Darling Mach: <6646> Attempting to acquire read lock
[ 63.65829032] Darling Mach: <6646> Released read lock
[ 63.65999107] Darling Mach: <1252> Attempting to acquire read lock
[ 63.66000234] Darling Mach: <1252> Released read lock
[ 63.66071284] Darling Mach: <1252> Attempting to acquire read lock
[ 63.66072020] Darling Mach: <1252> Released read lock
[ 63.66166439] not implemented: thread_poll_yield()
[ 63.66217029] not implemented: thread_poll_yield()
[ 63.66236840] Darling Mach: <1269> Attempting to acquire read lock
[ 63.66237526] Darling Mach: <1269> Released read lock
[ 63.66238492] Darling Mach: <1269> Attempting to acquire read lock
[ 63.66238893] Darling Mach: <1269> Released read lock
[ 63.66263753] There is a pending message on port
[ 63.66264623] THREAD_NULL
[ 63.66294079] not implemented: thread_poll_yield()
[ 63.66307881] not implemented: thread_poll_yield()
[ 63.66329683] There is a pending message on port
[ 63.66330380] THREAD_NULL
[ 63.66357552] not implemented: thread_poll_yield()
[ 63.66375244] not implemented: thread_poll_yield()
[ 63.66460184] Darling Mach: <6646> Attempting to acquire read lock
[ 63.66460868] Darling Mach: <6646> Released read lock
[ 63.66499138] not implemented: thread_poll_yield()
[ 63.66538170] not implemented: thread_poll_yield()
[ 63.66554047] Darling Mach: <1269> Attempting to acquire read lock
[ 63.73889354] Darling Mach: <1252> Attempting to acquire read lock
[ 63.84485274] Darling Mach: <1267> Attempting to acquire read lock
[ 84.16010186] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [launchd:1252]
[ 84.16036702] Modules linked in: overlay darling_mach(OE) binfmt_misc snd_hda_codec_generic joydev input_leds snd_hda_intel snd_hda_codec snd_hda_core serio_raw snd_hwdep snd_pcm snd_timer mac_hid snd soundcore qemu_fw_cfg sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc qxl ttm aesni_intel aes_x86_64 crypto_simd glue_helper cryptd drm_kms_helper syscopyarea sysfillrect psmouse sysimgblt virtio_blk floppy fb_sys_fops virtio_net drm i2c_piix4 pata_acpi
[ 84.16075356] CPU: 1 PID: 1252 Comm: launchd Tainted: G OE 4.15.0-45-generic #48-Ubuntu
[ 84.16076026] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 84.16094490] RIP: 0010:native_queued_spin_lock_slowpath+0x13/0x1a0
[ 84.16095289] RSP: 0018:ffffc90001fdfd28 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff11
[ 84.16096671] RAX: 0000000000000001 RBX: ffffffffc0671ae0 RCX: 0000000000000000
[ 84.16097352] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffffc0671ae4
[ 84.16098061] RBP: ffffc90001fdfd28 R08: 0000000000000001 R09: 0000000000016e32
[ 84.16098779] R10: 0000000000000040 R11: 0000000000000000 R12: ffffffffc0671ae4
[ 84.16099515] R13: ffff88040e867940 R14: 0000000000000000 R15: ffff880411b02c00
[ 84.16100852] FS: 00007ffff59fd800(0000) GS:ffff880427c80000(0000) knlGS:00007ffff7a11460
[ 84.16101625] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 84.16102343] CR2: 00007ffff7f29d10 CR3: 0000000410812000 CR4: 00000000000006e0
[ 84.16105474] Call Trace:
[ 84.16114162] queued_read_lock_slowpath+0x71/0x80
[ 84.16120623] _raw_read_lock+0x20/0x30
[ 84.16134153] my_read_lock.constprop.3+0x30/0x40 [darling_mach]
[ 84.16141691] darling_task_notify_register+0x19/0xb0 [darling_mach]
[ 84.16148334] evprocfd_create+0x93/0x180 [darling_mach]
[ 84.16155218] ? pid_get_state_entry+0x20/0x20 [darling_mach]
[ 84.16161688] ? set_tracer_entry+0xf0/0xf0 [darling_mach]
[ 84.16167285] evproc_create_entry+0x4d/0x80 [darling_mach]
[ 84.16173269] mach_dev_ioctl+0x1d3/0x230 [darling_mach]
[ 84.16184107] do_vfs_ioctl+0xa8/0x630
[ 84.16185904] ? fput+0x193/0x220
[ 84.16187705] SyS_ioctl+0x79/0x90
[ 84.16190149] do_syscall_64+0x73/0x130
[ 84.16192021] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 84.16193332] RIP: 0033:0x7ffff7c8eff3
[ 84.16193956] RSP: 002b:00007fffffdf9a58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 84.16195180] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007ffff7c8eff3
[ 84.16195879] RDX: 00007fffffdf9b00 RSI: 0000000000001023 RDI: 00000000000003ff
[ 84.16196625] RBP: 00007fffffdf9a90 R08: 0000000000000000 R09: 0000000000000000
[ 84.16197319] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004023b0
[ 84.16197944] R13: 00007ffed6c53f00 R14: 00007fffffdf9ec0 R15: 0000000000000000
[ 84.16198853] Code: c0 75 cc 48 8b 45 c8 c6 03 01 48 c7 00 00 00 00 00 e9 65 fe ff ff 66 90 66 66 66 66 90 55 48 89 e5 66 66 66 66 90 ba 01 00 00 00 <8b> 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 eb ec 81
[ 84.24006938] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [notifyd:1267]
[ 84.24025864] Modules linked in: overlay darling_mach(OE) binfmt_misc snd_hda_codec_generic joydev input_leds snd_hda_intel snd_hda_codec snd_hda_core serio_raw snd_hwdep snd_pcm snd_timer mac_hid snd soundcore qemu_fw_cfg sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc qxl ttm aesni_intel aes_x86_64 crypto_simd glue_helper cryptd drm_kms_helper syscopyarea sysfillrect psmouse sysimgblt virtio_blk floppy fb_sys_fops virtio_net drm i2c_piix4 pata_acpi
[ 84.24053193] CPU: 3 PID: 1267 Comm: notifyd Tainted: G OEL 4.15.0-45-generic #48-Ubuntu
[ 84.24053781] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 84.24056378] RIP: 0010:native_queued_spin_lock_slowpath+0x23/0x1a0
[ 84.24057048] RSP: 0018:ffffc900023abc08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff11
[ 84.24058214] RAX: 0000000000000001 RBX: ffffffffc0671ae0 RCX: 0000000000000000
[ 84.24058823] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffffc0671ae4
[ 84.24059490] RBP: ffffc900023abc08 R08: 0000000000000001 R09: 0000000000016e33
[ 84.24060167] R10: ffffc900023abd70 R11: 0000000000000000 R12: ffffffffc0671ae4
[ 84.24060830] R13: 00007fb70f811158 R14: 0000000000e75003 R15: 0000000000e75003
[ 84.24061994] FS: 00007ffff47fe700(0000) GS:ffff880427d80000(0000) knlGS:00007ffff67d20e0
[ 84.24062745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 84.24063423] CR2: 00007ffff7bdc9ec CR3: 00000004149f0000 CR4: 00000000000006e0
[ 84.24066443] Call Trace:
[ 84.24069025] queued_read_lock_slowpath+0x71/0x80
[ 84.24071338] _raw_read_lock+0x20/0x30
[ 84.24080830] my_read_lock.constprop.3+0x30/0x40 [darling_mach]
[ 84.24087871] darling_task_get_current_entry+0xf/0x50 [darling_mach]
[ 84.24094038] darling_task_get_current+0xe/0x20 [darling_mach]
[ 84.24101790] ksyn_wqfind+0x6c/0x940 [darling_mach]
[ 84.24103601] ? switch_to_asm+0x34/0x70
[ 84.24104937] ? switch_to_asm+0x40/0x70
[ 84.24106255] ? switch_to_asm+0x34/0x70
[ 84.24107554] ? switch_to_asm+0x40/0x70
[ 84.24108837] ? switch_to_asm+0x34/0x70
[ 84.24110254] ? switch_to_asm+0x34/0x70
[ 84.24112724] ? __switch_to+0x321/0x500
[ 84.24114098] ? switch_to_asm+0x40/0x70
[ 84.24115386] ? __switch_to_asm+0x34/0x70
[ 84.24122846] psynch_mutexwait+0xb7/0x300 [darling_mach]
[ 84.24129156] ? psynch_mutexwait+0xb7/0x300 [darling_mach]
[ 84.24136368] ? path_at_entry+0x120/0x120 [darling_mach]
[ 84.24142468] psynch_mutexwait_trap+0x4a/0x80 [darling_mach]
[ 84.24148422] mach_dev_ioctl+0x1d3/0x230 [darling_mach]
[ 84.24150571] do_vfs_ioctl+0xa8/0x630
[ 84.24157796] ? wake_up_q+0x80/0x80
[ 84.24159756] SyS_ioctl+0x79/0x90
[ 84.24161611] do_syscall_64+0x73/0x130
[ 84.24163423] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 84.24164501] RIP: 0033:0x7ffff7cf3ff3
[ 84.24165110] RSP: 002b:00007ffff67d0808 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 84.24166257] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007ffff7cf3ff3
[ 84.24166903] RDX: 00007ffff67d0878 RSI: 0000000000001012 RDI: 00000000000003ff
[ 84.24167535] RBP: 00007ffff67d0840 R08: 0000000000000000 R09: 0000000000000000
[ 84.24168159] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffff47fdf00
[ 84.24168773] R13: 0000000000000000 R14: 00007ffff67d0ce0 R15: 00007ffff6852340
[ 84.24169640] Code: 00 e9 65 fe ff ff 66 90 66 66 66 66 90 55 48 89 e5 66 66 66 66 90 ba 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3
I believe I have discovered the cause.
Here is the kernel output immediately prior to the hang.
[ 92.187225363] Darling Mach: <1395> 0(-1): Acquired read lock: 629, ffffffffc0677ae0, 200 [ 92.187226349] Darling Mach: <1395> 0(-1): Attempting to release read lock: 652, ffffffffc0677ae0, 200 [ 92.187227221] Darling Mach: <1395> 0(-1): Released read lock: 652, ffffffffc0677ae0, 0 [ 92.187239260] Darling Mach: <1804> 3(-1): Acquired read lock: 602, ffffffffc0677ae0, 200 [ 92.188502904] Darling Mach: <4111> 3(-1): Attempting to acquire write lock: 484, ffffffffc0677ae0, 200 [ 92.191991023] Darling Mach: <4110> 2(-1): Attempting to acquire write lock: 484, ffffffffc0677ae0, 100000300 [ 92.193358671] Darling Mach: <1410> 1(-1): Attempting to acquire read lock: 253, ffffffffc0677ae0, 0 [ 92.209096535] Darling Mach: <4114> 0(-1): Attempting to acquire read lock: 253, ffffffffc0677ae0, 100000300
Here's what each line indicates
Darling Mach:
Most of that you can ignore, but I'll step you through the relevant parts.
Initially you can see pid 1395, running on core 0, acquiring the read lock then releasing it. Subsequently pid 1804, running on core 3, acquires the read lock.
pid 4111, now running on core 3, attempts to acquire a write lock and starts spinning - consuming the core. pid 4110, running on core 2, attempts to also acquire a write lock - consuming the core pids 1410 and 4114, running on cores 1 and 0 respectively, attempt to acquire a read lock.
At this point the processor hangs. Observationally I must assume the pending write locks are preventing the read locks from being acquired (otherwise they should be able to)
All the cores are spinning waiting for pid 1804 to release the read lock, but unfortunately no core can schedule the process, so the system remains deadlocked.
I've solved it in my test code by switching from a rw spinlock to an rw semaphore. This allows blocked processes to sleep, yielding the cpu and allowing the process holding the lock to continue. Unfortunately in my testing I seem to have discovered another spinlock deadlock another module but I'll raise a ticket for that separately.
I'll also note that I'm not a kernel developer, but I'm a little concerned that there appears to be [theoretical] situations which could lead to a crash. For example:
A caller to darling_task_get(pid) internally obtains a reference to the registry_entry e, then releases the read lock. It is possible that e becomes invalid (say after a call to darling_task_free) upon which the struct that e points to is now undefined before it attempts to access e->task?
Admittedly this is unlikely to cause a crash, but it's significantly worse if you consider darling_task_post_notification where it uses e to walk a linked list.
These issues (if they do indeed exist) are very unlikely to manifest, but these bugs tend to be the hardest to track down. Of course, I may be well off base.
I'll try to get a PR up for your consideration soon.
Admittedly this is unlikely to cause a crash, but it's significantly worse if you consider darling_task_post_notification where it uses e to walk a linked list.
This function is only called in the fork case of creating a new Mach file descriptor:
// fork case only
if (ppid_fork != -1)
{
darling_task_fork_child_done();
darling_task_post_notification(ppid_fork, NOTE_FORK, task_pid_vnr(linux_current));
}
The parent process is frozen until darling_task_fork_child_done()
is called. I don't remember why we first let the parent process to continue before posting the notification (maybe we don't have to!), but it's unlikely the parent process would be fast enough to resume and terminate itself before darling_task_post_notification()
completes.
First of all thanks a lot @cornejo for the details and the investigation! Hopefully we can get this fixed now...
All the cores are spinning waiting for pid 1804 to release the read lock, but unfortunately no core can schedule the process, so the system remains deadlocked.
I'm not a kernel developer either, but my understanding (from reading this) is that kernelspace servicing a syscall ("user context") can't be preemted by another process (but can be preemted by interrupts — but we're not dealing with those in Darling Mach). So PID 1804 would lock the spinlock, do whatever it needs to do, unlock the spinlock, return from the syscall and only then another process will get scheduled on that core. Otherwise, spinlocks everywhere would be doomed.
So what you're seeing in the log, I would say, is better explained by some code forgetting to release the spinlock, or perhaps actually sleeping (with a mutex or something) while it holds the spinlock (in that case it can be preemted).
Again, I'm not a kernel developer, don't believe anything I say 😀
For one thing, darling_task_post_notification_internal()
aquires a mutex while (the caller is) holding the spinlock.
Ah fair. In fact it's therefore quite likely my other code changes may have led to an incorrect fix. I'm going to re-run my tests and get back to you soon.
I've reverted all my changes apart from logging when the locks are acquired/released. In my past few attempts I've been unable to replicate the hang I experienced initially. I'm not exactly sure what's different - however I was able to produce several crashes due to races being lost inside of task_registry.c (and one possibly in another location. I'm providing these crash dumps below and loading the kernel module into ghidra to get an understanding of what it was trying to do.
Also - I'd like to know if you've been able to replicate my findings. A tight "while true; do date; done", running in N+1 processes should work (N=number of cores)
I'll attempt to replicate my initial hang after dinner.
Crashes below:
[ 58.361904] Darling Mach: <5148> 1(-1): Attempting to acquire read lock: 223, ffffffffc066aae0, 0 [ 58.361905] Darling Mach: <5148> 1(-1): Acquired read lock: 223, ffffffffc066aae0, 200 [ 58.361906] Darling Mach: <5148> 1(-1): Attempting to release read lock: 225, ffffffffc066aae0, 200 [ 58.361907] Darling Mach: <5148> 1(-1): Released read lock: 225, ffffffffc066aae0, 0 [ 58.362741] Darling Mach: <1337> 2(-1): Attempting to acquire read lock: 618, ffffffffc066aae0, 0 [ 58.362742] Darling Mach: <1337> 2(-1): Acquired read lock: 618, ffffffffc066aae0, 200 [ 58.362743] Darling Mach: <1337> 2(-1): Attempting to release read lock: 622, ffffffffc066aae0, 200 [ 58.362744] Darling Mach: <1337> 2(-1): Released read lock: 622, ffffffffc066aae0, 0 [ 58.363015] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 58.363053] IP: darling_task_notify_register+0x8e/0xc0 [darling_mach] [ 58.363068] PGD 0 P4D 0 [ 58.363077] Oops: 0002 [#1] SMP PTI [ 58.363086] Modules linked in: overlay darling_mach(OE) binfmt_misc snd_hda_codec_generic joydev input_leds snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep serio_raw snd_pcm qemu_fw_cfg snd_timer mac_hid snd soundcore sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc qxl aesni_intel ttm aes_x86_64 drm_kms_helper crypto_simd glue_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_piix4 pata_acpi virtio_net virtio_blk psmouse drm floppy [ 58.363251] CPU: 0 PID: 1786 Comm: notifyd Tainted: G OE 4.15.0-46-generic #49-Ubuntu [ 58.363271] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 58.363303] RIP: 0010:darling_task_notify_register+0x8e/0xc0 [darling_mach] [ 58.363318] RSP: 0018:ffffc900020a7d78 EFLAGS: 00010246 [ 58.363331] RAX: 0000000000000000 RBX: ffff88040b0b3300 RCX: 0000000000014b76 [ 58.363347] RDX: ffff880412100000 RSI: ffff880412516d20 RDI: ffff88040b0b3370 [ 58.363363] RBP: ffffc900020a7d90 R08: ffff880427c27080 R09: 0000000000000000 [ 58.363379] R10: ffffc90008bf7a18 R11: 0000000000000073 R12: ffff880412516f40 [ 58.363394] R13: ffff88040b0b3370 R14: 0000000000000000 R15: ffff88040fbae800 [ 58.363411] FS: 00007ffff37fc700(0000) GS:ffff880427c00000(0000) knlGS:00007ffff2ffa0e0 [ 58.363429] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 58.363442] CR2: 0000000000000008 CR3: 000000040fa88000 CR4: 00000000000006f0 [ 58.363462] Call Trace: [ 58.363482] evprocfd_create+0x93/0x180 [darling_mach] [ 58.363503] ? set_tracer_entry+0xf0/0xf0 [darling_mach] [ 58.363522] evproc_create_entry+0x4d/0x80 [darling_mach] [ 58.363542] mach_dev_ioctl+0x1d3/0x230 [darling_mach] [ 58.363556] ? eventfd_write+0xd8/0x270 [ 58.363568] do_vfs_ioctl+0xa8/0x630 [ 58.363578] ? vfs_write+0x166/0x1a0 [ 58.363588] SyS_ioctl+0x79/0x90 [ 58.363598] do_syscall_64+0x73/0x130 [ 58.363609] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 58.363621] RIP: 0033:0x7ffff7cf3ff3 [ 58.363630] RSP: 002b:00007ffff2ff78a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 58.363648] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007ffff7cf3ff3 [ 58.363663] RDX: 00007ffff2ff7950 RSI: 0000000000001023 RDI: 00000000000003ff [ 58.363679] RBP: 00007ffff2ff78e0 R08: 0000000000000000 R09: 0000000000000000 [ 58.363695] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffff37fbf00 [ 58.363711] R13: 0000000000000000 R14: 00007ffff2ff7d10 R15: 00007ffff67cf800 [ 58.363727] Code: 42 c2 ba 18 00 00 00 be c0 00 40 01 e8 9c ef c0 c0 4c 89 68 10 4c 8d 6b 70 49 89 c4 4c 89 ef e8 d9 3f 36 c1 48 8b 43 60 4c 89 ef <4c> 89 60 08 49 89 04 24 48 8d 43 60 49 89 44 24 08 4c 89 63 60 [ 58.363796] RIP: darling_task_notify_register+0x8e/0xc0 [darling_mach] RSP: ffffc900020a7d78 [ 58.363815] CR2: 0000000000000008 [ 58.363824] ---[ end trace b45797e8661496c6 ]---
[ 67.767555] Darling Mach: <5050> 2(-1): Acquired read lock: 263, ffffffffc066dae0, 200 [ 67.767557] Darling Mach: <5050> 2(-1): Attempting to release read lock: 269, ffffffffc066dae0, 200 [ 67.767558] Darling Mach: <5050> 2(-1): Released read lock: 269, ffffffffc066dae0, 0 [ 67.767561] Darling Mach: <5050> 2(-1): Attempting to acquire write lock: 364, ffffffffc066dae0, 0 [ 67.767563] Darling Mach: <5050> 2(2): Acquired write lock: 364, ffffffffc066dae0, ff [ 67.767564] Darling Mach: <5050> 2(2): Attempting to release write lock: 404, ffffffffc066dae0, ff [ 67.767565] Darling Mach: <5050> 2(-1): Released write lock: 404, ffffffffc066dae0, 0 [ 67.767566] Darling Mach: <5050> 2(-1): Attempting to acquire read lock: 576, ffffffffc066dae0, 0 [ 67.767567] Darling Mach: <5050> 2(-1): Acquired read lock: 576, ffffffffc066dae0, 200 [ 67.767568] Darling Mach: <5050> 2(-1): Attempting to release read lock: 578, ffffffffc066dae0, 200 [ 67.767569] Darling Mach: <5050> 2(-1): Released read lock: 578, ffffffffc066dae0, 0 [ 67.767570] Darling Mach: <5050> 2(-1): Attempting to acquire read lock: 212, ffffffffc066dae0, 0 [ 67.767570] Darling Mach: <5050> 2(-1): Acquired read lock: 212, ffffffffc066dae0, 200 [ 67.767571] Darling Mach: <5050> 2(-1): Attempting to release read lock: 216, ffffffffc066dae0, 200 [ 67.767572] Darling Mach: <5050> 2(-1): Released read lock: 216, ffffffffc066dae0, 0 [ 67.767821] Darling Mach: <1289> 0(-1): [ 67.768675] general protection fault: 0000 [#1] SMP PTI [ 67.768693] Modules linked in: overlay darling_mach(OE) binfmt_misc snd_hda_codec_generic joydev snd_hda_intel snd_hda_codec input_leds serio_raw snd_hda_core snd_hwdep snd_pcm qemu_fw_cfg snd_timer snd soundcore mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse qxl ttm virtio_blk pata_acpi drm_kms_helper virtio_net syscopyarea sysfillrect sysimgblt fb_sys_fops floppy i2c_piix4 drm [ 67.768863] CPU: 3 PID: 1763 Comm: notifyd Tainted: G OE 4.15.0-46-generic #49-Ubuntu [ 67.768882] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 67.768916] RIP: 0010:darling_task_notify_deregister+0x8a/0xd0 [darling_mach] [ 67.768933] RSP: 0018:ffffc9000291fd78 EFLAGS: 00010246 [ 67.768946] RAX: ffff880415fff660 RBX: 0000000000000001 RCX: 0000000000000000 [ 67.768962] RDX: f2dbf51898912667 RSI: 00000000f2ffa0e0 RDI: ffff880413862b40 [ 67.768977] RBP: ffffc9000291fd90 R08: 0000000000000000 R09: ffff88040f8c0098 [ 67.768993] R10: ffffc9000292b720 R11: 00000000000000af R12: ffff880415fff670 [ 67.769009] R13: ffff88040b894900 R14: 0000000000000000 R15: 00000000ffffffe8 [ 67.769026] FS: 00007ffff57ff700(0000) GS:ffff880427d80000(0000) knlGS:00007ffff2ffa0e0 [ 67.769044] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 67.769057] CR2: 00007ffff6c8e000 CR3: 0000000412710000 CR4: 00000000000006e0 [ 67.769076] Call Trace: [ 67.769093] evprocfd_create+0x13f/0x180 [darling_mach] [ 67.769113] ? set_tracer_entry+0xf0/0xf0 [darling_mach] [ 67.769132] evproc_create_entry+0x4d/0x80 [darling_mach] [ 67.769151] mach_dev_ioctl+0x1d3/0x230 [darling_mach] [ 67.769165] ? eventfd_write+0xd8/0x270 [ 67.769177] do_vfs_ioctl+0xa8/0x630 [ 67.769186] ? vfs_write+0x166/0x1a0 [ 67.769196] SyS_ioctl+0x79/0x90 [ 67.769206] do_syscall_64+0x73/0x130 [ 67.769218] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 67.769230] RIP: 0033:0x7ffff7cf3ff3 [ 67.769240] RSP: 002b:00007ffff2ff78a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 67.769257] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007ffff7cf3ff3 [ 67.769273] RDX: 00007ffff2ff7950 RSI: 0000000000001023 RDI: 00000000000003ff [ 67.769288] RBP: 00007ffff2ff78e0 R08: 0000000000000000 R09: 0000000000000000 [ 67.769304] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffff57fef00 [ 67.769320] R13: 0000000000000000 R14: 00007ffff2ff7d10 R15: 00007ffff67cf800 [ 67.769336] Code: 48 83 c3 60 e8 c8 f5 ff ff 4c 89 e7 e8 30 0f 36 c1 48 8b 3b 48 39 df 74 45 4c 3b 6f 10 75 37 48 8b 47 08 48 8b 17 bb 01 00 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 07 48 [ 67.769405] RIP: darling_task_notify_deregister+0x8a/0xd0 [darling_mach] RSP: ffffc9000291fd78 [ 67.769445] ---[ end trace 1336bdf1df231061 ]--- [ 67.769888] Darling Mach: <5050> 2(-1): Attempting to acquire read lock: 223, ffffffffc066dae0, 0 [ 67.769914] Darling Mach: <5050> 2(-1): Acquired read lock: 223, ffffffffc066dae0, 200 [ 67.769932] Darling Mach: <5050> 2(-1): Attempting to release read lock: 225, ffffffffc066dae0, 200 [ 67.769953] Darling Mach: <5050> 2(-1): Released read lock: 225, ffffffffc066dae0, 0 [ 67.770613] Darling Mach: <5050> 2(-1): Attempting to acquire read lock: 223, ffffffffc066dae0, 0 [ 67.771190] Darling Mach: <5050> 2(-1): Acquired read lock: 223, ffffffffc066dae0, 200
[ 49.521868] Darling Mach: <1380> 3(-1): Attempting to acquire read lock: 591, ffffffffc066cae0, 0 [ 49.521869] Darling Mach: <1380> 3(-1): Acquired read lock: 591, ffffffffc066cae0, 200 [ 49.521871] Darling Mach: <1380> 3(-1): Attempting to release read lock: 595, ffffffffc066cae0, 200 [ 49.521872] Darling Mach: <1380> 3(-1): Released read lock: 595, ffffffffc066cae0, 0 [ 49.525350] about to call wait_queue_wakeup64_one_locked - ipc_mqueue_release_msgcount [ 49.525350] BUG: wait_queue_wakeup64_one_locked () called [ 49.527543] Darling Mach: <1380> 3(-1): Attempting to acquire read lock: 591, ffffffffc066cae0, 0 [ 49.527544] Darling Mach: <1380> 3(-1): Acquired read lock: 591, ffffffffc066cae0, 200 [ 49.527546] Darling Mach: <1380> 3(-1): Attempting to release read lock: 595, ffffffffc066cae0, 200 [ 49.527547] Darling Mach: <1380> 3(-1): Released read lock: 595, ffffffffc066cae0, 0 [ 49.621435] RBP: ffffc90003587e40 R08: 0000000000000000 R09: ffff8804100dab98 [ 49.621956] R10: 0000000000000000 R11: 00000000000003b2 R12: ffff88040afed770 [ 49.622474] R13: ffff88041420fb00 R14: ffff88041420fb00 R15: ffff88040fc2d480 [ 49.622989] FS: 00007ffff3ffd700(0000) GS:ffff880427c80000(0000) knlGS:00007ffff2f770e0 [ 49.623536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 49.624057] CR2: 0000000000000010 CR3: 000000041326e000 CR4: 00000000000006e0 [ 49.624589] Call Trace: [ 49.625123] evprocfd_release+0x26/0x70 [darling_mach] [ 49.625633] fput+0xea/0x220 [ 49.626134] __fput+0xe/0x10 [ 49.626629] task_work_run+0x9d/0xc0 [ 49.627125] exit_to_usermode_loop+0xc0/0xd0 [ 49.627644] do_syscall_64+0x115/0x130 [ 49.628145] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 49.628656] RIP: 0033:0x7ffff7cf3ff3 [ 49.629184] RSP: 002b:00007ffff2f747d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 49.629698] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00007ffff7cf3ff3 [ 49.630181] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000253 [ 49.630660] RBP: 00007ffff2f74800 R08: 0000000000000000 R09: 0000000000000000 [ 49.631155] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffff3ffcf00 [ 49.631654] R13: 0000000000000000 R14: 00007ffff2f74d10 R15: 00007ffff67cf800 [ 49.632132] Code: c3 7e 06 48 8b 5b 08 eb cf 4c 8d 63 70 bf 6e 02 00 00 48 83 c3 60 e8 c8 f5 ff ff 4c 89 e7 e8 30 1f 36 c1 48 8b 3b 48 39 df 74 45 <4c> 3b 6f 10 75 37 48 8b 47 08 48 8b 17 bb 01 00 00 00 48 89 42 [ 49.633161] RIP: darling_task_notify_deregister+0x78/0xd0 [darling_mach] RSP: ffffc90003587e28 [ 49.633666] CR2: 0000000000000010 [ 49.634239] ---[ end trace 420572fb121ba53e ]---
[ 53.658986] Darling Mach: <1328> 1(-1): Attempting to acquire read lock: 591, ffffffffc066cae0, 100000300 [ 53.664262] Darling Mach: <2274> 3(-1): Released read lock: 622, ffffffffc066cae0, 100000100 [ 53.664263] Darling Mach: <5485> 2(2): Acquired write lock: 490, ffffffffc066cae0, ff [ 53.664266] Darling Mach: <5485> 2(2): Attempting to release write lock: 510, ffffffffc066cae0, ff [ 53.664288] Darling Mach: <5485> 2(-1): Released write lock: 510, ffffffffc066cae0, 0 [ 53.664369] Darling Mach: <1343> 3(-1): Attempting to acquire read lock: 223, ffffffffc066cae0, 0 [ 53.664370] Darling Mach: <1343> 3(-1): Acquired read lock: 223, ffffffffc066cae0, 200 [ 53.664372] Darling Mach: <1343> 3(-1): Attempting to release read lock: 225, ffffffffc066cae0, 200 [ 53.664373] Darling Mach: <1343> 3(-1): Released read lock: 225, ffffffffc066cae0, 0 [ 53.666061] systemd-journald[448]: /dev/kmsg buffer overrun, some messages lost. [ 53.666394] general protection fault: 0000 [#1] SMP PTI [ 53.666414] Modules linked in: overlay darling_mach(OE) binfmt_misc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer input_leds snd joydev soundcore serio_raw qemu_fw_cfg mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea hid_generic sysfillrect pcbc usbhid hid aesni_intel sysimgblt aes_x86_64 crypto_simd fb_sys_fops glue_helper virtio_blk psmouse drm virtio_net i2c_piix4 pata_acpi cryptd floppy [ 53.666578] CPU: 3 PID: 2274 Comm: notifyd Tainted: G OE 4.15.0-46-generic #49-Ubuntu [ 53.666599] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 53.666632] RIP: 0010:darling_task_notify_deregister+0x8a/0xd0 [darling_mach] [ 53.666648] RSP: 0018:ffffc90002ecfd78 EFLAGS: 00010246 [ 53.666661] RAX: ffff880415146e60 RBX: 0000000000000001 RCX: 0000000000000000 [ 53.666677] RDX: f2dbf518989129c7 RSI: 00000000f2ffa0e0 RDI: ffff88040c293e00 [ 53.666693] RBP: ffffc90002ecfd90 R08: 0000000000000000 R09: ffff88041471c118 [ 53.666709] R10: ffffc9000298fc68 R11: 0000000000000102 R12: ffff880415146e70 [ 53.666724] R13: ffff88041342c600 R14: 0000000000000000 R15: 00000000ffffffe8 [ 53.666741] FS: 00007ffff57ff700(0000) GS:ffff880427d80000(0000) knlGS:00007ffff2ffa0e0 [ 53.666759] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 53.666772] CR2: 00007ffff6c8e000 CR3: 000000040baf8000 CR4: 00000000000006e0 [ 53.666792] Call Trace: [ 53.666809] evprocfd_create+0x13f/0x180 [darling_mach] [ 53.666829] ? set_tracer_entry+0xf0/0xf0 [darling_mach] [ 53.666848] evproc_create_entry+0x4d/0x80 [darling_mach] [ 53.666867] mach_dev_ioctl+0x1d3/0x230 [darling_mach] [ 53.666881] ? eventfd_write+0xd8/0x270 [ 53.666894] do_vfs_ioctl+0xa8/0x630 [ 53.666904] ? vfs_write+0x166/0x1a0 [ 53.666927] SyS_ioctl+0x79/0x90 [ 53.666937] do_syscall_64+0x73/0x130 [ 53.666948] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 53.666960] RIP: 0033:0x7ffff7cf3ff3 [ 53.666969] RSP: 002b:00007ffff2ff78a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 53.666986] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007ffff7cf3ff3 [ 53.667002] RDX: 00007ffff2ff7950 RSI: 0000000000001023 RDI: 00000000000003ff [ 53.667018] RBP: 00007ffff2ff78e0 R08: 0000000000000000 R09: 0000000000000000 [ 53.667034] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffff57fef00 [ 53.667050] R13: 0000000000000000 R14: 00007ffff2ff7d10 R15: 00007ffff67cf800 [ 53.667066] Code: 48 83 c3 60 e8 c8 f5 ff ff 4c 89 e7 e8 30 1f 36 c1 48 8b 3b 48 39 df 74 45 4c 3b 6f 10 75 37 48 8b 47 08 48 8b 17 bb 01 00 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 07 48 [ 53.667136] RIP: darling_task_notify_deregister+0x8a/0xd0 [darling_mach] RSP: ffffc90002ecfd78 [ 53.667156] ---[ end trace 6e8c33b925432981 ]---
Also - I'd like to know if you've been able to replicate my findings. A tight "while true; do date; done", running in N+1 processes should work (N=number of cores)
No. I tried to, yesterday, but it didn't hang. But then lldb didn't hang either, so perhaps the additional printk()
s I've added were preventing the race condition.
Ah, that's not what I wanted to hear. I've replicated it on two machines (rather, both machines) that I've tried on. Three if you include the VM.
fwiw the crashes above were inside of the list_add and list_del macros inside darling_task_notify_register and darling_task_notify_deregister. This is why I extended the my_task_lock to cover manipulation of the list - but whether that was the correct thing to do is beyond me. I'm glad LubosD knows how the project functions. My exposure is purely inside task_registry.c and it relies on its callers acting safely, which is where my concern comes from.
After multiple failed attempts I started undoing system changes. I placed Xcode back into /Applications and ran my initial stress test. I was able to replicate the hang again using my initial setup. Here's the command I run:
while true; do /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.1.sdk -miphoneos-version-min=12.1 -o /tmp/xx.$$ -c test.c; done
It basically just compiles a test program over and over again (test.c's main just returns 0)
I also adjusted rsyslog to log to file again - previously I stopped this because it was filling my disk too quickly.
The code I'm running is directly from master, so this is replicates my initial test. I'm going to poke around a bit to see where things start to work. I suspect that disk io plays a part, probably in helping the race fail somehow.
Here's a backtrace of the hang.
(gdb) thread 1 [Switching to thread 1 (Thread 1)]
81 return val * GOLDEN_RATIO_64 >> (64 - bits); (gdb) bt
Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb) thread 2 [Switching to thread 2 (Thread 2)]
189 { (gdb) bt
Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb) thread 4 [Switching to thread 4 (Thread 4)]
81 return val * GOLDEN_RATIO_64 >> (64 - bits); (gdb) bt
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Test setup
Steps to repeat
Within multiple 'darling shell's (I had to spawn 13 on my latest failure) continuously compile a test file.
My command was: cd /tmp/ ; while true; do /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -arch arm -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.1.sdk -miphoneos-version-min=12.1 -o /tmp/xx.$$ -c /tmp/test.c; done
test.c contained:
include
int main() { return 0; }
Eventually the entire VM will hang. This means that none of my 13 ssh sessions would respond and the machine did not respond via other means. Attaching to the kernel using gdb from the host reveals the following:
(gdb) target remote :1234 Remote debugging using :1234
Thread 1 received signal SIGTRAP, Trace/breakpoint trap. kmmio_fault (addr=, regs=) at /build/linux-uQJ2um/linux-4.15.0/arch/x86/mm/fault.c:38
38 /build/linux-uQJ2um/linux-4.15.0/arch/x86/mm/fault.c: No such file or directory.
(gdb) c
Continuing.