TritonDataCenter / illumos-joyent

Community developed and maintained version of the OS/Net consolidation
http://www.illumos.org/projects/illumos-gate
265 stars 111 forks source link

Panic: lx_close() -> signalfd_pollers_dissociate() `assertion failed: p->p_sigfd != ((void *)0)` #434

Closed smokris closed 1 year ago

smokris commented 1 year ago

This morning, one of our systems panicked for the first time in 6 months:

> ::status
debugging crash dump vmcore.18 (64-bit) from widget.local
operating system: 5.11 joyent_20230209T001143Z (i86pc)
git branch: release-20230209
git rev: 1e01b48aaf9da7510b29df5a297ab8bfb5a6a701
image uuid: (not set)
panic message: assertion failed: p->p_sigfd != ((void *)0) (0x0 != 0x0), file: ../../common/io/signalfd.c, line: 408
dump content: kernel pages only

> ::stack
vpanic()
0xfffffffffba4a1f1()
signalfd_pollers_dissociate+0x164(fffffe1f63313d88)
signalfd_close+0x39(12500000003, 2081, 2, fffffe1f70f75b00)
dev_close+0x27(12500000003, 2081, 2, fffffe1f70f75b00)
device_close+0xc0(fffffe1f65e42d80, 2081, fffffe1f70f75b00)
spec_close+0x19d(fffffe1f65e42d80, 2081, 1, 80, fffffe1f70f75b00, 0)
fop_close+0x66(fffffe1f65e42d80, 2081, 1, 80, fffffe1f70f75b00, 0)
closef+0x68(fffffe1f68ffb258)
closeandsetf+0x1d9(a, 0)
close+0x10(a)
lx_close+0xe(a)
lx_syscall_enter+0x1aa()
sys_syscall+0x145()
::msgbuf ```console > ::msgbuf MESSAGE SmartOS Version joyent_20230209T001143Z 64-bit Copyright 2022-2023 MNX Cloud, Inc. x86_feature: lgpg x86_feature: tsc x86_feature: msr x86_feature: mtrr x86_feature: pge x86_feature: de x86_feature: cmov x86_feature: mmx x86_feature: mca x86_feature: pae x86_feature: cv8 x86_feature: pat x86_feature: sep x86_feature: sse x86_feature: sse2 x86_feature: htt x86_feature: asysc x86_feature: nx x86_feature: sse3 x86_feature: cx16 x86_feature: cmp x86_feature: tscp x86_feature: mwait x86_feature: cpuid x86_feature: ssse3 x86_feature: sse4_1 x86_feature: sse4_2 x86_feature: clfsh x86_feature: 64 x86_feature: vmx x86_feature: ibrs x86_feature: ibpb x86_feature: stibp x86_feature: ssbd x86_feature: flush_cmd x86_feature: core_thermal x86_feature: lfence_serializing mem = 23058560K (0x57f620000) TSC calibrated using PIT; freq is 1599 MHz ACPI: RSDP 0x00000000000FA9C0 000024 (v02 ACPIAM) ACPI: XSDT 0x00000000DF790100 00007C (v01 SMCI 20131107 MSFT 00000097) ACPI: FACP 0x00000000DF790290 0000F4 (v03 110713 FACP1130 20131107 MSFT 00000097) Firmware Warning (ACPI): 32/64X length mismatch in FADT/Gpe0Block: 128/64 (20180629/tbfadt-750) ACPI: DSDT 0x00000000DF790630 0067B2 (v01 1F580 1F580000 00000000 INTL 20051117) ACPI: FACS 0x00000000DF79E000 000040 ACPI: APIC 0x00000000DF790390 0000D2 (v01 110713 APIC1130 20131107 MSFT 00000097) ACPI: MCFG 0x00000000DF790470 00003C (v01 110713 OEMMCFG 20131107 MSFT 00000097) ACPI: OEMB 0x00000000DF79E040 00007B (v01 110713 OEMB1130 20131107 MSFT 00000097) ACPI: HPET 0x00000000DF79A630 000038 (v01 110713 OEMHPET 20131107 MSFT 00000097) ACPI: DMAR 0x00000000DF79E0C0 000128 (v01 AMI OEMDMAR 00000001 MSFT 00000097) ACPI: SSDT 0x00000000DF79E6D0 000363 (v01 DpgPmm CpuPm 00000012 INTL 20051117) ACPI: EINJ 0x00000000DF79A670 000130 (v01 AMIER AMI_EINJ 20131107 MSFT 00000097) ACPI: BERT 0x00000000DF79A800 000030 (v01 AMIER AMI_BERT 20131107 MSFT 00000097) ACPI: ERST 0x00000000DF79A830 000210 (v01 AMIER AMI_ERST 20131107 MSFT 00000097) ACPI: HEST 0x00000000DF79AA40 0000A8 (v01 AMIER ABC_HEST 20131107 MSFT 00000097) ACPI: 2 ACPI AML tables successfully acquired and loaded ACPI: Enabled 1 GPEs in block 00 to 3F SMBIOS v2.6 loaded (1876 bytes) Skipping psm: xpv_psm root nexus = i86pc NOTICE: iommulib_nexus_register: rootnex-1: Succesfully registered NEXUS i86pc nexops=fffffffffbd81fe0 pseudo0 at root pseudo0 is /pseudo scsi_vhci0 at root scsi_vhci0 is /scsi_vhci NOTICE: reprogram io-range on ppb[0/1/0]: 0x2000 ~ 0x2fff NOTICE: reprogram mem-range on ppb[0/1/0]: 0xf0000000 ~ 0xf00fffff NOTICE: reprogram io-range on ppb[0/3/0]: 0x3000 ~ 0x3fff NOTICE: reprogram mem-range on ppb[0/3/0]: 0xf0100000 ~ 0xf01fffff NOTICE: reprogram io-range on ppb[0/5/0]: 0x4000 ~ 0x4fff NOTICE: reprogram mem-range on ppb[0/5/0]: 0xf0200000 ~ 0xf02fffff NOTICE: reprogram io-range on ppb[0/7/0]: 0x5000 ~ 0x5fff NOTICE: reprogram mem-range on ppb[0/7/0]: 0xf0300000 ~ 0xf03fffff NOTICE: reprogram io-range on ppb[0/9/0]: 0x6000 ~ 0x6fff NOTICE: reprogram mem-range on ppb[0/9/0]: 0xf0400000 ~ 0xf04fffff Reading Intel IOMMU boot options npe0 at root: space 0 offset 0 npe0 is /pci@0,0 PCI Express-device: isa@1f, isa0 NOTICE: apic: local nmi: 255 0x0 1 NOTICE: apic: Using ACPI (MADT) for SMP configuration NOTICE: apic: Using APIC interrupt routing mode NOTICE: amd_iommu: No AMD IOMMU ACPI IVRS table pseudo-device: acpippm0 acpippm0 is /pseudo/acpippm@0 pseudo-device: ppm0 ppm0 is /pseudo/ppm@0 ramdisk0 at root ramdisk0 is /ramdisk root on /ramdisk:a fstype ufs ACPI: Dynamic OEM Table Load: ACPI: SSDT 0xFFFFFE1F2F1D6A88 0004D5 (v01 PmRef P001Cst 00003001 INTL 20051117) NOTICE: SpeedStep support is being disabled due to errors parsing ACPI P-state objects exported by BIOS. acpinex0 at root acpinex0 is /fw pseudo-device: dld0 dld0 is /pseudo/dld@0 PCI Express-device: pci8086,3a40@1c, pcieb0 pcieb0 is /pci@0,0/pci8086,3a40@1c PCI Express-device: pci8086,3a42@1c,1, pcieb1 pcieb1 is /pci@0,0/pci8086,3a42@1c,1 PCI Express-device: pci8086,244e@1e, pci_pci0 pci_pci0 is /pci@0,0/pci8086,244e@1e PCI Express-device: pci15d9,f580@1a,7, ehci0 ehci0 is /pci@0,0/pci15d9,f580@1a,7 PCI Express-device: pci15d9,f580@1d,7, ehci1 ehci1 is /pci@0,0/pci15d9,f580@1d,7 PCI Express-device: pci15d9,f580@1a, uhci0 uhci0 is /pci@0,0/pci15d9,f580@1a PCI Express-device: pci15d9,f580@1a,1, uhci1 uhci1 is /pci@0,0/pci15d9,f580@1a,1 PCI Express-device: pci15d9,f580@1a,2, uhci2 uhci2 is /pci@0,0/pci15d9,f580@1a,2 PCI Express-device: pci15d9,f580@1d, uhci3 uhci3 is /pci@0,0/pci15d9,f580@1d PCI Express-device: pci15d9,f580@1d,1, uhci4 uhci4 is /pci@0,0/pci15d9,f580@1d,1 PCI Express-device: pci15d9,f580@1d,2, uhci5 uhci5 is /pci@0,0/pci15d9,f580@1d,2 8042 device: keyboard@0, kb8042 # 0 kb80420 is /pci@0,0/isa@1f/i8042@1,60/keyboard@0 cpu0: x86 (chipid 0x0 GenuineIntel 106A5 family 6 model 26 step 5 clock 1600 MHz) cpu0: Intel(r) Xeon(r) CPU E5506 @ 2.13GHz KPTI enabled (PCID not supported, INVPCID not supported) cpu1: microcode has been updated from version 0x16 to 0x1d cpu1: x86 (chipid 0x0 GenuineIntel 106A5 family 6 model 26 step 5 clock 1600 MHz) cpu1: Intel(r) Xeon(r) CPU E5506 @ 2.13GHz cpu1 initialization complete - online cpu2: microcode has been updated from version 0x16 to 0x1d cpu2: x86 (chipid 0x0 GenuineIntel 106A5 family 6 model 26 step 5 clock 1600 MHz) cpu2: Intel(r) Xeon(r) CPU E5506 @ 2.13GHz cpu2 initialization complete - online cpu3: microcode has been updated from version 0x16 to 0x1d cpu3: x86 (chipid 0x0 GenuineIntel 106A5 family 6 model 26 step 5 clock 1600 MHz) cpu3: Intel(r) Xeon(r) CPU E5506 @ 2.13GHz cpu3 initialization complete - online USB 2.0 device (usb781,5571) operating at hi speed (USB 2.x) on USB 2.0 root hub: storage@2, scsa2usb0 at bus address 2 SanDisk Firebird USB Flash Drive 4C532000031128119340 scsa2usb0 is /pci@0,0/pci15d9,f580@1a,7/storage@2 /pci@0,0/pci15d9,f580@1a,7/storage@2 (scsa2usb0) online sd0 at scsa2usb0: target 0 lun 0 sd0 is /pci@0,0/pci15d9,f580@1a,7/storage@2/disk@0,0 /pci@0,0/pci15d9,f580@1a,7/storage@2/disk@0,0 (sd0) online pseudo-device: audio0 audio0 is /pseudo/audio@0 USB 2.0 device (usb51d,2) operating at full speed (USB 1.x) on USB 1.10 root hub: input@1, ugen0 at bus address 2 American Power Conversion Back-UPS XS 1500M FW:947.d10 .D USB FW:d10 3B1903X69241 ugen0 is /pci@0,0/pci15d9,f580@1a/input@1 /pci@0,0/pci15d9,f580@1a/input@1 (ugen0) online pseudo-device: stmf_sbd0 stmf_sbd0 is /pseudo/stmf_sbd@0 pseudo-device: lofi0 lofi0 is /pseudo/lofi@0 pseudo-device: lofi1 lofi1 is /pseudo/lofi@1 /pseudo/lofi@1 (lofi1) online pseudo-device: dtrace0 dtrace0 is /pseudo/dtrace@0 pseudo-device: ucode0 ucode0 is /pseudo/ucode@0 pseudo-device: devinfo0 devinfo0 is /pseudo/devinfo@0 pci0 at root: space f7 offset 0 pci0 is /pci@f7,0 iscsi0 at root iscsi0 is /iscsi acpinex: sb@0, acpinex1 acpinex1 is /fw/sb@0 pseudo-device: pseudo1 pseudo1 is /pseudo/zconsnex@1 pseudo-device: pseudo2 pseudo2 is /pseudo/zfdnex@2 ISA-device: pit_beep0 pit_beep0 is /pci@0,0/isa@1f/pit_beep IDE device at targ 0, lun 0 lastlun 0x0 model HUS726060ALA640 ATA/ATAPI-9 supported, majver 0x3fc minver 0x29 IDE device at targ 0, lun 0 lastlun 0x0 model HUS726060ALA640 ATA/ATAPI-9 supported, majver 0x3fc minver 0x29 PCI Express-device: ide@1, ata1 ata1 is /pci@0,0/pci-ide@1f,5/ide@1 UltraDMA mode 5 selected IDE device at targ 1, lun 0 lastlun 0x0 model HUS726060ALA640 ATA/ATAPI-9 supported, majver 0x3fc minver 0x29 PCI Express-device: ide@0, ata2 ata2 is /pci@0,0/pci-ide@1f,2/ide@0 Disk0: IDE device at targ 1, lun 0 lastlun 0x0 model HUS726060ALA640 ATA/ATAPI-9 supported, majver 0x3fc minver 0x29 PCI Express-device: ide@1, ata3 ata3 is /pci@0,0/pci-ide@1f,2/ide@1 UltraDMA mode 5 selected UltraDMA mode 5 selected UltraDMA mode 5 selected Disk2: Disk1: pseudo-device: llc10 llc10 is /pseudo/llc1@0 pseudo-device: power0 power0 is /pseudo/power@0 pseudo-device: ramdisk1024 ramdisk1024 is /pseudo/ramdisk@1024 pseudo-device: zfs0 zfs0 is /pseudo/zfs@0 cmdk0 at ata1 target 0 lun 0 cmdk0 is /pci@0,0/pci-ide@1f,5/ide@1/cmdk@0,0 pseudo-device: srn0 srn0 is /pseudo/srn@0 pseudo-device: dcpc0 dcpc0 is /pseudo/dcpc@0 pseudo-device: fasttrap0 fasttrap0 is /pseudo/fasttrap@0 cmdk1 at ata2 target 0 lun 0 cmdk1 is /pci@0,0/pci-ide@1f,2/ide@0/cmdk@0,0 Disk3: cmdk2 at ata3 target 1 lun 0 cmdk2 is /pci@0,0/pci-ide@1f,2/ide@1/cmdk@1,0 pseudo-device: fbt0 fbt0 is /pseudo/fbt@0 pseudo-device: profile0 profile0 is /pseudo/profile@0 pseudo-device: lockstat0 lockstat0 is /pseudo/lockstat@0 pseudo-device: sdt0 sdt0 is /pseudo/sdt@0 pseudo-device: systrace0 systrace0 is /pseudo/systrace@0 pseudo-device: fcp0 fcp0 is /pseudo/fcp@0 pseudo-device: fcsm0 fcsm0 is /pseudo/fcsm@0 pseudo-device: ipd0 ipd0 is /pseudo/ipd@0 pseudo-device: stmf0 stmf0 is /pseudo/stmf@0 cmdk3 at ata2 target 1 lun 0 cmdk3 is /pci@0,0/pci-ide@1f,2/ide@0/cmdk@1,0 pseudo-device: fssnap0 fssnap0 is /pseudo/fssnap@0 pseudo-device: kvm0 kvm0 is /pseudo/kvm@0 pseudo-device: pool0 pool0 is /pseudo/pool@0 IP Filter: v4.1.9, running. pseudo-device: bpf0 bpf0 is /pseudo/bpf@0 pseudo-device: pm0 pm0 is /pseudo/pm@0 NOTICE: e1000g0 registered NOTICE: e1000g1 registered pseudo-device: nsmb0 nsmb0 is /pseudo/nsmb@0 Universal TUN/TAP device driver ver 1.3.0 02/09/2023 (C) 1999-2000 Maxim Krasnyansky pseudo-device: tap0 tap0 is /pseudo/tap@0 Universal TUN/TAP device driver ver 1.3.0 02/09/2023 (C) 1999-2000 Maxim Krasnyansky pseudo-device: tun0 tun0 is /pseudo/tun@0 SMBIOS type 0x0, addr 0x0 WARNING: SMBIOS: Invalid base address pseudo-device: lx_systrace0 lx_systrace0 is /pseudo/lx_systrace@0 pseudo-device: inotify0 inotify0 is /pseudo/inotify@0 pseudo-device: eventfd0 eventfd0 is /pseudo/eventfd@0 pseudo-device: timerfd0 timerfd0 is /pseudo/timerfd@0 pseudo-device: signalfd0 signalfd0 is /pseudo/signalfd@0 vmx_init: processor does not support desired secondary processor-based controls pseudo-device: viona0 viona0 is /pseudo/viona@0 dump on /dev/zvol/dsk/zones/dump size 32768 MB device pciclass,030000@4(display#0) keeps up device sd@0,0(disk#0), but the former is not power managed NOTICE: vnic1000 registered NOTICE: vnic1000 link up, 0 Mbps, unknown duplex NOTICE: e1000g1 link up, 100 Mbps, full duplex Creating /etc/devices/devid_cache Creating /etc/devices/pci_unitaddr_persistent pseudo-device: devinfo0 devinfo0 is /pseudo/devinfo@0 Creating /etc/devices/devname_cache NOTICE: e1000g1 link down NOTICE: vnic1007 registered NOTICE: vnic1007 link up, 0 Mbps, unknown duplex NOTICE: e1000g1 link up, 100 Mbps, full duplex sd0 at scsa2usb0: target 0 lun 0 sd0 is /pci@0,0/pci15d9,f580@1a,7/storage@2/disk@0,0 device pciclass,030000@4(display#0) keeps up device sd@0,0(disk#0), but the former is not power managed pseudo-device: llc10 llc10 is /pseudo/llc1@0 pseudo-device: ramdisk1024 ramdisk1024 is /pseudo/ramdisk@1024 pseudo-device: ucode0 ucode0 is /pseudo/ucode@0 pseudo-device: dcpc0 dcpc0 is /pseudo/dcpc@0 pseudo-device: fbt0 fbt0 is /pseudo/fbt@0 pseudo-device: profile0 profile0 is /pseudo/profile@0 pseudo-device: lockstat0 lockstat0 is /pseudo/lockstat@0 pseudo-device: sdt0 sdt0 is /pseudo/sdt@0 pseudo-device: systrace0 systrace0 is /pseudo/systrace@0 pseudo-device: fcp0 fcp0 is /pseudo/fcp@0 pseudo-device: fcsm0 fcsm0 is /pseudo/fcsm@0 pseudo-device: ipd0 ipd0 is /pseudo/ipd@0 pseudo-device: stmf0 stmf0 is /pseudo/stmf@0 pseudo-device: fssnap0 fssnap0 is /pseudo/fssnap@0 pseudo-device: kvm0 kvm0 is /pseudo/kvm@0 pseudo-device: bpf0 bpf0 is /pseudo/bpf@0 pseudo-device: pm0 pm0 is /pseudo/pm@0 pseudo-device: nsmb0 nsmb0 is /pseudo/nsmb@0 pseudo-device: lx_systrace0 lx_systrace0 is /pseudo/lx_systrace@0 pseudo-device: inotify0 inotify0 is /pseudo/inotify@0 pseudo-device: signalfd0 signalfd0 is /pseudo/signalfd@0 pseudo-device: viona0 viona0 is /pseudo/viona@0 /pseudo/zconsnex@1/zcons@0 (zcons0) online NOTICE: vnic1011 registered NOTICE: vnic1011 link up, 100 Mbps, unknown duplex NOTICE: vnic1011 unregistered /pseudo/zconsnex@1/zcons@0 (zcons0) online NOTICE: vnic1015 registered NOTICE: vnic1015 link up, 0 Mbps, unknown duplex /pseudo/zconsnex@1/zcons@1 (zcons1) online NOTICE: vnic1019 registered NOTICE: vnic1019 link up, 100 Mbps, unknown duplex NOTICE: vnic1019 unregistered panic[cpu0]/thread=fffffe1f48a2c480: assertion failed: p->p_sigfd != ((void *)0) (0x0 != 0x0), file: ../../common/io/signalfd.c, line: 408 fffffe002bd81b70 fffffffffba4a1f1 () fffffe002bd81bc0 signalfd:signalfd_pollers_dissociate+164 () fffffe002bd81c00 signalfd:signalfd_close+39 () fffffe002bd81c30 genunix:dev_close+27 () fffffe002bd81c80 specfs:device_close+c0 () fffffe002bd81d00 specfs:spec_close+19d () fffffe002bd81d80 genunix:fop_close+66 () fffffe002bd81dc0 genunix:closef+68 () fffffe002bd81e30 genunix:closeandsetf+1d9 () fffffe002bd81e50 genunix:close+10 () fffffe002bd81e70 lx_brand:lx_close+e () fffffe002bd81ef0 lx_brand:lx_syscall_enter+1aa () fffffe002bd81f10 unix:brand_sys_syscall+1c6 () ```

(I can't share the full vmdump since it contains user data, but I'd be happy to run additional mdb commands if needed.)

danmcd commented 1 year ago

This has something to do with illumos#14892. I think the VERIFY() there may be too strong. I'll have to consult with Patrick about it.

danmcd commented 1 year ago

IFF (sic.) the VERIFY() is indeed too strong (i.e. p->p_sigfd can legitimately be NULL here), here's the fix:

diff --git a/usr/src/uts/common/io/signalfd.c b/usr/src/uts/common/io/signalfd.c
index 601afce398..c9be797f58 100644
--- a/usr/src/uts/common/io/signalfd.c
+++ b/usr/src/uts/common/io/signalfd.c
@@ -404,10 +404,11 @@ signalfd_pollers_dissociate(signalfd_state_t *state)
                }
                VERIFY3P(sp->sp_proc, ==, p);
                VERIFY3P(sp->sp_state, ==, state);
-               VERIFY3P(p->p_sigfd, !=, NULL);

-               sigfd_proc_state_t *pstate = p->p_sigfd;
-               list_remove(&pstate->sigfd_list, sp);
+               if (p->p_sigfd != NULL) {
+                       sigfd_proc_state_t *pstate = p->p_sigfd;
+                       list_remove(&pstate->sigfd_list, sp);
+               }
                sp->sp_proc = NULL;

                /* Wake any lingering pollers referencing the pollhead */

Otherwise the fix will be more complicated.

danmcd commented 1 year ago

https://www.illumos.org/issues/15386

danmcd commented 1 year ago

Fixed in -gate and in 20230209 respin.

smokris commented 1 year ago

Thanks, @danmcd!

Via email, Dan provided MDB commands that identified that the process leading to this crash was systemd-journald. I confirmed that I was able to reproduce the panic on 20230209T001143Z by running systemctl restart systemd-journald in the LX zone. After updating to 20230214T155114Z, I'm no longer able to reproduce the panic.