freebsd / drm-kmod

drm driver for FreeBSD

amdgpu_dm_irq_schedule_work FAILED src 5 #59

Closed pkubaj closed 2 years ago

pkubaj commented 3 years ago

Describe the bug

I have an RX560 running on 13.0-BETA3 on an amd64 workstation.

When my monitors are off, I'm getting the following errors on the console: <6>[drm] amdgpu_dm_irq_schedule_work FAILED src 5
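For anyone trying to reproduce this, a small shell helper can show how often the message appears and for which source. This is only a sketch: the function name `count_irq_failures` is hypothetical, and it assumes the message keeps the format quoted above, with the source number as the last field on the line.

```shell
#!/bin/sh
# Count "amdgpu_dm_irq_schedule_work FAILED" messages per IRQ source.
# Reads kernel log text on stdin, for example:
#   dmesg | count_irq_failures
# count_irq_failures is a hypothetical helper name, not part of drm-kmod.
count_irq_failures() {
    awk '
        /amdgpu_dm_irq_schedule_work FAILED src/ {
            # Assumption: the source number is the last field on the line.
            count[$NF]++
        }
        END {
            for (src in count)
                printf "src %s: %d\n", src, count[src]
        }
    ' | sort
}
```

Running it periodically over SSH while the monitors are suspended should show whether the count keeps growing before the machine stops responding.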

When it happens, X11 stops working, but I can still log in via SSH. Shortly after that, SSH and console logging stop working as well.

Restarting X11 during the window in which I can still log in doesn't help: Xorg doesn't start at all. Only a reboot helps in that case.

FreeBSD version

FreeBSD KGPE-D16 13.0-BETA3 FreeBSD 13.0-BETA3 #0 releng/13.0-n244525-150b4388d3b: Fri Feb 19 01:36:53 CET 2021 root@KGPE-D16:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

PCI Info

pciconf -lv

hostb0@pci0:0:0:0: class=0x060000 rev=0x02 hdr=0x00 vendor=0x1002 device=0x5a10 subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890 Northbridge only dual slot (2x16) PCI-e GFX Hydra part'
    class      = bridge
    subclass   = HOST-PCI
none0@pci0:0:0:2: class=0x080600 rev=0x00 hdr=0x00 vendor=0x1002 device=0x5a23 subvendor=0x1002 subdevice=0x5a23
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890S/RD990 I/O Memory Management Unit (IOMMU)'
    class      = base peripheral
    subclass   = IOMMU
pcib1@pci0:0:2:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x5a16 subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GFX port 0)'
    class      = bridge
    subclass   = PCI-PCI
pcib2@pci0:0:4:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x5a18 subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 0)'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci0:0:9:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x5a1c subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 4)'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:0:10:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x5a1d subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 5)'
    class      = bridge
    subclass   = PCI-PCI
pcib5@pci0:0:11:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x5a1f subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890/RD990 PCI to PCI bridge (PCI Express GFX2 port 0)'
    class      = bridge
    subclass   = PCI-PCI
pcib6@pci0:0:12:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x5a20 subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890/RD990 PCI to PCI bridge (PCI Express GFX2 port 1)'
    class      = bridge
    subclass   = PCI-PCI
pcib7@pci0:0:13:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x5a1e subvendor=0x1002 subdevice=0x5a10
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP2 Port 0)'
    class      = bridge
    subclass   = PCI-PCI
ahci0@pci0:0:17:0: class=0x010601 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4394 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]'
    class      = mass storage
    subclass   = SATA
ohci0@pci0:0:18:0: class=0x0c0310 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4397 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 USB OHCI0 Controller'
    class      = serial bus
    subclass   = USB
ohci1@pci0:0:18:1: class=0x0c0310 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4398 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0 USB OHCI1 Controller'
    class      = serial bus
    subclass   = USB
ehci0@pci0:0:18:2: class=0x0c0320 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4396 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 USB EHCI Controller'
    class      = serial bus
    subclass   = USB
ohci2@pci0:0:19:0: class=0x0c0310 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4397 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 USB OHCI0 Controller'
    class      = serial bus
    subclass   = USB
ohci3@pci0:0:19:1: class=0x0c0310 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4398 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0 USB OHCI1 Controller'
    class      = serial bus
    subclass   = USB
ehci1@pci0:0:19:2: class=0x0c0320 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4396 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 USB EHCI Controller'
    class      = serial bus
    subclass   = USB
intsmb0@pci0:0:20:0: class=0x0c0500 rev=0x3d hdr=0x00 vendor=0x1002 device=0x4385 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SBx00 SMBus Controller'
    class      = serial bus
    subclass   = SMBus
atapci0@pci0:0:20:1: class=0x01018a rev=0x00 hdr=0x00 vendor=0x1002 device=0x439c subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 IDE Controller'
    class      = mass storage
    subclass   = ATA
hdac2@pci0:0:20:2: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4383 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SBx00 Azalia (Intel HDA)'
    class      = multimedia
    subclass   = HDA
isab0@pci0:0:20:3: class=0x060100 rev=0x00 hdr=0x00 vendor=0x1002 device=0x439d subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 LPC host controller'
    class      = bridge
    subclass   = PCI-ISA
pcib8@pci0:0:20:4: class=0x060401 rev=0x00 hdr=0x01 vendor=0x1002 device=0x4384 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SBx00 PCI to PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
ohci4@pci0:0:20:5: class=0x0c0310 rev=0x00 hdr=0x00 vendor=0x1002 device=0x4399 subvendor=0x1043 subdevice=0x8163
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'SB7x0/SB8x0/SB9x0 USB OHCI2 Controller'
    class      = serial bus
    subclass   = USB
hostb1@pci0:0:24:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1600 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 0'
    class      = bridge
    subclass   = HOST-PCI
hostb2@pci0:0:24:1: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1601 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 1'
    class      = bridge
    subclass   = HOST-PCI
hostb3@pci0:0:24:2: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1602 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 2'
    class      = bridge
    subclass   = HOST-PCI
hostb4@pci0:0:24:3: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1603 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 3'
    class      = bridge
    subclass   = HOST-PCI
hostb5@pci0:0:24:4: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1604 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 4'
    class      = bridge
    subclass   = HOST-PCI
hostb6@pci0:0:24:5: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1605 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 5'
    class      = bridge
    subclass   = HOST-PCI
hostb7@pci0:0:25:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1600 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 0'
    class      = bridge
    subclass   = HOST-PCI
hostb8@pci0:0:25:1: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1601 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 1'
    class      = bridge
    subclass   = HOST-PCI
hostb9@pci0:0:25:2: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1602 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 2'
    class      = bridge
    subclass   = HOST-PCI
hostb10@pci0:0:25:3: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1603 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 3'
    class      = bridge
    subclass   = HOST-PCI
hostb11@pci0:0:25:4: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1604 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 4'
    class      = bridge
    subclass   = HOST-PCI
hostb12@pci0:0:25:5: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1605 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 5'
    class      = bridge
    subclass   = HOST-PCI
hostb13@pci0:0:26:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1600 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 0'
    class      = bridge
    subclass   = HOST-PCI
hostb14@pci0:0:26:1: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1601 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 1'
    class      = bridge
    subclass   = HOST-PCI
hostb15@pci0:0:26:2: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1602 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 2'
    class      = bridge
    subclass   = HOST-PCI
hostb16@pci0:0:26:3: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1603 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 3'
    class      = bridge
    subclass   = HOST-PCI
hostb17@pci0:0:26:4: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1604 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 4'
    class      = bridge
    subclass   = HOST-PCI
hostb18@pci0:0:26:5: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1605 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 5'
    class      = bridge
    subclass   = HOST-PCI
hostb19@pci0:0:27:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1600 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 0'
    class      = bridge
    subclass   = HOST-PCI
hostb20@pci0:0:27:1: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1601 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 1'
    class      = bridge
    subclass   = HOST-PCI
hostb21@pci0:0:27:2: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1602 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 2'
    class      = bridge
    subclass   = HOST-PCI
hostb22@pci0:0:27:3: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1603 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 3'
    class      = bridge
    subclass   = HOST-PCI
hostb23@pci0:0:27:4: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1604 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 4'
    class      = bridge
    subclass   = HOST-PCI
hostb24@pci0:0:27:5: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1605 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 15h Processor Function 5'
    class      = bridge
    subclass   = HOST-PCI
nvme0@pci0:1:0:0: class=0x010802 rev=0x01 hdr=0x00 vendor=0x1987 device=0x5012 subvendor=0x1987 subdevice=0x5012
    vendor     = 'Phison Electronics Corporation'
    device     = 'E12 NVMe Controller'
    class      = mass storage
    subclass   = NVM
em0@pci0:3:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x8086 device=0x10d3 subvendor=0x1043 subdevice=0x8369
    vendor     = 'Intel Corporation'
    device     = '82574L Gigabit Network Connection'
    class      = network
    subclass   = ethernet
em1@pci0:4:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x8086 device=0x10d3 subvendor=0x1043 subdevice=0x8369
    vendor     = 'Intel Corporation'
    device     = '82574L Gigabit Network Connection'
    class      = network
    subclass   = ethernet
vgapci0@pci0:5:0:0: class=0x030000 rev=0xe5 hdr=0x00 vendor=0x1002 device=0x67ef subvendor=0x1da2 subdevice=0xe348
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X]'
    class      = display
    subclass   = VGA
hdac0@pci0:5:0:1: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0xaae0 subvendor=0x1da2 subdevice=0xaae0
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X]'
    class      = multimedia
    subclass   = HDA
hdac1@pci0:7:0:0: class=0x040300 rev=0x04 hdr=0x00 vendor=0x1102 device=0x000b subvendor=0x1102 subdevice=0x0043
    vendor     = 'Creative Labs'
    device     = 'EMU20k2 [Sound Blaster X-Fi Titanium Series]'
    class      = multimedia
    subclass   = HDA
vgapci1@pci0:8:1:0: class=0x030000 rev=0x10 hdr=0x00 vendor=0x1a03 device=0x2000 subvendor=0x1a03 subdevice=0x2000
    vendor     = 'ASPEED Technology, Inc.'
    device     = 'ASPEED Graphics Family'
    class      = display
    subclass   = VGA
none1@pci0:8:2:0: class=0x0c0010 rev=0x70 hdr=0x00 vendor=0x11c1 device=0x5811 subvendor=0x1043 subdevice=0x8259
    vendor     = 'LSI Corporation'
    device     = 'FW322/323 [TrueFire] 1394a Controller'
    class      = serial bus
    subclass   = FireWire
emu10kx0@pci0:8:3:0: class=0x040100 rev=0x0a hdr=0x00 vendor=0x1102 device=0x0002 subvendor=0x1102 subdevice=0x8065
    vendor     = 'Creative Labs'
    device     = 'EMU10k1 [Sound Blaster Live! Series]'
    class      = multimedia
    subclass   = audio
none2@pci0:8:3:1: class=0x098000 rev=0x0a hdr=0x00 vendor=0x1102 device=0x7002 subvendor=0x1102 subdevice=0x0020
    vendor     = 'Creative Labs'
    device     = 'SB Live! Game Port'
    class      = input device

DRM KMOD version

drm-fbsd13-kmod 5.4.92.g20210202

To Reproduce

Steps to reproduce the behavior: With DPMS on, allow your screen to be suspended and wait some time.

Additional context

I have two screens, and I'm not sure whether it also happens with only one connected. Everything worked fine on 12.2-RELEASE. Disabling DPMS doesn't help. The host still responds to pings even after the crash. While I can still log in remotely, if I try to reboot, the reboot stalls after:

WARNING !(0) failed at /usr/local/sys/modules/drm-fbsd13-kmod/drivers/gpu/drm/amd/display/dc/gpio/gpio_base.c:66
#0 0xffffffff80e3da83 at linux_dump_stack+0x23
#1 0xffffffff82bd9c41 at dal_gpio_open_ex+0x31
#2 0xffffffff82bdacbf at dal_ddc_open+0x1f
#3 0xffffffff82b492e7 at dce_aux_transfer_raw+0x77
#4 0xffffffff82b109a3 at dm_dp_aux_transfer+0x93
#5 0xffffffff82d2b25f at drm_dp_dpcd_access+0xaf
#6 0xffffffff82d2b337 at drm_dp_dpcd_write+0x27
#7 0xffffffff82b0f219 at dm_helpers_dp_write_dpcd+0x29
#8 0xffffffff82b37108 at dp_enable_link_phy+0x168
#9 0xffffffff82b3ca18 at enable_link_dp+0x128
#10 0xffffffff82b3a7a1 at core_link_enable_stream+0x381
#11 0xffffffff82b5d1a9 at dce110_apply_ctx_to_hw+0x669
#12 0xffffffff82b44252 at dc_commit_state+0x3e2
#13 0xffffffff82b18a8b at amdgpu_dm_atomic_commit_tail+0x46b
#14 0xffffffff82d1b129 at commit_tail+0x49
#15 0xffffffff82d1a468 at drm_atomic_helper_commit+0x1e8
#16 0xffffffff82d1bbc9 at drm_atomic_helper_set_config+0x89
#17 0xffffffff82d27dc5 at drm_mode_setcrtc+0x325
WARNING !(0) failed at /usr/local/sys/modules/drm-fbsd13-kmod/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c:551
#0 0xffffffff80e3da83 at linux_dump_stack+0x23
#1 0xffffffff82bdacec at dal_ddc_open+0x4c
#2 0xffffffff82b492e7 at dce_aux_transfer_raw+0x77
#3 0xffffffff82b109a3 at dm_dp_aux_transfer+0x93
#4 0xffffffff82d2b25f at drm_dp_dpcd_access+0xaf
#5 0xffffffff82d2b337 at drm_dp_dpcd_write+0x27
#6 0xffffffff82b0f219 at dm_helpers_dp_write_dpcd+0x29
#7 0xffffffff82b37108 at dp_enable_link_phy+0x168
#8 0xffffffff82b3ca18 at enable_link_dp+0x128
#9 0xffffffff82b3a7a1 at core_link_enable_stream+0x381
#10 0xffffffff82b5d1a9 at dce110_apply_ctx_to_hw+0x669
#11 0xffffffff82b44252 at dc_commit_state+0x3e2
#12 0xffffffff82b18a8b at amdgpu_dm_atomic_commit_tail+0x46b
#13 0xffffffff82d1b129 at commit_tail+0x49
#14 0xffffffff82d1a468 at drm_atomic_helper_commit+0x1e8
#15 0xffffffff82d1bbc9 at drm_atomic_helper_set_config+0x89
#16 0xffffffff82d27dc5 at drm_mode_setcrtc+0x325
#17 0xffffffff82d45374 at drm_ioctl_kernel+0x74

valpackett commented 3 years ago

Very similar trace: https://lkml.org/lkml/2020/6/28/107

pkubaj commented 3 years ago

It looks like disabling DPMS does indeed work around this issue, but xset -dpms alone is not enough. I had to do:

xset s off
xset s noblank
xset -dpms
xset dpms 0 0 0
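
To confirm that the settings above took effect, the output of `xset q` can be checked. The following is a minimal sketch: the helper name `dpms_disabled` is hypothetical, and it assumes `xset q` reports the state with a line reading "DPMS is Disabled", as common X.Org builds do.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the text on stdin reports DPMS as disabled.
# Intended use:  xset q | dpms_disabled && echo "DPMS is off"
# dpms_disabled is a hypothetical helper name; the matched string is an
# assumption about the wording of the `xset q` output.
dpms_disabled() {
    grep -q 'DPMS is Disabled'
}
```

Because the function only reads stdin, it can also be run against saved `xset q` output from a session that later hung.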

evadot commented 2 years ago

Could you try again with 5.10?

pkubaj commented 2 years ago

I moved all my desktop activities to a powerpc64le workstation with a different GPU. I don't encounter the reported issue there.