freebsd / drm-kmod

drm driver for FreeBSD

graphics/drm-61-kmod amdgpu panic: Unregistered use of FPU in kernel #277

Closed jownit closed 1 month ago

jownit commented 5 months ago

Describe the bug

The kernel panics when loading amdgpu.ko.

FreeBSD version

FreeBSD joxan 15.0-CURRENT FreeBSD 15.0-CURRENT #123 main-n267479-13720136fbf9: Wed Jan 10 09:26:10 CET 2024 root@joxan:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 1500008 1500008

PCI Info

pciconf -lv

hostb0@pci0:0:0:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15d0 subvendor=0x1022 subdevice=0x15d0
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Root Complex'
    class = bridge
    subclass = HOST-PCI
none0@pci0:0:0:2: class=0x080600 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15d1 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 IOMMU'
    class = base peripheral
    subclass = IOMMU
hostb1@pci0:0:1:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1452 subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge'
    class = bridge
    subclass = HOST-PCI
pcib1@pci0:0:1:2: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x15d3 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 PCIe GPP Bridge [6:0]'
    class = bridge
    subclass = PCI-PCI
pcib2@pci0:0:1:3: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x15d3 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 PCIe GPP Bridge [6:0]'
    class = bridge
    subclass = PCI-PCI
pcib3@pci0:0:1:4: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x15d3 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 PCIe GPP Bridge [6:0]'
    class = bridge
    subclass = PCI-PCI
pcib4@pci0:0:1:7: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x15d3 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 PCIe GPP Bridge [6:0]'
    class = bridge
    subclass = PCI-PCI
hostb2@pci0:0:8:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1452 subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge'
    class = bridge
    subclass = HOST-PCI
pcib5@pci0:0:8:1: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x15db subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus A'
    class = bridge
    subclass = PCI-PCI
intsmb0@pci0:0:20:0: class=0x0c0500 rev=0x61 hdr=0x00 vendor=0x1022 device=0x790b subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'FCH SMBus Controller'
    class = serial bus
    subclass = SMBus
isab0@pci0:0:20:3: class=0x060100 rev=0x51 hdr=0x00 vendor=0x1022 device=0x790e subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'FCH LPC Bridge'
    class = bridge
    subclass = PCI-ISA
hostb3@pci0:0:24:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e8 subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 0'
    class = bridge
    subclass = HOST-PCI
hostb4@pci0:0:24:1: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e9 subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 1'
    class = bridge
    subclass = HOST-PCI
hostb5@pci0:0:24:2: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15ea subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 2'
    class = bridge
    subclass = HOST-PCI
hostb6@pci0:0:24:3: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15eb subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 3'
    class = bridge
    subclass = HOST-PCI
hostb7@pci0:0:24:4: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15ec subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 4'
    class = bridge
    subclass = HOST-PCI
hostb8@pci0:0:24:5: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15ed subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 5'
    class = bridge
    subclass = HOST-PCI
hostb9@pci0:0:24:6: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15ee subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 6'
    class = bridge
    subclass = HOST-PCI
hostb10@pci0:0:24:7: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15ef subvendor=0x0000 subdevice=0x0000
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven/Raven2 Device 24: Function 7'
    class = bridge
    subclass = HOST-PCI
iwm0@pci0:1:0:0: class=0x028000 rev=0x29 hdr=0x00 vendor=0x8086 device=0x2526 subvendor=0x8086 subdevice=0x0014
    vendor = 'Intel Corporation'
    device = 'Wireless-AC 9260'
    class = network
nvme0@pci0:2:0:0: class=0x010802 rev=0x00 hdr=0x00 vendor=0x15b7 device=0x5006 subvendor=0x15b7 subdevice=0x5006
    vendor = 'Sandisk Corp'
    device = 'SanDisk Extreme Pro / WD Black SN750 / PC SN730 / Red SN700 NVMe SSD'
    class = mass storage
    subclass = NVM
re0@pci0:3:0:0: class=0x020000 rev=0x0e hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x17aa subdevice=0x5126
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
none1@pci0:3:0:1: class=0x070002 rev=0x0e hdr=0x00 vendor=0x10ec device=0x816a subvendor=0x17aa subdevice=0x5126
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111xP UART'
    class = simple comms
    subclass = UART
none2@pci0:3:0:2: class=0x070002 rev=0x0e hdr=0x00 vendor=0x10ec device=0x816b subvendor=0x17aa subdevice=0x5126
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111xP UART'
    class = simple comms
    subclass = UART
none3@pci0:3:0:3: class=0x0c0701 rev=0x0e hdr=0x00 vendor=0x10ec device=0x816c subvendor=0x17aa subdevice=0x5126
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111xP IPMI interface'
    class = serial bus
    subclass = IPMI
ehci0@pci0:3:0:4: class=0x0c0320 rev=0x0e hdr=0x00 vendor=0x10ec device=0x816d subvendor=0x17aa subdevice=0x5126
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL811x EHCI host controller'
    class = serial bus
    subclass = USB
rtsx0@pci0:4:0:0: class=0xff0000 rev=0x01 hdr=0x00 vendor=0x10ec device=0x522a subvendor=0x17aa subdevice=0x5126
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTS522A PCI Express Card Reader'
vgapci0@pci0:5:0:0: class=0x030000 rev=0xd2 hdr=0x00 vendor=0x1002 device=0x15d8 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device = 'Picasso/Raven 2 [Radeon Vega Series / Radeon Vega Mobile Series]'
    class = display
    subclass = VGA
hdac0@pci0:5:0:1: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0x15de subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device = 'Raven/Raven2/Fenghuang HDMI/DP Audio Controller'
    class = multimedia
    subclass = HDA
none4@pci0:5:0:2: class=0x108000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15df subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Family 17h (Models 10h-1fh) Platform Security Processor'
    class = encrypt/decrypt
xhci0@pci0:5:0:3: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e0 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven USB 3.1'
    class = serial bus
    subclass = USB
xhci1@pci0:5:0:4: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e1 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Raven USB 3.1'
    class = serial bus
    subclass = USB
none5@pci0:5:0:5: class=0x048000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e2 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'ACP/ACP3X/ACP6x Audio Coprocessor'
    class = multimedia
hdac1@pci0:5:0:6: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e3 subvendor=0x17aa subdevice=0x5126
    vendor = 'Advanced Micro Devices, Inc. [AMD]'
    device = 'Family 17h/19h HD Audio Controller'
    class = multimedia
    subclass = HDA

DRM KMOD version

drm-61-kmod-6.1.69

To Reproduce

Install drm-61-kmod
kldload amdgpu

Screenshots

Panic message provided below.

Additional context

<6>[drm] amdgpu kernel modesetting enabled.
drmn0: on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
<6>[drm] initializing kernel modesetting (RAVEN 0x1002:0x15D8 0x17AA:0x5126 0xD2).
<6>[drm] register mmio base: 0xD0500000
<6>[drm] register mmio size: 524288
<6>[drm] add ip block number 0
<6>[drm] add ip block number 1
<6>[drm] add ip block number 2
<6>[drm] add ip block number 3
<6>[drm] add ip block number 4
<6>[drm] add ip block number 5
<6>[drm] add ip block number 6
<6>[drm] add ip block number 7
<6>[drm] add ip block number 8
drmn0: successfully loaded firmware image 'amdgpu/picasso_gpu_info.bin'
drmn0: Fetched VBIOS from VFCT
<6>amdgpu: ATOM BIOS: 113-PICASSO-117
drmn0: successfully loaded firmware image 'amdgpu/picasso_sdma.bin'
<6>[drm] VCN decode is enabled in VM mode
<6>[drm] VCN encode is enabled in VM mode
<6>[drm] JPEG decode is enabled in VM mode
drmn0: Trusted Memory Zone (TMZ) feature enabled
drmn0: PCIE atomic ops is not supported
<6>[drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
drmn0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
drmn0: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
drmn0: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[drm ERROR :amdgpu_bo_init] Unable to set WC memtype for the aperture base
<6>[drm] Detected VRAM RAM=2048M, BAR=2048M
<6>[drm] RAM width 128bits DDR4
<6>[drm] amdgpu: 2048M of VRAM memory ready
<6>[drm] amdgpu: 7091M of GTT memory ready.
<6>[drm] GART: num cpu pages 262144, num gpu pages 262144
<6>[drm] PCIE GART of 1024M enabled.
<6>[drm] PTB located at 0x000000F400A00000
drmn0: successfully loaded firmware image 'amdgpu/picasso_asd.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_ta.bin'
drmn0: PSP runtime database doesn't exist
drmn0: PSP runtime database doesn't exist
<6>amdgpu: hwmgr_sw_init smu backed is smu10_smu
drmn0: could not load firmware image 'amdgpu/raven_dmcu.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_pfp.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_me.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_ce.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_rlc.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_mec.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_mec2.bin'
drmn0: successfully loaded firmware image 'amdgpu/picasso_vcn.bin'
<6>[drm] Found VCN firmware Version ENC: 1.13 DEC: 2 VEP: 0 Revision: 4
drmn0: Will use PSP to load VCN firmware
<6>[drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
drmn0: RAS: optional ras ta ucode is not available
drmn0: RAP: optional rap ta ucode is not available
<6>[drm] DM_PPLIB: values for F clock
<6>[drm] DM_PPLIB: 400000 in kHz, 2924 in mV
<6>[drm] DM_PPLIB: 933000 in kHz, 3249 in mV
<6>[drm] DM_PPLIB: 1067000 in kHz, 3924 in mV
<6>[drm] DM_PPLIB: 1200000 in kHz, 4074 in mV
<6>[drm] DM_PPLIB: values for DCF clock
<6>[drm] DM_PPLIB: 300000 in kHz, 2924 in mV
<6>[drm] DM_PPLIB: 600000 in kHz, 3249 in mV
<6>[drm] DM_PPLIB: 626000 in kHz, 3924 in mV
<6>[drm] DM_PPLIB: 654000 in kHz, 4074 in mV
panic: Unregistered use of FPU in kernel
cpuid = 2
time = 1704878955
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0100b3af70
vpanic() at vpanic+0x131/frame 0xfffffe0100b3b0a0
panic() at panic+0x43/frame 0xfffffe0100b3b100
trap() at trap+0x8dc/frame 0xfffffe0100b3b220
calltrap() at calltrap+0x8/frame 0xfffffe0100b3b220
--- trap 0x16, rip = 0xffffffff8400299b, rsp = 0xfffffe0100b3b2f0, rbp = 0xfffffe0100b3b2f0 ---
dcn_bw_sync_calcs_and_dml() at dcn_bw_sync_calcs_and_dml+0xb/frame 0xfffffe0100b3b2f0
dcn10_create_resource_pool() at dcn10_create_resource_pool+0x6a6/frame 0xfffffe0100b3b490
dc_create_resource_pool() at dc_create_resource_pool+0x4c/frame 0xfffffe0100b3b4b0
dc_create() at dc_create+0x330/frame 0xfffffe0100b3b4f0
dm_hw_init() at dm_hw_init+0x3d9/frame 0xfffffe0100b3b6e0
amdgpu_device_ip_hw_init_phase2() at amdgpu_device_ip_hw_init_phase2+0x5a/frame 0xfffffe0100b3b710
amdgpu_device_ip_init() at amdgpu_device_ip_init+0x370/frame 0xfffffe0100b3b790
amdgpu_device_init() at amdgpu_device_init+0x1cdb/frame 0xfffffe0100b3b850
amdgpu_driver_load_kms() at amdgpu_driver_load_kms+0x16/frame 0xfffffe0100b3b880
amdgpu_pci_probe() at amdgpu_pci_probe+0x283/frame 0xfffffe0100b3b8c0
linux_pci_attach_device() at linux_pci_attach_device+0x478/frame 0xfffffe0100b3b910
device_attach() at device_attach+0x3b5/frame 0xfffffe0100b3b960
bus_generic_driver_added() at bus_generic_driver_added+0xa1/frame 0xfffffe0100b3b990
devclass_driver_added() at devclass_driver_added+0x39/frame 0xfffffe0100b3b9d0
devclass_add_driver() at devclass_add_driver+0x11e/frame 0xfffffe0100b3ba10
_linux_pci_register_driver() at _linux_pci_register_driver+0xcc/frame 0xfffffe0100b3ba40
amdgpu_evh() at amdgpu_evh+0x80/frame 0xfffffe0100b3ba50
module_register_init() at module_register_init+0x85/frame 0xfffffe0100b3ba80
linker_load_module() at linker_load_module+0xbf9/frame 0xfffffe0100b3bd70
kern_kldload() at kern_kldload+0x16a/frame 0xfffffe0100b3bdd0
sys_kldload() at sys_kldload+0x5c/frame 0xfffffe0100b3be00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe0100b3bf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0100b3bf30
--- syscall (304, FreeBSD ELF64, kldload), rip = 0x32d5ff8d767a, rsp = 0x32d5fe406be8, rbp = 0x32d5fe407160 ---
KDB: enter: panic

wulf7 commented 5 months ago

Test this patch:

diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c
index e73f089c84..a4c1e94f79 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c
@@ -1561,6 +1561,7 @@ void dcn_bw_notify_pplib_of_wm_ranges(

 void dcn_bw_sync_calcs_and_dml(struct dc *dc)
 {
+   DC_FP_START();
    DC_LOG_BANDWIDTH_CALCS("sr_exit_time: %f ns\n"
            "sr_enter_plus_exit_time: %f ns\n"
            "urgent_latency: %f ns\n"
@@ -1697,6 +1698,7 @@ void dcn_bw_sync_calcs_and_dml(struct dc *dc)
            dc->dcn_ip->can_vstartup_lines_exceed_vsync_plus_back_porch_lines_minus_one,
            dc->dcn_ip->bug_forcing_luma_and_chroma_request_to_same_size_fixed,
            dc->dcn_ip->dcfclk_cstate_latency);
+   DC_FP_END();

    dc->dml.soc.sr_exit_time_us = dc->dcn_soc->sr_exit_time;
    dc->dml.soc.sr_enter_plus_exit_time_us = dc->dcn_soc->sr_enter_plus_exit_time;
jownit commented 5 months ago

I tried the patch, but unfortunately it does not make any difference.

panic: Unregistered use of FPU in kernel
cpuid = 5
time = 1706869368
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0100aaff60
vpanic() at vpanic+0x135/frame 0xfffffe0100ab0090
panic() at panic+0x43/frame 0xfffffe0100ab00f0
trap() at trap+0x8dc/frame 0xfffffe0100ab0210
calltrap() at calltrap+0x8/frame 0xfffffe0100ab0210
--- trap 0x16, rip = 0xffffffff84001001, rsp = 0xfffffe0100ab02e8, rbp = 0xfffffe0100ab02f0 ---
dcn_bw_sync_calcs_and_dml() at dcn_bw_sync_calcs_and_dml+0x31/frame 0xfffffe0100ab02f0
dcn10_create_resource_pool() at dcn10_create_resource_pool+0x6a6/frame 0xfffffe0100ab0490
dc_create_resource_pool() at dc_create_resource_pool+0x4c/frame 0xfffffe0100ab04b0
dc_create() at dc_create+0x330/frame 0xfffffe0100ab04f0
dm_hw_init() at dm_hw_init+0x3d9/frame 0xfffffe0100ab06e0
amdgpu_device_ip_hw_init_phase2() at amdgpu_device_ip_hw_init_phase2+0x5a/frame 0xfffffe0100ab0710
amdgpu_device_ip_init() at amdgpu_device_ip_init+0x370/frame 0xfffffe0100ab0790
amdgpu_device_init() at amdgpu_device_init+0x1cdb/frame 0xfffffe0100ab0850
amdgpu_driver_load_kms() at amdgpu_driver_load_kms+0x16/frame 0xfffffe0100ab0880
amdgpu_pci_probe() at amdgpu_pci_probe+0x283/frame 0xfffffe0100ab08c0
linux_pci_attach_device() at linux_pci_attach_device+0x478/frame 0xfffffe0100ab0910
device_attach() at device_attach+0x3b5/frame 0xfffffe0100ab0960
bus_generic_driver_added() at bus_generic_driver_added+0xa1/frame 0xfffffe0100ab0990
devclass_driver_added() at devclass_driver_added+0x39/frame 0xfffffe0100ab09d0
devclass_add_driver() at devclass_add_driver+0x11e/frame 0xfffffe0100ab0a10
_linux_pci_register_driver() at _linux_pci_register_driver+0xcc/frame 0xfffffe0100ab0a40
amdgpu_evh() at amdgpu_evh+0x80/frame 0xfffffe0100ab0a50
module_register_init() at module_register_init+0x85/frame 0xfffffe0100ab0a80
linker_load_module() at linker_load_module+0xbf9/frame 0xfffffe0100ab0d70
kern_kldload() at kern_kldload+0x16a/frame 0xfffffe0100ab0dd0
sys_kldload() at sys_kldload+0x5c/frame 0xfffffe0100ab0e00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe0100ab0f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0100ab0f30
--- syscall (304, FreeBSD ELF64, kldload), rip = 0x331be343b69a, rsp = 0x331be1d001c8, rbp = 0x331be1d00740 ---
KDB: enter: panic

wulf7 commented 5 months ago

OK. Then let's increase the DC_FP scope:

diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c
index e73f089c84..22613780c3 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c
@@ -1561,6 +1561,7 @@ void dcn_bw_notify_pplib_of_wm_ranges(

 void dcn_bw_sync_calcs_and_dml(struct dc *dc)
 {
+   DC_FP_START();
    DC_LOG_BANDWIDTH_CALCS("sr_exit_time: %f ns\n"
            "sr_enter_plus_exit_time: %f ns\n"
            "urgent_latency: %f ns\n"
@@ -1749,4 +1750,5 @@ void dcn_bw_sync_calcs_and_dml(struct dc *dc)
    dc->dml.ip.bug_forcing_LC_req_same_size_fixed =
        dc->dcn_ip->bug_forcing_luma_and_chroma_request_to_same_size_fixed == dcn_bw_yes;
    dc->dml.ip.dcfclk_cstate_latency = dc->dcn_ip->dcfclk_cstate_latency;
+   DC_FP_END();
 }
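
As a rough illustration of what the wider DC_FP scope changes, here is a minimal user-space sketch in C. It assumes the usual FreeBSD rule that kernel code has to register its FPU use before touching floating-point state (see fpu_kern(9)), and it models DC_FP_START()/DC_FP_END() with a plain flag. All `model_*` names are hypothetical and exist only for this example. The thread does not say why the narrower bracket still panicked, so this shows the general rule and is not a diagnosis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical user-space model of the rule behind the panic.
 * None of these names exist in drm-kmod or in the FreeBSD kernel.
 */
static bool fpu_registered;    /* true between model_fp_start() and model_fp_end() */
static int unregistered_uses;  /* each increment stands for a would-be panic */

static void model_fp_start(void) { fpu_registered = true; }
static void model_fp_end(void) { fpu_registered = false; }

/* Stand-in for any instruction that touches a floating-point value. */
static float model_fp_touch(float v)
{
	if (!fpu_registered)
		unregistered_uses++;
	return v;
}

struct model_soc {
	float sr_exit_time;     /* source value, named after dc->dcn_soc->sr_exit_time */
	float sr_exit_time_us;  /* destination, named after dc->dml.soc.sr_exit_time_us */
};

/* Narrow bracket: only the logging step is covered, as in the first diff. */
int model_sync_narrow(struct model_soc *s)
{
	assert(s != NULL);
	unregistered_uses = 0;
	model_fp_start();
	(void)model_fp_touch(s->sr_exit_time);  /* stands in for the %f log call */
	model_fp_end();
	/* This copy happens after the bracket has been closed. */
	s->sr_exit_time_us = model_fp_touch(s->sr_exit_time);
	return unregistered_uses;
}

/* Wide bracket: the whole function body is covered, as in the second diff. */
int model_sync_wide(struct model_soc *s)
{
	assert(s != NULL);
	unregistered_uses = 0;
	model_fp_start();
	(void)model_fp_touch(s->sr_exit_time);
	s->sr_exit_time_us = model_fp_touch(s->sr_exit_time);
	model_fp_end();
	return unregistered_uses;
}
```

With a `struct model_soc` initialised to `{ 1.5f, 0.0f }`, `model_sync_narrow()` counts one unregistered use and `model_sync_wide()` counts none.
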
jownit commented 5 months ago

That made a difference! No panic, and my external display works. Looks good so far: X11 starts and looks the same as it did with drm-510-kmod.

evadot commented 4 months ago

Same problem on:

vgapci0@pci0:8:0:0: class=0x030000 rev=0x81 hdr=0x00 vendor=0x1002 device=0x15dd subvendor=0x1002 subdevice=0x15dd
    vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device = 'Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series]'
    class = display
    subclass = VGA

Patch works here too.

alfonsosiciliano commented 4 months ago

The patch also works for: Device: PCI 1002:15d8 Advanced Micro Devices, Inc. [AMD/ATI] Picasso/Raven 2 [Radeon Vega Series / Radeon Vega Mobile Series]

Thanks!

lbartoletti commented 4 months ago

(Maybe a different problem? But,) I also have a panic using drm-61-kmod on FreeBSD 15.0

[image: amd_panic]

It's an APU (raphael) on an AMD Ryzen 9 7900X

b.f.o issue for reference: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268394#c10

alfonsosiciliano commented 4 months ago

> (Maybe a different problem? But,) I also have a panic using drm-61-kmod on FreeBSD 15.0
>
> [image: amd_panic]
>
> It's an APU (raphael) on an AMD Ryzen 9 7900X
>
> b.f.o issue for reference: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268394#c10

Sorry, unfortunately I can't "read" the image.

Is the problem "panic: Unregistered use of FPU"? Do the core file and the kgdb backtrace point to the dcn_bw_sync_calcs_and_dml() function? Does the problem still occur with the DC_FP_START()/DC_FP_END() patch applied?

wulf7 commented 4 months ago

I pushed a slightly different version of the patch to the 6.1-lts branch. If it still works, we can update the drm-61-kmod port.

wulf7 commented 4 months ago

> (Maybe a different problem? But,) I also have a panic using drm-61-kmod on FreeBSD 15.0
>
> It's an APU (raphael) on an AMD Ryzen 9 7900X
>
> b.f.o issue for reference: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268394#c10

That is a different panic.

What you see is a driver unloading bug (a misplaced vm_phys_fictitious_unreg_range() call). It is triggered by a fatal error during initialization, not by a driver crash. In that case you should check the message buffer contents rather than the backtrace. Most probably the error is caused by an unsupported GPU or a missing firmware module. Try installing all FLAVORs of graphics/gpu-firmware-amd-kmod; if that does not help, you may try the WIP 6.6 branch.

lbartoletti commented 3 months ago

> > (Maybe a different problem? But,) I also have a panic using drm-61-kmod on FreeBSD 15.0
> >
> > It's an APU (raphael) on an AMD Ryzen 9 7900X
> >
> > b.f.o issue for reference: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268394#c10
>
> That is a different panic.
>
> What you see is a driver unloading bug (a misplaced vm_phys_fictitious_unreg_range() call). It is triggered by a fatal error during initialization, not by a driver crash. In that case you should check the message buffer contents rather than the backtrace. Most probably the error is caused by an unsupported GPU or a missing firmware module. Try installing all FLAVORs of graphics/gpu-firmware-amd-kmod; if that does not help, you may try the WIP 6.6 branch.

Thanks, and sorry for my late answer.

I have a different problem now with 61-lts: I can load amdgpu, but I get a black screen.

I was unable to compile the 6.6 branch due to a missing header:

evadot commented 1 month ago

Closing as the original bug is fixed.