freebsd / drm-kmod

drm driver for FreeBSD

amdgpu not doing anything #266

Closed magicalskye closed 6 months ago

magicalskye commented 7 months ago

Describe the bug

I can technically load the module, but it does not actually give me a usable video card driver.

FreeBSD version

FreeBSD alfred 14.0-RELEASE FreeBSD 14.0-RELEASE #0 releng/14.0-n265380-f9716eee8ab4: Fri Nov 10 05:57:23 UTC 2023     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1400097 1400097

PCI Info

pciconf -lv

hostb0@pci0:0:0:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1630 subvendor=0x1022 subdevice=0x1630
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne Root Complex'
    class      = bridge
    subclass   = HOST-PCI
none0@pci0:0:0:2: class=0x080600 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1631 subvendor=0x1022 subdevice=0x1631
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne IOMMU'
    class      = base peripheral
    subclass   = IOMMU
hostb1@pci0:0:1:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1632 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir PCIe Dummy Host Bridge'
    class      = bridge
    subclass   = HOST-PCI
pcib1@pci0:0:1:1: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x1633 subvendor=0x1022 subdevice=0x1453
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir PCIe GPP Bridge'
    class      = bridge
    subclass   = PCI-PCI
hostb2@pci0:0:2:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1632 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir PCIe Dummy Host Bridge'
    class      = bridge
    subclass   = HOST-PCI
pcib4@pci0:0:2:1: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x1634 subvendor=0x1022 subdevice=0x1453
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne PCIe GPP Bridge'
    class      = bridge
    subclass   = PCI-PCI
hostb3@pci0:0:8:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1632 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir PCIe Dummy Host Bridge'
    class      = bridge
    subclass   = HOST-PCI
pcib7@pci0:0:8:1: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x1635 subvendor=0x1022 subdevice=0x1635
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Internal PCIe GPP Bridge to Bus'
    class      = bridge
    subclass   = PCI-PCI
intsmb0@pci0:0:20:0: class=0x0c0500 rev=0x51 hdr=0x00 vendor=0x1022 device=0x790b subvendor=0x1458 subdevice=0x5001
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'FCH SMBus Controller'
    class      = serial bus
    subclass   = SMBus
isab0@pci0:0:20:3: class=0x060100 rev=0x51 hdr=0x00 vendor=0x1022 device=0x790e subvendor=0x1458 subdevice=0x5001
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'FCH LPC Bridge'
    class      = bridge
    subclass   = PCI-ISA
hostb4@pci0:0:24:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x166a subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 0'
    class      = bridge
    subclass   = HOST-PCI
hostb5@pci0:0:24:1: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x166b subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 1'
    class      = bridge
    subclass   = HOST-PCI
hostb6@pci0:0:24:2: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x166c subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 2'
    class      = bridge
    subclass   = HOST-PCI
hostb7@pci0:0:24:3: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x166d subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 3'
    class      = bridge
    subclass   = HOST-PCI
hostb8@pci0:0:24:4: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x166e subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 4'
    class      = bridge
    subclass   = HOST-PCI
hostb9@pci0:0:24:5: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x166f subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 5'
    class      = bridge
    subclass   = HOST-PCI
hostb10@pci0:0:24:6: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1670 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 6'
    class      = bridge
    subclass   = HOST-PCI
hostb11@pci0:0:24:7: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1671 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Cezanne Data Fabric; Function 7'
    class      = bridge
    subclass   = HOST-PCI
pcib2@pci0:1:0:0: class=0x060400 rev=0xc1 hdr=0x01 vendor=0x1002 device=0x1478 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Navi 10 XL Upstream Port of PCI Express Switch'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci0:2:0:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1002 device=0x1479 subvendor=0x1002 subdevice=0x1479
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Navi 10 XL Downstream Port of PCI Express Switch'
    class      = bridge
    subclass   = PCI-PCI
vgapci0@pci0:3:0:0: class=0x030000 rev=0xc1 hdr=0x00 vendor=0x1002 device=0x743f subvendor=0x1043 subdevice=0x05db
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Navi 24 [Radeon RX 6400/6500 XT/6500M]'
    class      = display
    subclass   = VGA
hdac0@pci0:3:0:1: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0xab28 subvendor=0x1002 subdevice=0xab28
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Navi 21/23 HDMI/DP Audio Controller'
    class      = multimedia
    subclass   = HDA
xhci0@pci0:4:0:0: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x43ee subvendor=0x1b21 subdevice=0x1142
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = '500 Series Chipset USB 3.1 XHCI Controller'
    class      = serial bus
    subclass   = USB
ahci0@pci0:4:0:1: class=0x010601 rev=0x00 hdr=0x00 vendor=0x1022 device=0x43eb subvendor=0x1b21 subdevice=0x1062
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = '500 Series Chipset SATA Controller'
    class      = mass storage
    subclass   = SATA
pcib5@pci0:4:0:2: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x43e9 subvendor=0x1b21 subdevice=0x0201
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = '500 Series Chipset Switch Upstream Port'
    class      = bridge
    subclass   = PCI-PCI
pcib6@pci0:5:9:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x43ea subvendor=0x1b21 subdevice=0x3308
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    class      = bridge
    subclass   = PCI-PCI
re0@pci0:6:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x1458 subdevice=0xe000
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
none1@pci0:7:0:0: class=0x130000 rev=0xc9 hdr=0x00 vendor=0x1022 device=0x145a subvendor=0x1458 subdevice=0xd000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Zeppelin/Raven/Raven2 PCIe Dummy Function'
    class      = non-essential instrumentation
hdac1@pci0:7:0:1: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0x1637 subvendor=0x1002 subdevice=0x1637
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Renoir Radeon High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA
none2@pci0:7:0:2: class=0x108000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15df subvendor=0x1022 subdevice=0x15df
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 10h-1fh) Platform Security Processor'
    class      = encrypt/decrypt
xhci1@pci0:7:0:3: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1639 subvendor=0x1458 subdevice=0x5007
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne USB 3.1'
    class      = serial bus
    subclass   = USB
xhci2@pci0:7:0:4: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1639 subvendor=0x1458 subdevice=0x5007
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne USB 3.1'
    class      = serial bus
    subclass   = USB
hdac2@pci0:7:0:6: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e3 subvendor=0x1458 subdevice=0xa194
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h/19h HD Audio Controller'
    class      = multimedia
    subclass   = HDA
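
Editor's note: most of the PCI listing is unrelated to the problem; only the display-class entry (vgapci0, 0x1002:0x743f) matters for amdgpu. As a minimal sketch, the filter below pulls display-class devices out of pciconf -l style output. The function name display_devices is made up for illustration, and the script assumes one device per line, with the selector first and the class=, vendor= and device= fields on the same line, as pciconf prints them.

```shell
#!/bin/sh
# Hypothetical helper: list PCI display-class devices from `pciconf -l` output.
# PCI base class 0x03 is "display controller", so class=0x03xxxx marks a GPU.
display_devices() {
    awk '
    /class=0x03[0-9a-f][0-9a-f][0-9a-f][0-9a-f]/ {
        vendor = ""; device = ""
        for (i = 1; i <= NF; i++) {
            # "vendor=" and "device=" are 7 characters; keep the 0x.... value.
            if ($i ~ /^vendor=0x/) vendor = substr($i, 8)
            if ($i ~ /^device=0x/) device = substr($i, 8)
        }
        # The first field looks like "vgapci0@pci0:3:0:0:"; keep the driver name.
        split($1, sel, "@")
        print sel[1], vendor ":" device
    }' "$@"
}
```

Usage on a live system would be `pciconf -l | display_devices`; the vendor:device pair it prints is the same one the driver reports when it starts initializing.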

DRM KMOD version

drm-515-kmod 5.15.118_1

To Reproduce

kldload amdgpu

Syslog

Nov 23 04:15:53 alfred kernel: [drm] amdgpu kernel modesetting enabled.
Nov 23 04:15:53 alfred kernel: drmn0: <drmn> on vgapci0
Nov 23 04:15:53 alfred kernel: vgapci0: child drmn0 requested pci_enable_io
Nov 23 04:15:53 alfred syslogd: last message repeated 1 times
Nov 23 04:15:53 alfred kernel: [drm] initializing kernel modesetting (BEIGE_GOBY 0x1002:0x743F 0x1043:0x05DB 0xC1).
Nov 23 04:15:53 alfred kernel: drmn0: Trusted Memory Zone (TMZ) feature not supported
Nov 23 04:15:53 alfred kernel: [drm] register mmio base: 0xFCB00000
Nov 23 04:15:53 alfred kernel: [drm] register mmio size: 1048576
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 0 <nv_common>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 1 <gmc_v10_0>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 2 <navi10_ih>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 3 <psp>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 4 <smu>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 5 <gfx_v10_0>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 6 <sdma_v5_2>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 7 <dm>
Nov 23 04:15:53 alfred kernel: [drm] add ip block number 8 <vcn_v3_0>
Nov 23 04:15:53 alfred kernel: drmn0: Fetched VBIOS from ROM BAR
Nov 23 04:15:53 alfred kernel: amdgpu: ATOM BIOS: 115-D632BP0-100
Nov 23 04:15:53 alfred kernel: [drm] VCN(0) decode is enabled in VM mode
Nov 23 04:15:53 alfred kernel: [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
Nov 23 04:15:53 alfred kernel: drmn0: VRAM: 4080M 0x0000008000000000 - 0x00000080FEFFFFFF (4080M used)
Nov 23 04:15:53 alfred kernel: drmn0: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
Nov 23 04:15:53 alfred kernel: drmn0: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
Nov 23 04:15:53 alfred kernel: [drm] Detected VRAM RAM=4080M, BAR=256M
Nov 23 04:15:53 alfred kernel: [drm] RAM width 64bits GDDR6
Nov 23 04:15:53 alfred kernel: [drm] amdgpu: 4080M of VRAM memory ready
Nov 23 04:15:53 alfred kernel: [drm] amdgpu: 4080M of GTT memory ready.
Nov 23 04:15:53 alfred kernel: sysctl_warn_reuse: can't re-use a leaf (sys.device.drmn0.mem_info_preempt_used)!
Nov 23 04:15:53 alfred kernel: [drm ERROR :amdgpu_preempt_mgr_init] Failed to create device file mem_info_preempt_used
Nov 23 04:15:53 alfred kernel: [drm ERROR :amdgpu_ttm_init] Failed initializing PREEMPT heap.
Nov 23 04:15:53 alfred kernel: [drm ERROR :amdgpu_device_ip_init] sw_init of IP block <gmc_v10_0> failed -12
Nov 23 04:15:53 alfred kernel: drmn0: amdgpu_device_ip_init failed
Nov 23 04:15:53 alfred kernel: drmn0: Fatal error during GPU init
Nov 23 04:15:53 alfred kernel: drmn0: amdgpu: finishing device.
Nov 23 04:15:53 alfred kernel: device_attach: drmn0 attach returned 12
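
Editor's note: two things in this log are worth pulling out when triaging: the chip codename on the "initializing kernel modesetting" line (BEIGE_GOBY here) and the chain of [drm ERROR ...] lines that ends in attach returning 12 (errno 12 is ENOMEM). The sketch below extracts both from a log file or from stdin. The function names drm_codename and drm_errors are hypothetical, and the patterns assume the message format shown above.

```shell
#!/bin/sh
# Hypothetical helpers for triaging an amdgpu load failure from syslog text.

# Print the chip codename from the first
# "initializing kernel modesetting (NAME 0x....)" line.
drm_codename() {
    sed -n 's/.*initializing kernel modesetting (\([A-Z0-9_]*\) .*/\1/p' "$@" | head -n 1
}

# Print every DRM error line, in order, so the first failure is easy to spot.
drm_errors() {
    grep -F '[drm ERROR' "$@"
}
```

Usage, assuming the default log location: `drm_codename /var/log/messages` and `drm_errors /var/log/messages`. The first error line printed is usually the one to search for, since later ones tend to be consequences of it.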

Additional context

I made the mistake of trying to update all my packages with portmaster. It installed drm-515-kmod, which completely refused to load and froze my system instead. I thought it might just be a broken version again, so I tried the binary pkg instead, then I tried it from GitHub; same thing. So I tried to downgrade to drm-510-kmod, including the exact version that used to work before. That loads with the “modesetting enabled” message, but it is basically useless: it does not let me start Xorg, nor does it make the console pretty like it normally does (I am aware that I sound somewhat braindead here, but “pretty console” and “outputs on both screens” are the only ways I know of to tell whether the video card driver is working properly without the added complexity of X, sorry). After that I did a world/kernel upgrade in the hope that this would change something, but it did not.

So I concluded that I must have somehow messed up my entire system beyond repair and did a totally fresh installation, installing nothing except a text editor and drm-kmod. First I pkg installed the 510: same behaviour, it loads fine but does not actually do anything. Then I pkg removed that and pkg installed the 515. On this fresh installation it loads without freezing my system but ALSO does not do anything AND it gives me some questionable syslog messages, as seen above. Unfortunately, in wiping my system I also wiped the version that used to work, and I don’t remember which commit it was, so I can’t test it anymore. Since it stopped working for unclear reasons, I don’t see much hope in that anyway.

Also, when I do kldunload amdgpu, I get:

Nov 23 04:28:44 alfred kernel: Warning: memory type drm_managed leaked memory on destroy (9 allocations, 576 bytes leaked).
Nov 23 04:28:44 alfred kernel: Warning: memory type debugfsint leaked memory on destroy (2 allocations, 80 bytes leaked).

Istg I am never doing updates again *cries*. I am happy to help with debugging, and I swear I’m not as stupid as I sound; I’m just very new to FreeBSD and don’t really know how all the pieces go together yet.

magicalskye commented 7 months ago

Never mind, I was just being stupid again, sorry.

Turns out:

After installing the correct firmware I can load the module now and it actually works (well, I get the pretty screens at least; whether or not it’s actually stable remains to be seen, I don’t trust the memory management).

But when I do kldunload then, it just completely crashes my computer. Black screen, no response on network, RIP.

So maybe the memory preempt error is still good for something?
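
Editor's note: the comment above does not say which firmware package fixed the problem. As a rough sketch for checking whether firmware for a given chip is present before loading amdgpu, the helper below looks for firmware modules by codename. The function name has_amdgpu_firmware is made up, and both the amdgpu_<codename>_* file naming and the /boot/modules default directory are assumptions about how the gpu-firmware packages lay out their files, so verify them on your own system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if at least one firmware file for the given
# chip codename exists. ASSUMPTION: firmware modules are named
# amdgpu_<lowercase codename>_* and live in /boot/modules by default.
has_amdgpu_firmware() {
    codename=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    dir=${2:-/boot/modules}
    for f in "$dir"/amdgpu_"$codename"_*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

Usage: `has_amdgpu_firmware BEIGE_GOBY || echo "no firmware found for this chip"`, with the codename taken from the "initializing kernel modesetting" line in the syslog.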

valpackett commented 7 months ago

> when I do kldunload then, it just completely crashes my computer

Not sure we ever got unload working on amdgpu, only i915 IIRC