Closed tsujp closed 2 years ago
dcn20_validate_bandwidth
Looks like something that was fixed by #20? Try building from git instead of ports (though I think an update recently did land into ports..)
Also try 5.5-wip from https://github.com/freebsd/drm-kmod/pull/40
@tsujp With drm-devel-kmod-5.4.62.g20201109 the 5700XT works for me. You might want to give it a try again.
However, only with 1 monitor connected. If I connect a second monitor, the system hangs.
@myfreeweb
dcn20_validate_bandwidth
Looks like something that was fixed by #20? Try building from git instead of ports (though I think an update recently did land into ports..)
Also try 5.5-wip from #40
How do I test the 5.5-wip branch? I can build it successfully and install the modules. However, when I try to load them, it fails with:
Loading kernel modules:
link_elf_obj: symbol ttm_bo_mmap_obj undefined
Warning: memory type debugfsint leaked memory on destroy (2 allocations, 80 bytes leaked).
linker_load_file: /boot/modules/drm.ko - unsupported file type
KLD amdgpu.ko: depends on drmn - not available or version mismatch
linker_load_file: /boot/modules/amdgpu.ko - unsupported file type
kldload: an error occurred while loading module amdgpu. Please check dmesg(8) for more details.
Test machine runs:
FreeBSD devil 13.0-CURRENT FreeBSD 13.0-CURRENT #0 r368710
@pbpjackd don't use the 5.5-wip branch of this repo right now, I said #40
something like:
git fetch origin pull/40/head:5.5-amd
git checkout 5.5-amd
or
git remote add myfreeweb https://github.com/myfreeweb/drm-kmod
git fetch --all
git checkout 5.5-wip-amd-pr
I will have time to give FreeBSD a go again sometime in the new year. Unfortunately, all the storage interfaces on my motherboard are taken right now, so once I have a free drive I can partition it and attempt FreeBSD again. If that's before 13-RELEASE, I'll give these things a shot. Thank you @myfreeweb and @pbpjackd.
Any updates?
@evadot Now on FBSD 13 with the latest DRM from ports I do not see any crashes whatsoever.
However, I have two issues which might be related to DRM.
Not sure if this belongs here. Maybe I should file a new ticket?
Trying on the September 16th 14-CURRENT snapshot with a PowerColor 5700XT, running into some similar issues with this card still. I've also messed around with other recent 14 snapshots with identical results.
With drm-current-kmod-5.4.144.g20210908 from ports, I am able to get amdgpu to load and can start X fine. However, in the same vein as @pbpjackd, I am getting kernel panics when I attempt to use multihead. If amdgpu loads with two displays plugged in, the kernel will immediately panic. Bizarrely enough, if I wait until after it is loaded and then plug in a second display, I get mirrored output on that display, only for the kernel to panic when I then start X. One display works perfectly fine, and this is without hw.syscons.disable=1 or any additional measures taken. Since I can get it to mirror displays on a TTY/console without a panic prior to starting X, I am somewhat tempted to see if I can get multihead working with some sort of Wayland solution at some point, since it's only when I start X that a panic is guaranteed.
With drm-devel-kmod-5.5.19.g20210909 from ports, the system will hang on loading modules with no output. Adding hw.syscons.disable=1 to /boot/loader.conf results in a small colorful strip about 20 pixels in height forming across the top of the screen at boot, and amdgpu never seems to fully take over. I am never able to blindly input anything here either, leading me to believe it is hanging at the same point. I end up needing to boot into single user mode, where I blindly input commands to remove hw.syscons.disable=1 from /boot/loader.conf so that I get output again on later boots. This occurs when using hw.syscons.disable=1 on any drm-kmod version with this card, including drm-current-kmod in ports, which otherwise works mostly fine for me. I suspect the option isn't necessary for me considering 5.4 works fine without it, and it only seems to cause issues regardless of version.
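Removing the tunable again from single user mode can be sketched roughly as below. This is a minimal sketch, not a vetted recovery procedure: the helper name `remove_tunable` is hypothetical, and it assumes the root filesystem has already been remounted read-write (on UFS that would typically be `mount -u /`).

```shell
# remove_tunable FILE NAME
# Hypothetical helper: drop every line that starts with "NAME=" from a
# loader.conf-style file. NAME is matched literally, not as a regex.
remove_tunable() {
    file=$1
    name=$2
    tmp="$file.tmp.$$"
    # index() == 1 means the line begins with "NAME="; keep all other lines.
    awk -v n="$name" 'index($0, n "=") != 1' "$file" > "$tmp" || return 1
    # Overwrite in place so the file's ownership and mode are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Assumed usage for the case described above: `remove_tunable /boot/loader.conf hw.syscons.disable`.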
I attempted to use some of the branches/PRs mentioned in https://github.com/FreeBSDDesktop/kms-drm/issues/255 and others mentioned here, specifically the ones in https://github.com/unrelentingtech/drm-kmod/ by @unrelentingtech just to see if I got any different results. My attempts include trying some of the newer unmentioned branches that've cropped up there since, like the 5.6-wip and 5.7 branches, even the DankBSD ones, since they seemed to contain work primarily on amdgpu. I have also tried running what's currently in git here for 5.5. However, it seems that with anything newer than 5.4, I consistently get the same issues mentioned above with the drm-devel-kmod version in ports.
I suspect there may have been some sort of regression somewhere at some point, as I cannot get even one display to output anything useful and not hang using any branch or repo on any version greater than the 5.4.X ones on my 5700XT. Various 5.4.X versions consistently seem to work to the extent described above. I wish I had more to show you, but due to the nature of the issue and lack of output, I don't have much to provide you with other than recounting my anecdotal experiences.
If @pbpjackd ever found a solution on getting multihead to work on any of the 5.4.X versions, I would greatly appreciate hearing back on that. Lack of working multihead is the only reason I've been messing around with all these other versions, forks, and branches.
With regard to everything I've said, I've seen no difference in behavior with manually loading amdgpu as opposed to listing it in /etc/rc.conf under kld_list.
I am happy to give any suggestions you have a shot as it seems little progress has been made on this issue in the past... nearly a year at this point, and I don't have anything better to do.
I suspect the option isn't necessary for me
True. hw.syscons.disable=1 is irrelevant after #61 was merged, which is in master, 5.5-stable and 5.6-wip. Also it was always only relevant if your screen resolution was big enough to actually cause the conflict to happen.
I suspect there may have been some sort of regression somewhere at some point
There might be! I think the new FPU context stuff has only been actually tested on Renoir. (Renoir has dcn21 while Navi10 has dcn20)
Could it be #103 not getting merged into master? (#114) Though if "I have also tried running what's currently in git here for 5.5" means 5.5-stable, that should not be the problem.
(upd: since you're saying "snapshots" you are using a GENERIC (with-INVARIANTS) kernel, right? #103 is related to that. with a GENERIC-NODEBUG this issue should be avoided. this might also explain 5.6 not working, 5.6 doesn't yet have a thing that avoids the INVARIANTS failure)
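Whether a kernel was built with INVARIANTS can be checked roughly as follows. This is a sketch under assumptions: the helper `has_invariants` is hypothetical, and feeding it from `sysctl -n kern.conftxt` only works if the kernel embeds its config (`options INCLUDE_CONFIG_FILE`) and that text lists the resolved options, which is assumed here rather than verified.

```shell
# has_invariants: read kernel config text on stdin and succeed if it
# contains an "options INVARIANTS" line (INVARIANT_SUPPORT does not count).
has_invariants() {
    grep -Eq '^options[[:space:]]+INVARIANTS([[:space:]]|$)'
}

# Assumed usage on the FreeBSD machine itself (not run here):
#   sysctl -n kern.conftxt | has_invariants && echo "debug kernel"
# Building and installing the non-debug config from /usr/src:
#   make buildkernel KERNCONF=GENERIC-NODEBUG
#   make installkernel KERNCONF=GENERIC-NODEBUG
```

If the check reports a debug kernel, switching to GENERIC-NODEBUG is the workaround suggested above until the fix lands in the branch being used.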
(upd: since you're saying "snapshots" you are using a GENERIC (with-INVARIANTS) kernel, right? #103 is related to that. with a GENERIC-NODEBUG this issue should be avoided. this might also explain 5.6 not working, 5.6 doesn't yet have a thing that avoids the INVARIANTS failure)
Wow, guess I'm out of the loop. Compiling a NODEBUG kernel fixed both the multihead issues I was having on 5.4 and got 5.5 to seemingly work normally as well. I'll have to mess around more after I get some rest. This is pretty great, thank you for getting back to me on that so soon. I was getting ready to shamefully sulk back to my old Gentoo install so I could get things done, I haven't used FreeBSD (on desktops at least) very extensively prior to these past few days.
got 5.5 to seemingly work normally as well
Well then it likely is #114. To be sure, please confirm what you mean by 5.5: master or 5.5-stable? 5.5-stable currently should work on a debug (INVARIANTS) kernel, master only will after #114. Ports releases are cut from master.
fixed […] the multihead issues I was having on 5.4
Now that is interesting, haha.
got 5.5 to seemingly work normally as well
please confirm what you mean by 5.5
I was referring to the version currently in ports as drm-devel-kmod, not one I snagged from git or the stable release. But I can test that later at some point if need be.
Might also give 5.6-wip a shot soon.
currently in ports as drm-devel-kmod
Yep, that's a tag off of master, and it doesn't have my #103 fix. Makes sense. #114 should fix.
5.15 is a long way off, but it further improves things and fixes the janky (and brittle) FPU use once and for all by doing it properly (i.e. having the non-FPU-enabled C files enable the FPU when they call into the FPU-using C files). Linux needed this to be able to support AArch64 (without -mgeneral-regs-only, compilers like to use the qN registers on AArch64 for inlined memcpy's).
(and thus, when drm-kmod eventually catches up to that, any hacks like #114 should finally be gone)
So, where are we on this? Is it solved?
Yes? This was just another report of The FPU Issue. We probably would've gotten a lot of reports from navi10 users if FPU stuff broke again :)
Unsure of the exact cause here as I was trying out FreeBSD today but unable to get past a base vanilla install; stuck at installing GPU drivers.
I installed drm-devel-kmod from ports but when attempting to load /boot/modules/amdgpu.ko either within /etc/rc.conf or manually via kldload the machine auto-restarts or hangs with no display output as I've been dropped into a debugger.
Both sysctl debug.debugger_on_panic=0 and/or sysctl debug.witness.watch=-1 followed by kldload /boot/modules/amdgpu.ko do nothing to change this.
Here are some logs, if I am missing more please instruct on how to provide them (very, very new to FreeBSD).
Very long logs spoiler
```
tbsd dumped core - see /var/crash/vmcore.1
Sat Nov 28 20:26:28 AWST 2020
FreeBSD tbsd 13.0-CURRENT FreeBSD 13.0-CURRENT #0 9e082d278b9-c254726(main): Thu Nov 26 04:50:43 UTC 2020 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
panic: dummy ctx
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Unread portion of the kernel message buffer:
taskqueue_drain with the following non-sleepable locks held:
exclusive sleep mutex vtdev (vtdev) r = 0 (0xffffffff818e6350) locked @ /usr/src/sys/dev/vt/vt_core.c:2825
stack backtrace:
#0 0xffffffff80c540b1 at witnespanic: dummy ctx
cpuid = 13
time = 1606566339
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00aa6044d8
vpanic() at vpanic+0x181/frame 0xfffffe00aa604528
panic() at panic+0x43/frame 0xfffffe00aa604588
fpu_kern_leave() at fpu_kern_leave+0x21c/frame 0xfffffe00aa6045b8
dcn20_validate_bandwidth() at dcn20_validate_bandwidth+0x15f/frame 0xfffffe00aa6045f0
dc_validate_global_state() at dc_validate_global_state+0x2ce/frame 0xfffffe00aa604650
amdgpu_dm_atomic_check() at amdgpu_dm_atomic_check+0xff0/frame 0xfffffe00aa604930
drm_atomic_check_only() at drm_atomic_check_only+0x400/frame 0xfffffe00aa6049b0
drm_atomic_commit() at drm_atomic_commit+0x13/frame 0xfffffe00aa6049d0
drm_client_modeset_commit_atomic() at drm_client_modeset_commit_atomic+0x148/frame 0xfffffe00aa604a40
drm_client_modeset_commit_force() at drm_client_modeset_commit_force+0x69/frame 0xfffffe00aa604a90
drm_fb_helper_restore_fbdev_mode_unlocked() at drm_fb_helper_restore_fbdev_mode_unlocked+0x7a/frame 0xfffffe00aa604ac0
taskqueue_run_locked() at taskqueue_run_locked+0xaa/frame 0xfffffe00aa604b40
taskqueue_thread_loop() at taskqueue_thread_loop+0x94/frame 0xfffffe00aa604b70
fork_exit() at fork_exit+0x80/frame 0xfffffe00aa604bb0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00aa604bb0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 2m22s
Dumping 1213 out of 32687 MB:..2%..11%..22%..31%..41%..51%..61%..72%..81%..91%
No symbol "zombproc" in current context.
Reading symbols from /boot/kernel/acpi_wmi.ko...Reading symbols from /usr/lib/debug//boot/kernel/acpi_wmi.ko.debug...done.
done.
Loaded symbols for /boot/kernel/acpi_wmi.ko
Reading symbols from /boot/kernel/if_iwm.ko...Reading symbols from /usr/lib/debug//boot/kernel/if_iwm.ko.debug...done.
done.
Loaded symbols for /boot/kernel/if_iwm.ko
Reading symbols from /boot/kernel/iwm3168fw.ko...Reading symbols from /usr/lib/debug//boot/kernel/iwm3168fw.ko.debug...done.
done.
Loaded symbols for /boot/kernel/iwm3168fw.ko
Reading symbols from /boot/kernel/intpm.ko...Reading symbols from /usr/lib/debug//boot/kernel/intpm.ko.debug...done.
done.
Loaded symbols for /boot/kernel/intpm.ko
Reading symbols from /boot/kernel/smbus.ko...Reading symbols from /usr/lib/debug//boot/kernel/smbus.ko.debug...done.
done.
Loaded symbols for /boot/kernel/smbus.ko
Reading symbols from /boot/kernel/uhid.ko...Reading symbols from /usr/lib/debug//boot/kernel/uhid.ko.debug...done.
done.
Loaded symbols for /boot/kernel/uhid.ko
Reading symbols from /boot/kernel/wmt.ko...Reading symbols from /usr/lib/debug//boot/kernel/wmt.ko.debug...done.
done.
Loaded symbols for /boot/kernel/wmt.ko
Reading symbols from /boot/kernel/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ums.ko
Reading symbols from /boot/kernel/ng_ubt.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_ubt.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_ubt.ko Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done. done. Loaded symbols for /boot/kernel/netgraph.ko Reading symbols from /boot/kernel/ng_hci.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_hci.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_hci.ko Reading symbols from /boot/kernel/ng_bluetooth.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_bluetooth.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_bluetooth.ko Reading symbols from /boot/kernel/snd_uaudio.ko...Reading symbols from /usr/lib/debug//boot/kernel/snd_uaudio.ko.debug...done. done. Loaded symbols for /boot/kernel/snd_uaudio.ko Reading symbols from /boot/kernel/ng_l2cap.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_l2cap.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_l2cap.ko Reading symbols from /boot/kernel/ng_btsocket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_btsocket.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_btsocket.ko Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_socket.ko Reading symbols from /boot/modules/amdgpu.ko...done. Loaded symbols for /boot/modules/amdgpu.ko Reading symbols from /boot/modules/drm.ko...done. Loaded symbols for /boot/modules/drm.ko Reading symbols from /boot/kernel/linuxkpi.ko...Reading symbols from /usr/lib/debug//boot/kernel/linuxkpi.ko.debug...done. done. Loaded symbols for /boot/kernel/linuxkpi.ko Reading symbols from /boot/kernel/backlight.ko...Reading symbols from /usr/lib/debug//boot/kernel/backlight.ko.debug...done. done. Loaded symbols for /boot/kernel/backlight.ko Reading symbols from /boot/modules/linuxkpi_gplv2.ko...done. 
Loaded symbols for /boot/modules/linuxkpi_gplv2.ko Reading symbols from /boot/kernel/lindebugfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/lindebugfs.ko.debug...done. done. Loaded symbols for /boot/kernel/lindebugfs.ko Reading symbols from /boot/modules/ttm.ko...done. Loaded symbols for /boot/modules/ttm.ko Reading symbols from /boot/modules/amdgpu_navi10_gpu_info_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_gpu_info_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_sos_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_sos_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_asd_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_asd_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_smc_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_smc_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_pfp_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_pfp_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_me_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_me_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_ce_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_ce_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_rlc_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_rlc_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_mec_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_mec_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_mec2_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_mec2_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_sdma_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_sdma_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_sdma1_bin.ko...done. Loaded symbols for /boot/modules/amdgpu_navi10_sdma1_bin.ko Reading symbols from /boot/modules/amdgpu_navi10_vcn_bin.ko...done. 
Loaded symbols for /boot/modules/amdgpu_navi10_vcn_bin.ko #0 doadump (textdump=1) at src/sys/amd64/include/pcpu_aux.h:55 55 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu, (kgdb) #0 doadump (textdump=1) at src/sys/amd64/include/pcpu_aux.h:55 #1 0xffffffff80be4ea0 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486 #2 0xffffffff80be5300 in vpanic (fmt=