Open rkitover opened 3 years ago
Well, do you have the firmware installed? It's packaged as `gpu-firmware-kmod`.
Yes, I have the latest `drm-devel-kmod` and `gpu-firmware-kmod` installed from the ports git.
Weird. Can you manually `kldload amdgpu_navi10_gpu_info_bin`?
I will try right now.
So that loads fine, but it seems that amdgpu dies with a backtrace, which I need to get with `hw.syscons.disable=1`, which might be tricky.
Drop the `syscons.disable` and just make sure the loader is not in the highest/"native" resolution (check with the `gop` command in the loader prompt; set `efi_max_resolution="1080p"` or `efi_max_resolution="720p"` or whatever would make it smaller in `loader.conf`). There should be no framebuffer problems in that case.
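For reference, the loader setting suggested above might look like this in `/boot/loader.conf`. This is a minimal sketch: the two values are the ones named in the comment, and which one is low enough to stay below the native resolution depends on the display.

```sh
# /boot/loader.conf (sketch; pick a value below the display's native resolution)
efi_max_resolution="1080p"
# efi_max_resolution="720p"   # alternative if 1080p is still the native mode
```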
Nice, thank you, I will try that.
I set the resolution to 1440p and the framebuffer started at 1600x1200.
I ran:
kldload amdgpu_navi10_gpu_info_bin
kldload amdgpu
I got this backtrace:
Here is my system information:
Motherboard: Supermicro H11DSi
CPUs: 2x 32-core first-gen AMD EPYC
RAM: 128 GB ECC
GPUs: 2x 5700xt
FreeBSD: latest current git
Packages: latest from pkg
Ports: latest git versions of `drm-devel-kmod` and `gpu-firmware-kmod`
If you would like to debug this, I'll be happy to do whatever is needed.
> 2x 5700xt
huh. well the panic is that the driver is trying to create `/dev/dri/renderD128` twice (error code 17 is `EEXIST`). Looks like the first GPU failed to attach for some actual, more serious reason (`drmn0 attach returned 2` at the very top line), and we don't clean everything up in that case, so the second GPU fails with that.
Can you scroll up (with Scroll Lock) in the console to see what's up with the first GPU?
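If the machine stays up after the failed attach, the same information can also be pulled from the kernel message buffer instead of scrolling the console. A minimal sketch follows; the helper name `attach_failures` is hypothetical, and the pattern assumes the `drmn0 attach returned 2` wording quoted above.

```sh
# Hypothetical helper: keep only lines that report a drmn attach result.
# Assumes the "drmnN attach returned <errno>" wording quoted in this thread.
attach_failures() {
    grep -E 'drmn[0-9]+ attach returned [0-9]+'
}

# Typical use on the affected machine:
#   dmesg | attach_failures
```

On FreeBSD, errno 2 is `ENOENT` and errno 17 is `EEXIST`, so a first-GPU line ending in `returned 2` would point at something the driver could not find rather than at the later duplicate-device panic.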
This time I got somewhat different behavior. I ran:
kldload amdgpu
and it initialized the first card and loaded the firmware; I could see the firmware modules in `kldstat`. However, it failed to initialize the second card.
Here is the first card being successfully initialized:
Here is the second card failing to initialize:
I then tried running xorg with amdgpu to see if it would start on the first card, but I got a panic:
hm, same trace as https://github.com/freebsd/drm-kmod/issues/36 with the vm_page_busy_acquire
Try building the newer driver from https://github.com/freebsd/drm-kmod/pull/40 (https://github.com/myfreeweb/drm-kmod/tree/5.5-wip-amd-pr)
I built and installed that branch, and now I get this backtrace on `kldload amdgpu`. This is the first page:
and this is the second page:
welcome to the navi FPU kernel context issues suffering club :D (https://github.com/freebsd/drm-kmod/issues/42 etc)
Please try again (`git pull` to get the latest commit https://github.com/freebsd/drm-kmod/pull/40/commits/c4cc8385313833aeea34b702a719c2a1f819d40a)
Thank you, that seems to have gotten further, but still panics:
Looks to be dying in the same function.
Oh, right. *facepalm* Try again https://github.com/freebsd/drm-kmod/pull/40/commits/7693e3a492da031171f33cd2d239392a6ae861f1
Just tried this, module loads and initializes the first GPU, fails to initialize the second GPU, then locks up hard when I start xorg.
hm. Does it work with only one GPU installed?
Will try today!
Miraculously, it works. I am typing this in KDE on amdgpu now!
Some problems with KDE, but I'll try to work on that.
So what are the next steps here?
I can play with the code a bit if you tell me where to look and what to look at.
> Oh, right. *facepalm* Try again freebsd/drm-kmod@7693e3a
Works for Renoir 4750G.
Update: drm-kmod source, https://github.com/unrelentingtech/drm-kmod/commits/5.5-wip-amd-pr (7693e3a492da031171f33cd2d239392a6ae861f1)
FreeBSD current, https://github.com/freebsd/freebsd-src, before commit 50180d2b52cc16ecb6a6617fdc53f5d83c71a8b4 (included), and patched with commit 9f47eeffa3cfdcb512e2011fb00fc23c7c1a7d75 for this issue.
As a temporary measure, since I do want to put my second GPU back in, is there some way, perhaps in loader.conf, to disable my second GPU so that amdgpu does not try to initialize it?
Maybe in /boot/device.hints
https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/device-hints.html
Something like `hint.drmn.1.disabled=1` or `hint.drm.1.disabled=1` could work? (not sure what exactly the driver name would be)
I will try, thank you very much.
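One way to try the hint suggested above without adding it twice is sketched below. The helper name `add_loader_hint` and its defaults are hypothetical; the default hint is the first of the two names proposed, and the driver name is an assumption at this point in the thread.

```sh
# Hypothetical helper: append a hint line to a loader config file, once.
# The default hint and path are assumptions taken from the suggestion above.
add_loader_hint() {
    hint="${1:-hint.drmn.1.disabled=1}"
    conf="${2:-/boot/loader.conf}"
    # -x matches the whole line, -F treats the hint as a fixed string
    grep -qxF "$hint" "$conf" 2>/dev/null || printf '%s\n' "$hint" >> "$conf"
}

# Typical use (as root), then reboot:
#   add_loader_hint
#   add_loader_hint "hint.drm.1.disabled=1"   # the other candidate name
```

Passing the file as the second argument makes it easy to test against a scratch copy before touching the real `loader.conf`.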
Before I put the second GPU back in, I determined that the correct `loader.conf` invocation is as you said:
hint.drmn.1.disabled=1
With the second GPU back in, it initializes the first GPU but panics when xorg is being started:
Huh. So reproducibly, always when the second GPU is present (but not even initialized, no dmesg lines for it), there are `vm_fault` panics, but they don't happen without the second GPU? I guess something in our memory code doesn't handle multiple GPUs :/
Well, I could try playing with the code to at least get more information about this. Do you have any tips for working with this codebase, debugging, etc., and any specific places I should look at to start?
Also I wanted to say that I really appreciate all your help on this, can I buy you a case of beer?
> tips for working with this codebase and debugging etc
https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> any specific places I should look at
Well the functions in the backtrace here I guess..
(Really, first just confirm that this is reproducible, i.e. every time you have a second GPU this crash happens and every time you don't have it, it doesn't.)
> can I buy you a case of beer?
I don't drink :P
Sure that's easy enough to do, I can just unplug the power cable.
> I don't drink :P
I meant like, do you have a PayPal or Patreon or whatever link for sponsoring FreeBSD development, using your beverage of choice?
@myfreeweb I have once again verified that this is the case. If I unplug the pcie power from my second GPU then I can start xorg on amdgpu.
Also I realized that I can just unplug the PCIe power when I want to boot FreeBSD, at least for now.
This is a huge improvement over my previous situation, where my only choices were an NVIDIA GPU or scfb. Thank you very much.
KDE does not seem to be working very well for me here. I'll do the necessary follow-up work on that, but in the meantime, if I can't fix it, I'll just install xfce so I have a working desktop and can start playing with FreeBSD as a daily driver.
Once I do that, I will look at this backtrace and see if I can do anything.
I get this in /var/log/messages:
I am using latest git current with latest git drm-devel-kmod and gpu-firmware-kmod from ports.
My card is a 5700xt.
Any help much appreciated.