amshafer / nvidia-driver

Fork of the Nvidia FreeBSD driver to port the nvidia-drm.ko module from Linux

nvidia-driver (535.146.02) in parallel with drm-61-kmod #21

Closed: dasTor closed this issue 4 months ago

dasTor commented 7 months ago

Hi,

I have to run 15-CURRENT to get my integrated graphics (Alder Lake) to work, so I installed drm-61-kmod, but it fails to run with this repo.

I have hybrid graphics and have already tried building the freebsd/drm-kmod 6.1-lts branch from source, which runs fine. But when I build this repo against it, I can load nvidia-drm, but every Wayland compositor refuses to work. I can attach the errors I get later, if you want them.

Is there anything I am missing, or is an nvidia-drm-61-kmod port planned or in the making?

Regards, Daniel

amshafer commented 7 months ago

Yes, an nvidia-drm-61-kmod port is planned. It shouldn't be too long, and it should happen whenever nvidia-driver gets bumped to 550.

In the meantime you can follow the build instructions here to build it yourself. You'll also have to check out the drm-kmod tree at the 6.1-lts branch to match your installed port, and manually apply these patches to get it to build. Sorry for the inconvenience; this will be taken care of in future port improvements.

amshafer commented 7 months ago

Actually I just went ahead and pushed a version for nvidia-drm-61-kmod, so you can give that a try: https://reviews.freebsd.org/D43987

dasTor commented 7 months ago

Thanks for your effort. Unfortunately the port produces the same result for me: nvidia-modeset and nvidia load fine, and nvidia-drm loads too, but it instantly freezes the system as soon as it is used, for example by the window manager (sway).

In case you are interested, I have attached some hopefully useful logs. Or should I write to freebsd-current?

dmesg.txt sway-i915kms.txt sway-nvidia.txt

amshafer commented 7 months ago

According to dmesg nvidia-drm.ko loads just fine.

Any other details you can provide? Is it freezing on kernel module load, on sway startup, etc.? Is this on the laptop screen or an external monitor? I'm assuming you have the laptop in hybrid graphics mode and not NVIDIA-only mode?

Note there are still some rough edges for sway with NVIDIA: with the default GL backend there can be some ugly tearing, you'll have to disable hardware cursors, and the Vulkan backend doesn't work unless you use the 550 NVIDIA driver (not yet in ports). Suspend/resume of any Wayland apps on the NVIDIA GPU won't work either.

dasTor commented 7 months ago

I'm dual-booting with Arch, so I know the NVIDIA hassle. These logs are all I have: since the complete system freezes, I watched /var/log/messages and /var/log/sway via ssh. The BIOS is in Dual-Mode, and sway + NVIDIA works fine on Arch. Is there any other good way to get more useful information or debug logs?

amshafer commented 7 months ago

So it freezes on sway startup? Have you checked that you can fully boot into the system without a desktop, then start sway manually and see it crash?

I would configure the kernel to panic and write a crash dump automatically instead of dropping into the debugger; hopefully that will catch whatever is going wrong:

# auto crashdump
kern.coredump=1
debug.minidump=1
debug.debugger_on_panic=0
debug.kdb.break_to_debugger=0

Then ensure you have a swap partition you can dump to with dumpon -l and you should be set. I'm guessing it's a panic coming from drm-kmod or nvidia-drm, but hard to say without data.
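
For anyone following along, here is a minimal sketch for double-checking that the four settings quoted above actually made it into a config file. It assumes they are kept one per line, without extra spaces, in a sysctl.conf-style file (/etc/sysctl.conf is an assumption, the comment above does not name a file); `missing_crash_sysctls` is a hypothetical helper name, not part of FreeBSD.

```sh
#!/bin/sh
# Hypothetical helper: print every expected "name=value" setting that is
# NOT present in the given sysctl.conf-style file. The settings are the
# ones quoted in the comment above; the file location is an assumption.
missing_crash_sysctls() {
    conf="$1"
    for setting in \
        kern.coredump=1 \
        debug.minidump=1 \
        debug.debugger_on_panic=0 \
        debug.kdb.break_to_debugger=0
    do
        # -x: match the whole line, -F: fixed string, -q: no output
        grep -qxF "$setting" "$conf" 2>/dev/null || echo "$setting"
    done
}

# Example call (assumed location of the settings):
# missing_crash_sysctls /etc/sysctl.conf
```

If the function prints nothing, all four lines are present. The dump device is a separate matter: on FreeBSD it is commonly set with dumpdev="AUTO" in /etc/rc.conf, and `dumpon -l` then shows which device is configured.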

dasTor commented 7 months ago

The order is:
Booting (with i915kms enabled) - works
Starting sway after boot - works
Booting and then kldloading nvidia-drm - works
Starting sway when no output is attached to the nvidia card - works since yesterday
Starting sway when an output is attached to the nvidia card, or running sway and then attaching an output - panic

It took me a while to get the crash dumps working. Where can I upload the dump, or should I attach it here? It's 950 MB tar.gzipped. I've attached some info and other files; maybe they already tell you what's wrong.

sway-start.txt kldload-nvidia.txt dmesg-hdmi-plugged-in.txt info.txt
core.txt - is empty
core.1 - is too big to attach

amshafer commented 7 months ago

Thanks for all the details. The core file itself isn't going to be that useful for me, since I don't have your debug symbols. You can open it with kgdb and give me the backtrace though, that would be the most helpful. You can just post it in a comment here as an inline code block.

dasTor commented 7 months ago

Thanks for having a look at it.

The inline code looks unformatted, so I also attached it as a text file: kgdb.txt

(No debugging symbols found in /boot/modules/nvidia-drm.ko)
Reading symbols from /boot/modules/nvidia.ko...
(No debugging symbols found in /boot/modules/nvidia.ko)
Reading symbols from /boot/modules/nvidia-modeset.ko...
(No debugging symbols found in /boot/modules/nvidia-modeset.ko)
__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
57      __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) backtrace
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=1) at /usr/src/sys/kern/kern_shutdown.c:403
#2  0xffffffff80b53e50 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:521
#3  0xffffffff80b54352 in vpanic (fmt=0xffffffff811da3dc "%s", ap=ap@entry=0xfffffe0245e408c0)
    at /usr/src/sys/kern/kern_shutdown.c:973
#4  0xffffffff80b541a3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:889
#5  0xffffffff81059aaf in trap_fatal (frame=0xfffffe0245e409c0, eva=32)
    at /usr/src/sys/amd64/amd64/trap.c:950
#6  0xffffffff81059b5e in trap_pfault (frame=0xfffffe0245e409c0, usermode=false, signo=<optimized out>,
    ucode=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:758
#7  <signal handler called>
#8  0xffffffff85549dd1 in nv_drm_gem_prime_import () from /boot/modules/nvidia-drm.ko
#9  0xffffffff85355744 in drm_gem_prime_fd_to_handle () from /boot/modules/drm.ko
#10 0xffffffff8534988d in drm_ioctl_kernel () from /boot/modules/drm.ko
#11 0xffffffff85349bf3 in drm_ioctl () from /boot/modules/drm.ko
#12 0xffffffff8554451b in nv_drm_ioctl () from /boot/modules/nvidia-drm.ko
#13 0xffffffff80de3756 in linux_file_ioctl_sub (fp=0x20, filp=0xfffff8005db80000, cmd=<optimized out>,
    data=<optimized out>, fop=<optimized out>, td=<optimized out>)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:946
#14 linux_file_ioctl (fp=0x20, cmd=<optimized out>, data=<optimized out>, cred=<optimized out>,
    td=<optimized out>) at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:1570
#15 0xffffffff80bceec6 in fo_ioctl (fp=0xfffff80046bb86e0, com=3222037550, data=0xffffffff853851bb,
    active_cred=0xffffffff85551740 <nv_drm_fops>, td=0xfffff80a0136d000) at /usr/src/sys/sys/file.h:368
#16 kern_ioctl (td=td@entry=0xfffff80a0136d000, fd=81, com=com@entry=3222037550,
    data=0xffffffff853851bb "/usr/ports/graphics/drm-61-kmod/work/drm-kmod-drm_v6.1.69/drivers/gpu/drm/drm_prime.c", data@entry=0xfffffe0245e40d50 "") at /usr/src/sys/kern/sys_generic.c:804
#17 0xffffffff80bcebd3 in sys_ioctl (td=0xfffff80a0136d000, uap=0xfffff80a0136d400)
    at /usr/src/sys/kern/sys_generic.c:712
#18 0xffffffff8105a473 in syscallenter (td=0xfffff80a0136d000)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:186
#19 amd64_syscall (td=0xfffff80a0136d000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1192
#20 <signal handler called>
#21 0x0000000843acfcfa in ?? ()
Backtrace stopped: Cannot access memory at address 0x82032b168
(kgdb)
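
A note on reading a backtrace like this one: kgdb shows a trap frame as `<signal handler called>`, so the first numbered frame after the first such marker is the code that was running when the page fault hit. The sketch below pulls that frame out of a saved backtrace. It assumes the output was saved to a text file (kgdb.txt is the attachment name used above); `faulting_frame` is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper: print the first numbered frame that follows the
# first "<signal handler called>" marker in a saved kgdb backtrace.
faulting_frame() {
    awk '
        # Once the marker has been seen, the next "#N ..." line is the
        # frame that faulted; print it and stop.
        found && /^#[0-9]+ / { print; exit }
        # Remember that the trap marker was seen.
        /^#[0-9]+ +<signal handler called>/ { found = 1 }
    ' "$1"
}

# Example call, assuming the backtrace was saved as kgdb.txt:
# faulting_frame kgdb.txt
```

Run against the backtrace above, this prints frame #8, nv_drm_gem_prime_import () from nvidia-drm.ko. The fault address eva=32 in frame #5 would be consistent with a NULL pointer dereference at a small offset, though nothing in this thread confirms the root cause.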

amshafer commented 7 months ago

I'm able to reproduce this, will take a look.

amshafer commented 7 months ago

I have a fix for this panic, but the external monitor display does not work, so I'm still looking into that. AFAICT this isn't something I have tested yet, so this isn't a regression.

amshafer commented 4 months ago

The fix for this has been submitted to CURRENT: https://reviews.freebsd.org/D44306