Open therontarigo opened 1 year ago
Ah yes, this is because we need https://reviews.freebsd.org/D37611 which is atm only present in 14.0. I've reached out to see if we can get that merged.
Following through to https://github.com/freebsd/drm-kmod/pull/218/files
I find the given fix is already present in the drm-510-kmod I tested against.
(Interestingly, it is checked against if __FreeBSD_version < 1301507, suggesting the in-tree fix has already been MFC'd to 13-STABLE.)
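For anyone following the version check, here is a minimal sketch of how such a guard behaves. It assumes 1301507 is the first 13-STABLE version carrying the in-tree fix, which is only inferred from the check quoted above. The helper names are hypothetical and are not part of drm-kmod. 13.1-RELEASE reports __FreeBSD_version 1301000, so a release kernel would fall below the threshold.

```c
#include <stdbool.h>

/* On FreeBSD, __FreeBSD_version comes from <sys/param.h>. */
#ifdef __FreeBSD__
#include <sys/param.h>
#endif

/* Elsewhere the macro is undefined; default it so the sketch still compiles. */
#ifndef __FreeBSD_version
#define __FreeBSD_version 0
#endif

/* Assumption: threshold copied from the check quoted in this thread. */
#define FIX_MFC_VERSION 1301507

/* Hypothetical helper: would a kernel of this version need the compat path? */
static bool needs_compat_fallback(int osversion)
{
    return osversion < FIX_MFC_VERSION;
}

/* Hypothetical helper: which path was selected at compile time? */
static bool compiled_with_fallback(void)
{
#if __FreeBSD_version < FIX_MFC_VERSION
    return true;
#else
    return false;
#endif
}
```

Calling needs_compat_fallback(1301000) returns true, while any value at or above 1301507 returns false.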
Ah just realized you're on release. Hm it does look like an issue with that function in drm-kmod then, I'll take a look
dma_map_sgtable is dereferencing a null sgt - the problem occurs before this function.
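To illustrate why the fault lies upstream: a mapping helper that trusts its argument can only crash when handed a null table, so any check has to happen in the caller. Below is a minimal userspace sketch using mock types. Every name in it is a hypothetical stand-in, not the drm-kmod or LinuxKPI API.

```c
#include <errno.h>
#include <stddef.h>

/* Mock scatter/gather table; the real structure has more fields. */
struct mock_sg_table {
    unsigned int nents;
};

/* Stand-in for the mapping step: it assumes sgt is valid and dereferences it. */
static int mock_map_sgtable(struct mock_sg_table *sgt)
{
    return sgt->nents > 0 ? 0 : -EINVAL;
}

/* Hypothetical caller-side guard: reject a null table before mapping. */
static int mock_map_checked(struct mock_sg_table *sgt)
{
    if (sgt == NULL)
        return -EINVAL;
    return mock_map_sgtable(sgt);
}
```

A guard like this would only turn the crash into an error return; it would not explain why the table was never created.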
Working on reproducing this, what version of Xorg are you running and does it contain https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1009? Also, what's your xorg.conf?
That file renaming shouldn't change anything about Xorg functionality. Anyway, it is not present in the FreeBSD port. Xorg's PRIME offload is already working with nvidia-modeset without DRM.
As for xorg.conf, I have nothing that affects PRIME, Nvidia, or DRM. There is no xorg.conf; in xorg.conf.d I have a handwritten intel.conf to specify the Intel device (SNA/TearFree/BusID/DRI=3) and an input.conf for touchpad settings. Since the intel driver may be having some configuration effect on kernel DRM, here it is anyway:
Section "Device"
    Option "AccelMethod" "sna"
    Option "TearFree" "true"
    Identifier "Card0"
    Driver "intel"
    Option "DRI" "3"
    BusID "PCI:0:2:0"
EndSection
I suppose i915kms is a suspect here.
It's not just a file rename, it also enables the file (bits that interact with DRM for PRIME in the X server) for FreeBSD. You'll need it for PRIME X configs (such as those generated by nvidia-xconfig -prime).
Good to know about the X config though, I'll give that a try.
Ah, I see now the line was moved from linux section to BSD section of the build file.
Now I must check whether /usr/local/lib/xorg/modules/drivers/nvidia_drv.so and /usr/local/lib/xorg/modules/extensions/libglxserver_nvidia.so.1 are used at all in the Xorg GLX Nvidia offload I've been using, or whether /usr/local/lib/libGLX_nvidia.so.0 entirely bypasses any nvidia<->xorg interaction. I think the latter, Xorg being unaware of any nvidia modules in this configuration.
Curious that nvidia-drm presence breaks this in any way, when PRIME is not configured at all.
Xorg with intel video is apparently necessary to reproduce the panic:
nvidia, nvidia-modeset, nvidia-drm are loaded
Xorg :1 -config xorg-dummy.conf -configdir /dev/null - no GPU interaction
Xorg :0 - using intel video driver
env DISPLAY=:1 __GLX_VENDOR_LIBRARY_NAME=nvidia glxgears -> works, renders on Nvidia
env DISPLAY=:0 __GLX_VENDOR_LIBRARY_NAME=nvidia glxgears -> system hang, presumed to be the originally reported panic (the drm system in some circumstances hangs the system instead of doing a crashdump; that is an unrelated bug)
xorg-dummy.conf:
Section "ServerFlags"
    Option "AutoAddDevices" "false"
EndSection
Section "Device"
    Identifier "Card0"
    Driver "dummy"
EndSection
Still curious what your Xorg version is. I'm assuming whatever is the latest package? If it isn't too much of a pain I would recommend building with that MR that I linked earlier.
Oh, I forgot. It is 21.1.4. If you insist, I can try the patch, but it shouldn't be necessary to reproduce the panic. If it only happens on my hardware, I'll try to dig into this myself. (To be clear, I'm more interested in solving the panic than in having Xorg+DRM+PRIME working any time soon - let's focus on kernel module stability before worrying about Xorg.)
I can reproduce a hang with vkcube but still working on getting an actual stack trace. Annoyingly my setup for PRIME doesn't seem to work out of the box when I moved back to 13.1-RELEASE from CURRENT. I asked about trying the patch mostly since I know top of tree Xorg works with it since I normally run that.
If you're feeling brave enough to poke around in kgdb to see if you can tell what's going wrong, that would be helpful. Just from looking at your output it looks like for some reason __nv_drm_nvkms_gem_obj_init gets called with a memory section that has no pages backing it (based on the dmesg warning). You could add a call to os_dump_stack (which lives in src/nvidia/nvidia_os.c iirc) to check where __nv_drm_nvkms_gem_obj_init gets called from.
Ah okay, finally got a good repro of this.
Looking into this a little more, I think this might be a non-BSD-specific nvidia-drm bug. I found the following report which shows something similar that I'll look into: https://forums.developer.nvidia.com/t/wayland-nvidia-drm-desktop-freezes-when-playing-video-via-mpv-using-nvdec/215143
There also seem to be some issues with X properly auto-configuring secondary GPUs. Can you include your /var/log/Xorg.0.log by chance?
This looks like it comes from nv_get_phys_pages not being implemented. That'll take a little while, but I am working on it.
Can you please try with the new 525.78.01 branch? It should contain the needed fix. From testing on my end the issue goes away, so I feel reasonably confident you'll see the same.
Kernel: 13.1-RELEASE-p2
Hardware: GTX 960M, Intel HD 530 (SKL GT2)
drm-510-kmod: built from ports 25bd187bcf5e - the port uses GH_TAGNAME drm_v5.10.113_9, which is identical to branch 5.10-lts. (with MAKE_ENV+=DEBUG_FLAGS=-g)
amshafer/nvidia-driver/nvidia built with make DRMKMODDIR=/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.113_9/
kldload src/nvidia/nvidia.ko
kldload src/nvidia-modeset/nvidia-modeset.ko
start Xorg, test env __GLX_VENDOR_LIBRARY_NAME=nvidia glxgears -info -> works, uses Nvidia
quit Xorg, kldload src/nvidia-drm/nvidia-drm.ko
start Xorg, test env __GLX_VENDOR_LIBRARY_NAME=nvidia glxgears -info -> kernel panic
Same panic results when testing Vulkan such as vkcube-xlib
/var/crash/core.txt
relevant excerpt