NVIDIA / open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Nvidia failed on vkcube-wayland #317

Open Decodetalkers opened 2 years ago

Decodetalkers commented 2 years ago

NVIDIA Open GPU Kernel Modules Version

515.57-3

Does this happen with the proprietary driver (of the same version) as well?

Yes

Operating System and Version

Archlinux

Kernel Release

Linux 5.18.9-arch1-1

Hardware: GPU

GPU 0: NVIDIA GeForce MX450 (UUID: GPU-4bf8bfd8-7a13-3a35-4e11-3941a729e559)

Describe the bug

I use sway on a machine with two GPUs: an NVIDIA GeForce MX450 and an Intel Tiger Lake iGPU. Vulkan Wayland applications always dump core when run on the NVIDIA GPU.

When I run vkcube-wayland on the NVIDIA GPU, it crashes with a core dump:

MESA-INTEL: warning: Performance support disabled, consider sysctl dev.i915.perf_stream_paranoid=0
Selected GPU 1: NVIDIA GeForce MX450, type: DiscreteGpu
zsh: segmentation fault (core dumped)  vkcube-wayland

To Reproduce

Run vkcube-wayland on the NVIDIA GPU.

A dual-GPU setup may be required.

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.log.gz

More Info

No response

niv commented 2 years ago

Hello,

thanks for the report. Do you know if this was introduced with 515.57, or was this happening on earlier drivers too?

Decodetalkers commented 2 years ago

> Hello,
>
> thanks for the report. Do you know if this was introduced with 515.57, or was this happening on earlier drivers too?

On my machine, it has never worked.

niv commented 2 years ago

Thanks. Tracking internally in bug 3707172.

CoelacanthusHex commented 2 years ago

backtrace

Thread 1 (Thread 0x7ffff7c1c240 (LWP 80072) "vkcube-wayland"):
#0  0x00007ffff5be8143 in ?? () from /usr/lib/libnvidia-glcore.so.515.65.01
#1  0x000055555555c90a in ?? ()
#2  0x0000555555559ab9 in ?? ()
#3  0x00007ffff7c4c2d0 in __libc_start_call_main (main=main@entry=0x555555558020, argc=argc@entry=3, argv=argv@entry=0x7fffffff88e8) at ../sysdeps/nptl/libc_start_call_main.h:58
#4  0x00007ffff7c4c38a in __libc_start_main_impl (main=0x555555558020, argc=3, argv=0x7fffffff88e8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffff88d8) at ../csu/libc-start.c:381
#5  0x000055555555b985 in ?? ()

kanashimia commented 2 years ago

This segfault occurs because vkCreateSwapchainKHR returns VK_ERROR_INITIALIZATION_FAILED, so this is a duplicate of https://github.com/NVIDIA/open-gpu-kernel-modules/issues/354. When you build vkcube-wayland in debug mode, you can see it:

›./cube/vkcube-wayland 
Selected GPU 0: NVIDIA GeForce GTX 1050 Ti, type: DiscreteGpu
vkcube-wayland: /home/kanashimia/vulkan-tools/cube/cube.c:1373: demo_prepare_buffers: Assertion `!err' failed.
Aborted (core dumped)

lines 1372-1373:

err = demo->fpCreateSwapchainKHR(demo->device, &swapchain_ci, NULL, &demo->swapchain);
assert(!err);

The same happens with all Vulkan applications that I've tried, but only when the surface is a Wayland surface. It is the same on both Fedora and NixOS, and on both Mutter (GNOME) and sway. I tested multiple NVIDIA module versions, so this issue has existed since the beginning of Wayland support.

Reportedly this only happens when you have dual GPUs. I asked a person who only has an NVIDIA GPU, and he doesn't seem to have this problem. When you force VK_ICD_FILENAMES to contain only Intel's ICD, it works, obviously.
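
A minimal sketch of the VK_ICD_FILENAMES workaround described above. The manifest path is an assumption (it is where Intel's ICD usually lives on Arch Linux; look in /usr/share/vulkan/icd.d/ on your system), and `run_on_intel` is a hypothetical helper name, not something shipped by any driver package.

```shell
# Assumed location of Intel's Vulkan ICD manifest; override INTEL_ICD if yours differs.
INTEL_ICD="${INTEL_ICD:-/usr/share/vulkan/icd.d/intel_icd.x86_64.json}"

# Hypothetical helper: run one command with the Vulkan loader restricted
# to the Intel driver, without changing the rest of the session.
run_on_intel() {
    VK_ICD_FILENAMES="$INTEL_ICD" "$@"
}

# Example: run_on_intel vkcube-wayland
```

This only sidesteps the crash by keeping the NVIDIA ICD out of the loader's view; it does not make the NVIDIA GPU usable for Wayland surfaces.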

Also, interestingly, when you run with __NV_PRIME_RENDER_OFFLOAD=1 it doesn't crash, but it still doesn't work and shows this:

›__NV_PRIME_RENDER_OFFLOAD=1 ./cube/vkcube-wayland
Selected GPU 0: NVIDIA GeForce GTX 1050 Ti, type: DiscreteGpu
[destroyed object]: error 7: importing the supplied dmabufs failed

It doesn't exit; it just hangs like this.

rakhenmanoa commented 2 years ago

Hello. I think nvidia drivers don't support EGL offloading yet.

CoelacanthusHex commented 1 year ago

It still happened in 525.60.11

PedroHLC commented 1 year ago

> It still happened in 525.60.11

The situation described in a 9-month-old Reddit post hasn't changed so far:

> Nvidia's EGL implementation doesn't implement prime render offload and the Vulkan layer doesn't work with Wayland. So there's no way for Wayland native apps to use prime render offload. Should work with Xwayland just fine. https://www.reddit.com/r/archlinux/comments/svoi6l/comment/hxhtdjq/

Last time I checked the code, the open driver even supported fewer surfaces/modifiers than the proprietary one.
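
For the Xwayland route mentioned in the quoted Reddit post, a rough sketch could look like the following. It assumes the two environment variables from NVIDIA's PRIME render offload documentation and assumes that plain `vkcube` is the X11 build of the demo on your distribution; `run_offloaded` is a hypothetical helper name.

```shell
# Hypothetical helper: run one command with NVIDIA PRIME render offload requested.
run_offloaded() {
    # __NV_PRIME_RENDER_OFFLOAD=1 requests rendering on the NVIDIA GPU;
    # __VK_LAYER_NV_optimus=NVIDIA_only asks NVIDIA's Vulkan layer to report only NVIDIA GPUs.
    __NV_PRIME_RENDER_OFFLOAD=1 __VK_LAYER_NV_optimus=NVIDIA_only "$@"
}

# Example (X11 client, so it goes through Xwayland in a Wayland session):
# run_offloaded vkcube
```

Whether this works depends on the driver version and compositor; it is not a fix for the native Wayland path discussed in this issue.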

Decodetalkers commented 1 year ago

> Thanks. Tracking internally in bug 3707172.

Any update on this? I think many people are affected by this bug.

vasishath commented 1 year ago

@niv any updates on this? I think this is the last remaining major bug preventing optimus usage on wayland.

Tim-Paik commented 1 year ago

any updates?

rakhenmanoa commented 1 year ago

Still nothing

On Fri, 3 Mar 2023 at 17:47, Tim_Paik @.***> wrote:

> any updates?

jp7677 commented 1 year ago

Maybe the discussion at https://github.com/negativo17/nvidia-driver/issues/131 is relevant, though I'm not sure, since it is about the proprietary driver.

vasishath commented 1 year ago

> Maybe the discussion at negativo17/nvidia-driver#131 is relevant, though I'm not sure, since it is about the proprietary driver.

I don't think so, since this issue only occurs with a PRIME setup.

marious1985 commented 1 year ago

I have the same problem on an HP Pavilion 15-ec2014ns laptop with an AMD iGPU and an NVIDIA GeForce GTX 1650 dGPU, using the proprietary drivers and PRIME. Everything works on Xorg, but not on Wayland.

mario@laptop:~$ vkcube-wayland --gpu_number 1
Selected GPU 1: NVIDIA GeForce GTX 1650, type: DiscreteGpu
vkcube-wayland: ./cube/cube.c:1403: demo_prepare_buffers: Assertion `!err' failed.
Aborted (core dumped)

thecoder08 commented 1 year ago

Same issue here. It does not work with NVIDIA driver 525.105.17.

zeroxoneafour commented 1 year ago

I have tested this on a laptop with a MUX switch and hybrid graphics. Vulkan works with only the Intel GPU enabled and with only the NVIDIA GPU enabled, but not in Optimus (hybrid) mode.