NVIDIA / open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

gamescope: vkCreateDevice failed #140

Open gnusenpai opened 2 years ago

gnusenpai commented 2 years ago

NVIDIA Driver Version 515.43.04

GPU: GeForce RTX 3080; GeForce GTX 1070 Ti (bound to vfio-pci, so it shouldn't be in use)

Describe the bug: Vulkan works as you would expect (vkcube and Vulkan games work well), but gamescope does not launch with kernel-open. It appears that something goes wrong when the recently added DRM format modifier extension interacts with kernel-open (maybe it is not implemented?):

vulkan: selecting physical device 'NVIDIA GeForce RTX 3080'
vulkan: physical device supports DRM format modifiers
vulkan: vkCreateDevice failed (VkResult: -3)
Failed to initialize Vulkan
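
For readers decoding the log above: in the Vulkan headers, a `VkResult` of -3 is `VK_ERROR_INITIALIZATION_FAILED`. The following is a minimal sketch, not part of gamescope, that maps the number in such a log line to its name. The table is only a subset of the enum, and `decode_vkresult` is a hypothetical helper name.

```python
import re

# Subset of VkResult values as defined in the Vulkan headers (vulkan_core.h).
# Only the core codes are listed; anything else is reported as unknown.
VK_RESULT_NAMES = {
    0: "VK_SUCCESS",
    -1: "VK_ERROR_OUT_OF_HOST_MEMORY",
    -2: "VK_ERROR_OUT_OF_DEVICE_MEMORY",
    -3: "VK_ERROR_INITIALIZATION_FAILED",
    -4: "VK_ERROR_DEVICE_LOST",
    -5: "VK_ERROR_MEMORY_MAP_FAILED",
    -6: "VK_ERROR_LAYER_NOT_PRESENT",
    -7: "VK_ERROR_EXTENSION_NOT_PRESENT",
    -8: "VK_ERROR_FEATURE_NOT_PRESENT",
    -9: "VK_ERROR_INCOMPATIBLE_DRIVER",
}


def decode_vkresult(log_line):
    """Return (code, name) for a log line containing 'VkResult: N', else None."""
    match = re.search(r"VkResult:\s*(-?\d+)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, VK_RESULT_NAMES.get(code, "unknown (not in this subset)")


# Prints: (-3, 'VK_ERROR_INITIALIZATION_FAILED')
print(decode_vkresult("vulkan: vkCreateDevice failed (VkResult: -3)"))
```

The name alone does not say why device creation failed; it only confirms that the driver rejected the `vkCreateDevice` call during initialization.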

To Reproduce: Launch gamescope in a shell with no arguments using the kernel-open kernel module.

Expected behavior: With the proprietary kernel module, gamescope launches as expected:

vulkan: selecting physical device 'NVIDIA GeForce RTX 3080'
vulkan: physical device supports DRM format modifiers
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x3231564E (VkResult: 0)
vulkan: supported DRM formats for sampling usage:
vulkan:   0x34325241
vulkan:   0x34325258
wlserver: [backend/headless/backend.c:82] Creating headless backend
wlserver: Running compositor on wayland display 'gamescope-0'
wlserver: [backend/headless/backend.c:18] Starting headless backend
wlserver: [xwayland/server.c:92] Starting Xwayland on :1
wlserver: [types/wlr_surface.c:741] New wlr_surface 0x5639f92e2fc0 (res 0x5639f9c9ed60)
wlserver: [xwayland/server.c:250] Xserver is ready
pipewire: stream state changed: connecting
pipewire: stream state changed: paused
pipewire: stream available on node ID: 98
pipewire: renegotiating stream params (size: 1280x720)
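
The hexadecimal DRM format values in this log are FourCC codes: four ASCII characters packed into a 32-bit integer, least significant byte first. A small sketch of how they can be decoded (the helper name `fourcc_to_str` is made up for illustration):

```python
def fourcc_to_str(code):
    """Decode a 32-bit DRM FourCC value into its four-character name."""
    # The first character lives in the lowest byte, so read little-endian.
    return code.to_bytes(4, "little").decode("ascii")


for code in (0x3231564E, 0x34325241, 0x34325258):
    print(f"0x{code:08X} -> {fourcc_to_str(code)}")
# 0x3231564E -> NV12
# 0x34325241 -> AR24
# 0x34325258 -> XR24
```

So the format for which zero modifiers were returned is NV12, and the two formats listed as supported for sampling are AR24 and XR24. These match the names that later gamescope logs in this thread print next to the same values.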

Please reproduce the problem, run nvidia-bug-report.sh, and attach the resulting nvidia-bug-report.log.gz. nvidia-bug-report.log.gz

amrit1711 commented 2 years ago

Thanks for reporting the issue. We will take a look and post an update.

gnusenpai commented 2 years ago

After some additional testing, I've sort of narrowed down the issue. gamescope can work on kernel-open, but only if /usr/bin/gamescope does not have the CAP_SYS_NICE capability set. I also tested with the proprietary module and it works with or without CAP_SYS_NICE, so there is still some discrepancy between the proprietary module and kernel-open here.

I don't have any insight as to why this would happen, but I think this is still relevant to resolve. gamescope will warn the user if the capability isn't set with the following:

No CAP_SYS_NICE, falling back to regular-priority compute and threads.
Performance will be affected.

I'm not super-familiar with how capabilities work, so this could be a gamescope issue.
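
As background on the capability mentioned above: on Linux, a process's effective capabilities appear as a hexadecimal bitmask on the `CapEff:` line of `/proc/<pid>/status`, and `CAP_SYS_NICE` is bit 23 in `linux/capability.h`. The sketch below shows one way to test that bit. It is an illustration only and is not how gamescope itself performs the check; the function names are hypothetical.

```python
CAP_SYS_NICE = 23  # bit index from linux/capability.h


def has_capability(mask_hex, cap_bit=CAP_SYS_NICE):
    """Return True if the given bit is set in a hex capability mask."""
    return bool((int(mask_hex, 16) >> cap_bit) & 1)


def effective_caps_mask(pid="self"):
    """Read the CapEff mask (a hex string) from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff line not found")


if __name__ == "__main__":
    mask = effective_caps_mask()
    print(f"CapEff={mask} CAP_SYS_NICE={has_capability(mask)}")
```

Passing the PID of a running gamescope process to `effective_caps_mask` would show whether the file capability on `/usr/bin/gamescope` actually reached the process.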

somewhatfrog commented 2 years ago

Edit: sorry, I posted on the wrong issue.

amrit1711 commented 2 years ago

We have filed internal bug 3674393 for tracking purposes and will keep this issue updated.

ThisNekoGuy commented 2 years ago

I tested this with driver version 515.65.01, using Minetest as the subject. It now creates a window, but only sometimes; other times it creates a transparent window. Posting this here as an up-to-date example:

Terminal Output:

```
neko-san@ARCH ~> gamescope --nested-width=1920 --nested-height=1080 --fsr-upscaling --fsr-sharpness=0 --fullscreen --hide-cursor-delay -- minetest
No CAP_SYS_NICE, falling back to regular-priority compute and threads.
Performance will be affected.
wlserver: [backend/headless/backend.c:82] Creating headless backend
vulkan: selecting physical device 'NVIDIA GeForce RTX 2080 Ti': queue family 2
vulkan: physical device supports DRM format modifiers
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x3231564E (VkResult: 0)
vulkan: supported DRM formats for sampling usage:
vulkan:   0x34325241
vulkan:   0x34325258
wlserver: Running compositor on wayland display 'gamescope-0'
wlserver: [backend/headless/backend.c:18] Starting headless backend
wlserver: [xwayland/sockets.c:63] Failed to bind socket @/tmp/.X11-unix/X0: Address already in use
wlserver: [xwayland/server.c:92] Starting Xwayland on :1
wlserver: [types/wlr_surface.c:748] New wlr_surface 0x55af3fafdf10 (res 0x55af3faf42f0)
wlserver: [xwayland/server.c:250] Xserver is ready
pipewire: stream state changed: connecting
pipewire: stream state changed: paused
pipewire: stream available on node ID: 53
pipewire: renegotiating stream params (size: 3840x2160)
wlserver: [types/wlr_surface.c:748] New wlr_surface 0x55af3facd770 (res 0x55af3fb0be40)
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 171
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 172
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 173
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 174
xwm: error 141: 141 request 138 minor 32 serial 191
xwm: error 141: 141 request 138 minor 32 serial 192
xwm: error 141: 141 request 138 minor 32 serial 193
xwm: error 141: 141 request 138 minor 32 serial 194
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 195
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 196
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 197
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 198
xwm: error 141: 141 request 138 minor 32 serial 212
xwm: error 141: 141 request 138 minor 32 serial 213
xwm: error 141: 141 request 138 minor 32 serial 214
xwm: error 141: 141 request 138 minor 32 serial 215
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 216
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 217
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 218
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 219
xwm: error 141: 141 request 138 minor 32 serial 228
xwm: error 141: 141 request 138 minor 32 serial 229
xwm: error 141: 141 request 138 minor 32 serial 230
xwm: error 141: 141 request 138 minor 32 serial 231
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 232
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 233
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 234
xwm: error 2: BadValue (integer parameter out of range for operation) request 138 minor 31 serial 235
pipewire: renegotiating stream params (size: 3840x2160)
^Cgamescope: received kill signal, terminating!
xwm: Lost connection to the X11 server 0
2022-08-23 18:14:58: [Main]: INFO: signal_handler(): Ctrl-C pressed, shutting down.xwm : X11 I/O error
X connection to :1 broken (explicit kill or server shutdown).
fish: Job 1, 'gamescope --nested-width=1920 -…' terminated by signal SIGSEGV (Address boundary error)
```

System Info:
OS: Arch Linux x86_64
Kernel: [Linux 5.17.2-256-tkg-pds](https://github.com/frogging-family/linux-tkg)
CPU: AMD Ryzen 7 3700X (16) @ 4.05GHz
GPU: RTX 2080 Ti
GPU Driver: 515.65.01
Desktop Environment: Plasma 5.25.4

cronyakatsuki commented 1 year ago

> After some additional testing, I've sort of narrowed down the issue. gamescope can work on kernel-open, but only if /usr/bin/gamescope does not have the CAP_SYS_NICE capability set. I also tested with the proprietary module and it works with or without CAP_SYS_NICE, so there is still some discrepancy between the proprietary module and kernel-open here.
>
> I don't have any insight as to why this would happen, but I think this is still relevant to resolve. gamescope will warn the user if the capability isn't set with the following:
>
> No CAP_SYS_NICE, falling back to regular-priority compute and threads.
> Performance will be affected.
>
> I'm not super-familiar with how capabilities work, so this could be a gamescope issue.

I can confirm that this is the cause: before adding the capability I was able to run gamescope, but after adding it I could no longer run it with the nvidia-open drivers, version 525.60.11, on Arch Linux with a GTX 1650 Mobile.

unhappy-ending commented 10 months ago

I'd like to confirm I have this same issue on a Gentoo system. Proprietary works fine, gpu-open fails. AFAIK I compiled gamescope without filecaps and still had the issue, but switching to proprietary fixed it.

Username404-59 commented 10 months ago

> I'd like to confirm I have this same issue on a Gentoo system. Proprietary works fine, gpu-open fails. AFAIK I compiled gamescope without filecaps and still had the issue, but switching to proprietary fixed it.

As previously stated, you can make it work on nvidia's open kernel modules by removing cap_sys_nice: sudo setcap cap_sys_nice= /usr/bin/gamescope
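
Before or after applying this workaround, `getcap /usr/bin/gamescope` shows which file capabilities are currently set on the binary. Below is a rough sketch of a parser for one line of that output. It assumes the two layouts commonly printed by libcap (`path = caps+flags` in older releases, `path caps=flags` in newer ones) and a path without spaces; `file_caps` is a hypothetical helper name.

```python
import re


def file_caps(getcap_line):
    """Return the set of capability names found in one line of getcap output."""
    line = getcap_line.strip()
    if not line:
        return set()  # getcap prints nothing when no capabilities are set
    if " = " in line:
        # Older layout: "/usr/bin/gamescope = cap_sys_nice+eip"
        _, clause = line.split(" = ", 1)
    else:
        # Newer layout: "/usr/bin/gamescope cap_sys_nice=eip"
        _, _, clause = line.partition(" ")
    names = set()
    for part in clause.split():
        # Each part is "cap_a,cap_b" followed by "+flags" or "=flags".
        caps = re.split(r"[+=]", part, maxsplit=1)[0]
        names.update(c for c in caps.split(",") if c)
    return names


print(file_caps("/usr/bin/gamescope cap_sys_nice=eip"))    # {'cap_sys_nice'}
print(file_caps("/usr/bin/gamescope = cap_sys_nice+eip"))  # {'cap_sys_nice'}
print(file_caps(""))                                       # set()
```

If `cap_sys_nice` is no longer in the returned set after running the `setcap` command above, the capability has been removed from the file.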

ThisNekoGuy commented 10 months ago

> > I'd like to confirm I have this same issue on a Gentoo system. Proprietary works fine, gpu-open fails. AFAIK I compiled gamescope without filecaps and still had the issue, but switching to proprietary fixed it.
>
> As previously stated, you can make it work on nvidia's open kernel modules by removing cap_sys_nice: sudo setcap cap_sys_nice= /usr/bin/gamescope

I actually just tried this and it, in fact, "does" work but it's busted:

user@GENTOO ~> gamescope --nested-width=1920 --nested-height=1080 --hide-cursor-delay --nested-refresh=60 -- dolphin
No CAP_SYS_NICE, falling back to regular-priority compute and threads.
Performance will be affected.
wlserver: [backend/headless/backend.c:68] Creating headless backend
vulkan: selecting physical device 'NVIDIA GeForce RTX 2080 Ti': queue family 2 (general queue family 0)
vulkan: physical device supports DRM format modifiers
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x3231564E (VkResult: 0)
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x38344241 (VkResult: 0)
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x38344258 (VkResult: 0)
vulkan: supported DRM formats for sampling usage:
vulkan:   AR24 (0x34325241)
vulkan:   XR24 (0x34325258)
vulkan:   AB24 (0x34324241)
vulkan:   XB24 (0x34324258)
vulkan:   RG16 (0x36314752)
vulkan:   AB4H (0x48344241)
vulkan:   XB4H (0x48344258)
vulkan:   AB30 (0x30334241)
vulkan:   XB30 (0x30334258)
vulkan:   AR30 (0x30335241)
vulkan:   XR30 (0x30335258)
vulkan: Creating Gamescope nested swapchain with format 44 and colorspace 0
wlserver: Running compositor on wayland display 'gamescope-0'
wlserver: [backend/headless/backend.c:16] Starting headless backend
wlserver: [xwayland/sockets.c:65] Failed to bind socket @/tmp/.X11-unix/X0: Address already in use
wlserver: [util/env.c:9] Loading WLR_NO_HARDWARE_CURSORS option: 1
wlserver: [types/output/output.c:382] WLR_NO_HARDWARE_CURSORS set, forcing software cursors
wlserver: [xwayland/server.c:108] Starting Xwayland on :1
wlserver: [types/wlr_compositor.c:673] New wlr_surface 0x561d8f32ec50 (res 0x561d8fafb780)
wlserver: [xwayland/server.c:273] Xserver is ready
pipewire: stream state changed: connecting
pipewire: stream state changed: paused
pipewire: stream available on node ID: 67
vulkan: Creating Gamescope nested swapchain with format 44 and colorspace 0
pipewire: renegotiating stream params (size: 1280x720)
kf.service.services: The desktop entry file "/usr/share/applications/org.freedesktop.Xwayland.desktop" has Type= "Application" but no Exec line
kf.service.sycoca: Invalid Service :  "/usr/share/applications/org.freedesktop.Xwayland.desktop"
xwm: Unhandled NET_WM_STATE property change: _NET_WM_STATE_ABOVE
xwm: Unhandled NET_WM_STATE property change: _NET_WM_STATE_STAYS_ON_TOP
xwm: Unhandled NET_WM_STATE property change: _NET_WM_STATE_BELOW
xwm: Unhandled NET_WM_STATE property change: _NET_WM_STATE_MAXIMIZED_HORZ
xwm: Unhandled NET_WM_STATE property change: _NET_WM_STATE_MAXIMIZED_VERT
xwm: Unhandled NET_WM_STATE property change: _NET_WM_STATE_MAXIMIZED_HORZ
xwm: Unhandled NET_WM_STATE property change: _NET_WM_STATE_MAXIMIZED_VERT
wlserver: [types/wlr_compositor.c:673] New wlr_surface 0x561d8fad4f70 (res 0x561d8faff270)
xwm: Unhandled initial NET_WM_STATE property: _NET_WM_STATE_MAXIMIZED_HORZ
xwm: Unhandled initial NET_WM_STATE property: _NET_WM_STATE_MAXIMIZED_VERT
The XKEYBOARD keymap compiler (xkbcomp) reports:
> Warning:          Unsupported maximum keycode 708, clipping.
>                   X11 cannot support keycodes above 255.
Errors from xkbcomp are not fatal to the X server
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap
QPixmap::scaled: Pixmap is a null pixmap

(Attached screenshot: Screenshot_20231113_042905)

unhappy-ending commented 9 months ago

> As previously stated, you can make it work on nvidia's open kernel modules by removing cap_sys_nice: sudo setcap cap_sys_nice= /usr/bin/gamescope

Indeed, but the point was that proprietary doesn't have this issue while the open GPU kernel module does. I don't know about you, but I'd expect both to behave the same here.

> I actually just tried this and it, in fact, "does" work but it's busted:

@ThisNekoGuy Same, I still get the No CAP_SYS_NICE, falling back to regular-priority compute and threads. error which doesn't happen on proprietary. So the setcaps workaround doesn't even work correctly, as you stated.

Also, I seem to have no problem using gamescope to run native wayland apps but anything X related gives me a black screen. I also get a segfault when trying ENABLE_VKBASALT=1 which doesn't happen with something like vkcube.

I tried the exact command you did but with a perfectly working dolphin and with different complaints in the terminal:

gamescope --nested-width=1920 --nested-height=1080 --hide-cursor-delay --nested-refresh=60 -- dolphin
No CAP_SYS_NICE, falling back to regular-priority compute and threads.
Performance will be affected.
Your Wayland compositor does NOT support wp_presentation/presentation-time which is required for VK_KHR_present_wait and VK_KHR_present_id.
Please complain to your compositor vendor for support. Falling back to X11 window with less accurate present wait.
wlserver: [backend/headless/backend.c:68] Creating headless backend
vulkan: selecting physical device 'NVIDIA GeForce RTX 3070': queue family 2 (general queue family 0)
vulkan: physical device supports DRM format modifiers
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x3231564E (VkResult: 0)
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x38344241 (VkResult: 0)
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x38344258 (VkResult: 0)
vulkan: supported DRM formats for sampling usage:
vulkan:   AR24 (0x34325241)
vulkan:   XR24 (0x34325258)
vulkan:   AB24 (0x34324241)
vulkan:   XB24 (0x34324258)
vulkan:   RG16 (0x36314752)
vulkan:   AB4H (0x48344241)
vulkan:   XB4H (0x48344258)
vulkan:   AB30 (0x30334241)
vulkan:   XB30 (0x30334258)
vulkan:   AR30 (0x30335241)
vulkan:   XR30 (0x30335258)
vulkan: Creating Gamescope nested swapchain with format 44 and colorspace 0
wlserver: Running compositor on wayland display 'gamescope-0'
wlserver: [backend/headless/backend.c:16] Starting headless backend
wlserver: [xwayland/server.c:108] Starting Xwayland on :1
The XKEYBOARD keymap compiler (xkbcomp) reports:
> Warning:          Could not resolve keysym XF86CameraAccessEnable
> Warning:          Could not resolve keysym XF86CameraAccessDisable
> Warning:          Could not resolve keysym XF86CameraAccessToggle
> Warning:          Could not resolve keysym XF86NextElement
> Warning:          Could not resolve keysym XF86PreviousElement
> Warning:          Could not resolve keysym XF86AutopilotEngageToggle
> Warning:          Could not resolve keysym XF86MarkWaypoint
> Warning:          Could not resolve keysym XF86Sos
> Warning:          Could not resolve keysym XF86NavChart
> Warning:          Could not resolve keysym XF86FishingChart
> Warning:          Could not resolve keysym XF86SingleRangeRadar
> Warning:          Could not resolve keysym XF86DualRangeRadar
> Warning:          Could not resolve keysym XF86RadarOverlay
> Warning:          Could not resolve keysym XF86TraditionalSonar
> Warning:          Could not resolve keysym XF86ClearvuSonar
> Warning:          Could not resolve keysym XF86SidevuSonar
> Warning:          Could not resolve keysym XF86NavInfo
Errors from xkbcomp are not fatal to the X server
wlserver: [types/wlr_compositor.c:673] New wlr_surface 0x5b5655d23370 (res 0x5b5655d2a2c0)
wlserver: [xwayland/server.c:273] Xserver is ready
pipewire: stream state changed: connecting
pipewire: stream state changed: paused
pipewire: stream available on node ID: 49
vulkan: Creating Gamescope nested swapchain with format 44 and colorspace 0
pipewire: renegotiating stream params (size: 1280x720)
gamescope: children shut down!
(EE) failed to read Wayland events: Broken pipe

I'm using Nvidia on KDE Plasma wayland session.

Username404-59 commented 9 months ago

> I still get the No CAP_SYS_NICE, falling back to regular-priority compute and threads. error

This is a warning, not an error.

> Indeed, but the point was proprietary doesn't have this issue while the open gpu kernel module does. I don't know about you but I expect they would both behave the same for this?

Yes, I only provided a workaround that worked for me weeks ago. Now I'm having the same issues as you, e.g. black screens.

unhappy-ending commented 9 months ago

> This is a warning, not an error

Right, and I'm still getting the warning on proprietary, but without needing to run setcap first. I didn't notice the warning was still there on proprietary while testing; I was too focused on other things at the time. OTOH, doesn't the warning indicate something isn't working regardless? I do have filecaps built into gamescope.

> Yes, I only provided a workaround which worked for me weeks ago. Now I'm having the same issues as you e.g black screens

Whew! I'm glad it's not just me. I tried multiple build options across gamescope and wlroots, made sure I had vulkan-layers and so on, and still couldn't get around the black screen problem with Xwayland.

unhappy-ending commented 9 months ago

Just updating: the 535.43.20 Vulkan beta branch fixed the black screen issue for me. However, the open GPU kernel module still has the same issue as before and requires the setcap command to be run before gamescope will launch.