ValveSoftware / gamescope

SteamOS session compositing window manager

Mere existence of alternative AMD Vulkan drivers is causing Gamescope to fail. #1465

Open CNR0706 opened 1 month ago

CNR0706 commented 1 month ago

System Report: https://gist.github.com/CNR0706/ddbeb3289f67d5c46a033a2def05f9c0
System Information: https://gist.github.com/CNR0706/0f1d02bf3f429fb9e2cc4b2e77bc4ebf
Steam Runtime Diagnostics: https://gist.github.com/CNR0706/fb18e3b4ad0596fd8b9efc6b919f1900

When running Gamescope on a system that has AMDGPU-Pro's Vulkan driver or AMDVLK installed, it fails to start because it tries to use these drivers, even when RADV is explicitly forced using $VK_DRIVER_FILES, $VK_ICD_FILENAMES or $AMD_VULKAN_ICD.

Console output:

~
❯ gamescope -- glxgears
console: gamescope version undefined
ATTENTION: default value of option vk_khr_present_wait overridden by environment.
ATTENTION: default value of option vk_khr_present_wait overridden by environment.
vulkan: selecting physical device 'AMD Radeon RX 6700 XT': queue family 1 (general queue family 0)
vulkan: physical device supports DRM format modifiers
vulkan: vkAllocateDescriptorSets failed
SDL_Vulkan_CreateSurface failed: VK_KHR_wayland_surface extension is not enabled in the Vulkan instance.terminate called without an active exception
[1]    26141 IOT instruction  gamescope -- glxgears

~
❯

It's sometimes possible to work around this by specifying --prefer-vk-device, but this seems to depend on how the drivers are packaged. AMDVLK, for example, puts its ICD files into /etc/vulkan/icd.d, and this breaks the workaround entirely for some reason. Moving them manually to /usr/share/vulkan/icd.d makes the workaround work again.

AMDGPU-Pro's Vulkan implementation puts its files into /usr/share/vulkan/icd.d, so the workaround works normally.

Console output with workaround:

~
❯ gamescope --prefer-vk-device 1002:73df -- glxgears
console: gamescope version undefined
ATTENTION: default value of option vk_khr_present_wait overridden by environment.
ATTENTION: default value of option vk_khr_present_wait overridden by environment.
vulkan: selecting physical device 'AMD Radeon RX 6700 XT (RADV NAVI22)': queue family 1 (general queue family 0)
vulkan: physical device supports DRM format modifiers
wlserver: [backend/headless/backend.c:67] Creating headless backend
xdg_backend: Seat name: seat0
vulkan: supported DRM formats for sampling usage:
vulkan:   AR24 (0x34325241)
vulkan:   XR24 (0x34325258)
vulkan:   AB24 (0x34324241)
vulkan:   XB24 (0x34324258)
vulkan:   RG16 (0x36314752)
vulkan:   NV12 (0x3231564E)
vulkan:   AB4H (0x48344241)
vulkan:   XB4H (0x48344258)
vulkan:   AB48 (0x38344241)
vulkan:   XB48 (0x38344258)
vulkan:   AB30 (0x30334241)
vulkan:   XB30 (0x30334258)
vulkan:   AR30 (0x30335241)
vulkan:   XR30 (0x30335258)
wlserver: Running compositor on wayland display 'gamescope-0'
wlserver: [backend/headless/backend.c:17] Starting headless backend
wlserver: Gamescope built without libei, XTEST will not be available!
wlserver: [xwayland/server.c:107] Starting Xwayland on :1
wlserver: [types/wlr_compositor.c:771] New wlr_surface 0x5641aa3f2df0 (res 0x5641aa3f36e0)
wlserver: [xwayland/server.c:272] Xserver is ready
pipewire: stream state changed: connecting
pipewire: stream state changed: paused
pipewire: stream available on node ID: 125
xwm: Embedded, no cursor set. Using left_ptr by default.
vblank: Using timerfd.
josh edid: Patching res 800x1280 -> 1280x720
pipewire: renegotiating stream params (size: 960x527)
wlserver: [types/wlr_compositor.c:771] New wlr_surface 0x5641aa66e770 (res 0x5641aa3f4480)
xwm: got the same buffer committed twice, ignoring.
The XKEYBOARD keymap compiler (xkbcomp) reports:
> Warning:          Unsupported maximum keycode 708, clipping.
>                   X11 cannot support keycodes above 255.
> Warning:          Could not resolve keysym XF86KbdInputAssistPrevgrou
> Warning:          Could not resolve keysym XF86KbdInputAssistNextgrou
Errors from xkbcomp are not fatal to the X server
pipewire: renegotiating stream params (size: 952x482)
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
xdg_backend: Changed refresh to: 164.834hz
xdg_backend: Compositor released us but we were not acquired. Oh no.
827 frames in 5.0 seconds = 165.024 FPS
gamescope: children shut down!
(EE) failed to read Wayland events: Broken pipe

~ took 5s
❯

I've been able to reproduce this issue on both Gentoo ~amd64 and openSUSE Tumbleweed.

krazyguy97 commented 1 month ago

Same issue here. I had installed amdgpu-pro for emulation purposes and this started happening.

STrusov commented 2 weeks ago

Thank you for the workaround. It does not work in my case (perhaps because the Vulkan Validation Layers are installed); however, it pointed me to the source code, so here is a quick hack that selects the device with the highest API version.

diff --git a/src/rendervulkan.cpp b/src/rendervulkan.cpp
index 63d34b3..a0cec50 100644
--- a/src/rendervulkan.cpp
+++ b/src/rendervulkan.cpp
@@ -325,10 +325,15 @@ bool CVulkanDevice::selectPhysDev(VkSurfaceKHR surface)
        bTryComputeOnly = false;
    }

+   uint32_t apiVersion = 0;
    for (auto cphysDev : physDevs)
    {
        VkPhysicalDeviceProperties deviceProperties;
        vk.GetPhysicalDeviceProperties(cphysDev, &deviceProperties);
+       vk_log.infof( "physical device '%s': API %u.%u.%u", deviceProperties.deviceName,
+           VK_VERSION_MAJOR(deviceProperties.apiVersion),
+           VK_VERSION_MINOR(deviceProperties.apiVersion),
+           VK_VERSION_PATCH(deviceProperties.apiVersion));

        if (deviceProperties.apiVersion < VK_API_VERSION_1_2)
            continue;
@@ -350,9 +355,11 @@ bool CVulkanDevice::selectPhysDev(VkSurfaceKHR surface)

        if (generalIndex != ~0u || computeOnlyIndex != ~0u)
        {
-           // Select the device if it's the first one or the preferred one
-           if (!m_physDev ||
-               (g_preferVendorID == deviceProperties.vendorID && g_preferDeviceID == deviceProperties.deviceID))
+           // Search for the highest API version.
+           // However, ignore other devices if the preferred one provided.
+           if (apiVersion < deviceProperties.apiVersion &&
+               (!g_preferDeviceID ||
+               (g_preferVendorID == deviceProperties.vendorID && g_preferDeviceID == deviceProperties.deviceID)))
            {
                // if we have a surface, check that the queue family can actually present on it
                if (surface) {
@@ -380,6 +387,8 @@ bool CVulkanDevice::selectPhysDev(VkSurfaceKHR surface)

                if ( env_to_bool( getenv( "GAMESCOPE_FORCE_GENERAL_QUEUE" ) ) )
                    m_queueFamily = generalIndex;
+
+               apiVersion = deviceProperties.apiVersion;
            }
        }
    }

It seems to work with and without --prefer-vk-device specified. Please note that I do not consider this a solution, because it depends on the Mesa version and will occasionally select the wrong device.
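For readers who want to reason about the patch without building Gamescope, the selection rule can be modelled in isolation. The following is a simplified sketch, not the code from rendervulkan.cpp: the Candidate struct and the selectByHighestApi and makeVersion names are hypothetical, and the queue family and surface checks from the real function are left out. It keeps only the two conditions the patch introduces: skip devices below Vulkan 1.2, and among the remaining ones (restricted to the preferred IDs, if any) take the highest apiVersion.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the VkPhysicalDeviceProperties fields the patch reads.
struct Candidate {
    std::string name;
    uint32_t vendorID;
    uint32_t deviceID;
    uint32_t apiVersion; // packed version, compared numerically
};

// Packs major/minor/patch the same way Vulkan does (variant 0 assumed).
constexpr uint32_t makeVersion(uint32_t major, uint32_t minor, uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

// Returns the index of the selected candidate, or nullopt if none qualifies.
// A preferDeviceID of 0 means "no preference", mirroring the patch's check.
std::optional<std::size_t> selectByHighestApi(const std::vector<Candidate>& devices,
                                              uint32_t preferVendorID,
                                              uint32_t preferDeviceID) {
    const uint32_t minVersion = makeVersion(1, 2, 0);
    uint32_t bestVersion = 0;
    std::optional<std::size_t> best;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        const Candidate& d = devices[i];
        if (d.apiVersion < minVersion)
            continue; // same cut-off as the existing VK_API_VERSION_1_2 check
        const bool matchesPreference =
            preferDeviceID == 0 ||
            (preferVendorID == d.vendorID && preferDeviceID == d.deviceID);
        // Strict '<' means the first of several equal versions is kept.
        if (bestVersion < d.apiVersion && matchesPreference) {
            best = i;
            bestVersion = d.apiVersion;
        }
    }
    return best;
}
```

The sketch makes the caveat above concrete: when two drivers expose the same GPU with identical vendor and device IDs, the only thing that separates them is the reported API version, so the outcome changes whenever one driver's version overtakes the other's.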

Just in case: to apply this on Gentoo, save it as /etc/portage/patches/gui-wm/gamescope/search_highest_api.patch so that emerge gamescope picks it up.