GPUOpen-Drivers / AMDVLK

AMD Open Source Driver For Vulkan
MIT License

AMDVLK 2022.Q4.2 versus wgpu 0.12.0 and 0.14.0: [...Z ERROR wgpu::backend::direct] Error in Adapter::request_device: not enough memory left #305

Closed jokeyrhyme closed 1 year ago

jokeyrhyme commented 1 year ago

Howdy, I've run into a bit of a strange one, and I'm not sure if it's a bug in AMDVLK or a bug somewhere else in the stack (probably the latter)

I thought I'd record this here just so that others can find it, hopefully

I just updated my Archlinux system and noticed right away that onagre stopped working

However, it does work if I explicitly tell it to use RADV instead of AMDVLK, so I believe this has been broken by the new AMDVLK version (amdvlk 2022.Q4.2-1)

❯ AMD_VULKAN_ICD=RADV onagre
(works)
❯ AMD_VULKAN_ICD=AMDVLK onagre
[2022-11-10T21:49:55Z ERROR wgpu::backend::direct] Error in Adapter::request_device: not enough memory left
[2022-11-10T21:49:55Z ERROR wgpu::backend::direct] Error in Adapter::request_device: not enough memory left
Error: GraphicsAdapterNotFound
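
To compare the two drivers side by side, the same command can be run once per AMD_VULKAN_ICD value. This is only a sketch: try_icds is a made-up helper name, and it relies on nothing beyond the AMD_VULKAN_ICD variable used in the commands above.

```shell
# Run a command once per ICD and report whether it exited successfully.
# try_icds is a hypothetical helper, not part of AMDVLK, Mesa, or wgpu.
try_icds() {
    for icd in RADV AMDVLK; do
        # env passes the variable to the child process only.
        if env AMD_VULKAN_ICD="$icd" "$@" >/dev/null 2>&1; then
            echo "$icd: ok"
        else
            echo "$icd: failed (exit $?)"
        fi
    done
}
```

For example, `try_icds vulkaninfo --summary` or `try_icds onagre`. The command's own output is discarded, so rerun the failing case by hand to see the logs.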

Other details of my system in case they are relevant:

❯ uname -a
Linux myhnegon 6.0.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 03 Nov 2022 18:01:58 +0000 x86_64 GNU/Linux

( cross-posted here: https://github.com/gfx-rs/wgpu/issues/3201 )

jokeyrhyme commented 1 year ago

wezterm just started using wgpu 0.14.0: https://github.com/wez/wezterm/blob/8479be746552db7d927a0dc8f31b4c5f751a2dfe/wezterm-gui/Cargo.toml#L102

has similar unexpected behaviour

Flakebi commented 1 year ago

I tested onagre on a RX 5700 XT (so different generation from your GPU!) and it worked fine with 2022.Q3.5 and the development branches from a few weeks ago (I had both of them already installed). But I got this output, also with radv:

[2022-11-21T10:35:12Z ERROR wgpu_hal::vulkan::instance] VALIDATION [UNASSIGNED-CoreValidation-Shader-InconsistentSpirv (0x6bbb14)]
        Validation Error: [ UNASSIGNED-CoreValidation-Shader-InconsistentSpirv ] Object 0: handle = 0x5559211a8fa0, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bbb14 | SPIR-V module not valid: [VUID-StandaloneSpirv-Flat-06202] OpEntryPoint interfaces variable must not be vertex execution model with an input storage class for Entry Point id 46.
      %layer = OpVariable %_ptr_Input_int Input

[2022-11-21T10:35:12Z ERROR wgpu_hal::vulkan::instance]     objects: (type: DEVICE, hndl: 0x5559211a8fa0, name: ?)

Do I need to do anything special for wezterm to use Vulkan? I tried RUST_LOG=trace WGPU_BACKEND=vulkan …/wezterm-20221119-145034-49b9839f/bin/wezterm but it didn’t use my Vulkan driver as far as I can see. (I tried on wayland and NixOS in case that matters.)

For some background info, out-of-memory is an error that is always allowed to be returned, so the driver uses it as a kind of generic error return code. Usually it doesn’t have anything to do with running out of memory.
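
A similar pattern seems to appear one layer up in the wgpu logs further down this thread: a WARN line reading "Unrecognized device error ERROR_INVALID_EXTERNAL_HANDLE" is immediately followed by "not enough memory left". That suggests the out-of-memory message is a fallback rather than the real cause. A small sketch for pulling the underlying result names out of such a log (underlying_errors is a hypothetical helper; the matched text is taken from those log lines):

```shell
# Read a wgpu log on stdin and print the Vulkan result name from each
# "Unrecognized device error" line. underlying_errors is a made-up name.
underlying_errors() {
    sed -n 's/.*Unrecognized device error \([A-Z_]*\).*/\1/p'
}
```

Possible usage: `RUST_LOG=info onagre 2>&1 | underlying_errors`.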

Edit: Now tested onagre on a RX 6700 XT with the dev driver and with 2022.Q4.2. I don’t have any plugins installed, so the window is empty, but it shows up without errors.

wez commented 1 year ago

Do I need to do anything special for wezterm to use Vulkan?

Launch using:

wezterm --config 'front_end="WebGpu"'

or otherwise put front_end="WebGpu" into the wezterm config file; see:

Flakebi commented 1 year ago

Thanks, I can run it with 2022.Q4.2 on a RX 6700 XT with NixOS/Wayland. So, I wasn’t able to reproduce the bug in my environment :/

wez commented 1 year ago

Can you confirm which driver was used? Press CTRL-SHIFT-L to open the debug overlay and it should have a line that mentions the "OpenGL" driver that was selected; it should show something about WebGPU there

Flakebi commented 1 year ago

Debug Overlay
wezterm version: 20221119-145034-49b9839f
OpenGL version: WebGPU: name=AMD Radeon RX 6700 XT, device_type=DiscreteGpu, backend=Vulkan, driver=AMD open-source driver, driver_info=2022.Q4.2 (LLPC), vendor=4098, device=29663

I also got debug output from the Vulkan driver when I tested that :) Nothing that would hint at a bug though.

The only bug I noticed is that wezterm does not update the window when I type, only when I move the mouse. But I guess that’s a bug in wezterm (it happens with all drivers).

wez commented 1 year ago

The only bug I noticed is that wezterm does not update the window when I type, only when I move the mouse. But I guess that’s a bug in wezterm (it happens with all drivers).

That's very unusual. I have one report of something that sounds like this, but it's for X11:

jokeyrhyme commented 1 year ago

@Flakebi thanks for testing

Is your 6700XT the only GPU on your system? And where are your monitors plugged in?

I'm starting to think that maybe the problem is having 2x AMD GPUs and/or having no monitor plugged in to the GPU that AMDVLK is trying to use :shrug:

jokeyrhyme commented 1 year ago

Okay, I just tried with my monitor plugged into the AMD Ryzen 7950X integrated GPU instead of the Radeon 6800XT

That broke both RADV and AMDVLK onagre

And I continued to get the same error with AMDVLK wezterm, although RADV wezterm still works

My UEFI settings do not allow me to completely disable the integrated GPU (or the discrete GPU, for that matter), so the only remaining test case I can think of is:

jokeyrhyme commented 1 year ago

Okay, I have a little bit more detail (this is still with both GPUs installed and monitor plugged into 6800XT)

AMD_VULKAN_ICD=AMDVLK RUST_LOG=info onagre

```
[2022-11-23T18:04:51Z INFO wgpu_hal::vulkan::instance] Instance version: 0x4030eb
[2022-11-23T18:04:51Z INFO wgpu_hal::vulkan::instance] Enabling device properties2
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] Loading Wayland library to get the current display
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] Loading X11 library to get the current display
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] Using Wayland platform
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] Display vendor "Mesa Project", version (1, 5)
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] EGL surface: +srgb
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] Trying native-render
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] EGL context: +robust access
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] EGL context: +surfaceless
[2022-11-23T18:04:51Z WARN wgpu_hal::gles::egl] Re-initializing Gles context due to Wayland window
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] Display vendor "Mesa Project", version (1, 5)
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] EGL surface: +srgb
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] Trying native-render
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] EGL context: +robust access
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::egl] EGL context: +surfaceless
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::adapter] Vendor: AMD
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::adapter] Renderer: AMD Radeon RX 6800 XT (navi21, LLVM 14.0.6, DRM 3.48, 6.0.9-arch1-1)
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::adapter] Version: OpenGL ES 3.2 Mesa 22.2.3
[2022-11-23T18:04:51Z INFO wgpu_hal::gles::adapter] SL version: OpenGL ES GLSL ES 3.20
[2022-11-23T18:04:51Z INFO wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "Null hardware (RADV NAVI10)", vendor: 4098, device: 29456, device_type: DiscreteGpu, backend: Vulkan }
[2022-11-23T18:04:51Z INFO wgpu_hal::vulkan::adapter] Private capabilities: PrivateCapabilities { flip_y_requires_shift: true, imageless_framebuffers: true, image_view_usage: true, timeline_semaphores: true, texture_d24: false, texture_d24_s8: false, can_present: true, non_coherent_map_mask: 63, robust_buffer_access: true, robust_image_access: true }
[2022-11-23T18:04:51Z WARN wgpu_hal::vulkan] Unrecognized device error ERROR_INVALID_EXTERNAL_HANDLE
[2022-11-23T18:04:51Z ERROR wgpu::backend::direct] Error in Adapter::request_device: not enough memory left
[2022-11-23T18:04:51Z INFO wgpu_hal::vulkan::adapter] Private capabilities: PrivateCapabilities { flip_y_requires_shift: true, imageless_framebuffers: true, image_view_usage: true, timeline_semaphores: true, texture_d24: false, texture_d24_s8: false, can_present: true, non_coherent_map_mask: 63, robust_buffer_access: true, robust_image_access: true }
[2022-11-23T18:04:51Z WARN wgpu_hal::vulkan] Unrecognized device error ERROR_INVALID_EXTERNAL_HANDLE
[2022-11-23T18:04:51Z ERROR wgpu::backend::direct] Error in Adapter::request_device: not enough memory left
[2022-11-23T18:04:51Z INFO wgpu_core::hub] Dropping Global
Error: GraphicsAdapterNotFound
```

AMD_VULKAN_ICD=RADV RUST_LOG=info onagre

```
[2022-11-23T18:05:26Z INFO wgpu_hal::vulkan::instance] Instance version: 0x4030eb
[2022-11-23T18:05:26Z INFO wgpu_hal::vulkan::instance] Enabling device properties2
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] Loading Wayland library to get the current display
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] Loading X11 library to get the current display
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] Using Wayland platform
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] Display vendor "Mesa Project", version (1, 5)
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] EGL surface: +srgb
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] Trying native-render
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] EGL context: +robust access
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] EGL context: +surfaceless
[2022-11-23T18:05:26Z WARN wgpu_hal::gles::egl] Re-initializing Gles context due to Wayland window
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] Display vendor "Mesa Project", version (1, 5)
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] EGL surface: +srgb
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] Trying native-render
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] EGL context: +robust access
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::egl] EGL context: +surfaceless
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::adapter] Vendor: AMD
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::adapter] Renderer: AMD Radeon RX 6800 XT (navi21, LLVM 14.0.6, DRM 3.48, 6.0.9-arch1-1)
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::adapter] Version: OpenGL ES 3.2 Mesa 22.2.3
[2022-11-23T18:05:26Z INFO wgpu_hal::gles::adapter] SL version: OpenGL ES GLSL ES 3.20
[2022-11-23T18:05:26Z INFO wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "AMD Radeon RX 6800 XT (RADV NAVI21)", vendor: 4098, device: 29631, device_type: DiscreteGpu, backend: Vulkan }
[2022-11-23T18:05:26Z INFO wgpu_hal::vulkan::adapter] Private capabilities: PrivateCapabilities { flip_y_requires_shift: true, imageless_framebuffers: true, image_view_usage: true, timeline_semaphores: true, texture_d24: false, texture_d24_s8: false, can_present: true, non_coherent_map_mask: 63, robust_buffer_access: true, robust_image_access: true }
[2022-11-23T18:05:26Z INFO wgpu_core::device] Created buffer Valid((0, 1, Vulkan)) with BufferDescriptor { label: None, size: 64, usage: COPY_DST | UNIFORM, mapped_at_creation: true }
...
(works)
```
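
The "Instance version: 0x4030eb" line in both logs is a packed Vulkan version number. A minimal sketch of unpacking it, assuming the standard Vulkan layout (major in bits 22 to 28, minor in bits 12 to 21, patch in bits 0 to 11); decode_vk_version is a hypothetical helper name:

```shell
# Unpack a Vulkan version integer (decimal or 0x-prefixed hex) into
# major.minor.patch. The variant field in the top three bits is ignored.
decode_vk_version() {
    v=$(( $1 ))
    echo "$(( (v >> 22) & 0x7f )).$(( (v >> 12) & 0x3ff )).$(( v & 0xfff ))"
}
```

`decode_vk_version 0x4030eb` prints 1.3.235, so both runs were talking to the same Vulkan loader version regardless of which driver was selected.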

The RUST_LOG environment variable doesn't yield more information with wezterm, so no further details there

Of note is something that I noticed here, too: https://github.com/wez/wezterm/issues/2756#issuecomment-1321562711

RADV (works):

wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "AMD Radeon RX 6800 XT (RADV NAVI21)", vendor: 4098, device: 29631, device_type: DiscreteGpu, backend: Vulkan }

AMDVLK (error):

wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "Null hardware (RADV NAVI10)", vendor: 4098, device: 29456, device_type: DiscreteGpu, backend: Vulkan }

Are the adapter "name" and "device" differences here significant?
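
wgpu prints "vendor" and "device" as decimal numbers, while PCI IDs are usually written in hex, so converting them makes the two adapters easier to compare. A small sketch (to_pci_id is a made-up helper name):

```shell
# Convert a decimal vendor/device number from a wgpu AdapterInfo line
# into the usual 0x-prefixed, four-digit hex PCI ID form.
to_pci_id() {
    printf '0x%04x\n' "$1"
}
```

With the values above, `to_pci_id 4098` gives 0x1002, `to_pci_id 29631` gives 0x73bf, and `to_pci_id 29456` gives 0x7310. The vendor matches in both cases but the device IDs differ, which is consistent with the "Null hardware" adapter not being the real RX 6800 XT.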

wez commented 1 year ago

FWIW, wezterm uses WEZTERM_LOG to control its logger, rather than RUST_LOG.

jokeyrhyme commented 1 year ago

Okay, this is better, I've found a very reduced test case:

  1. git clone https://github.com/sotrh/learn-wgpu then cd learn-wgpu/code/beginner/tutorial2-surface
  2. AMD_VULKAN_ICD=AMDVLK RUST_LOG=info cargo run --bin tutorial2-surface

Interestingly, we get just a few more log lines that I haven't seen before:

[2022-11-24T08:23:39Z INFO  wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "Null hardware (RADV NAVI10)", vendor: 4098, device: 29456, device_type: DiscreteGpu, driver: "radv", driver_info: "Mesa 22.2.3", backend: Vulkan }
[2022-11-24T08:23:39Z INFO  wgpu_hal::vulkan::instance] GENERAL [Loader Message (0x0)]
        Failed to find vkGetDeviceProcAddr in layer "/usr/lib/amdvlk64.so"
[2022-11-24T08:23:39Z INFO  wgpu_hal::vulkan::instance]         objects: (type: INSTANCE, hndl: 0x55e04f47ca70, name: ?)
[2022-11-24T08:23:39Z INFO  wgpu_hal::vulkan::instance] GENERAL [Loader Message (0x0)]
               Using "Null hardware (RADV NAVI10)" with driver: "/usr/lib/libvulkan_radeon.so"

[2022-11-24T08:23:39Z INFO  wgpu_hal::vulkan::instance]         objects: (type: INSTANCE, hndl: 0x55e04f47ca70, name: ?)
[2022-11-24T08:23:39Z ERROR wgpu_hal::vulkan::instance] GENERAL [../mesa-22.2.3/src/vulkan/runtime/vk_semaphore.c:148 (0x0)]
        Combination of external handle types is unsupported for VkSemaphore creation. (VK_ERROR_INVALID_EXTERNAL_HANDLE)
[2022-11-24T08:23:39Z ERROR wgpu_hal::vulkan::instance]         objects: (type: DEVICE, hndl: 0x55e04f712590, name: ?)
[2022-11-24T08:23:39Z WARN  wgpu_hal::vulkan] Unrecognized device error ERROR_INVALID_EXTERNAL_HANDLE
[2022-11-24T08:23:39Z ERROR wgpu::backend::direct] Error in Adapter::request_device: not enough memory left

Flakebi commented 1 year ago

Someone managed to reproduce this problem (or a similar one?), which manifests when the switchable graphics layer is active but the amdvlk driver is not available. The symptom is a "Null hardware (RADV NAVI10)" adapter, like in your logs.

If this is the same problem, disabling the layer should work around it (DISABLE_LAYER_AMD_SWITCHABLE_GRAPHICS_1=1).
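
As a sketch, the variable can be set for a single command instead of being exported for the whole session. without_switchable_layer is a made-up wrapper name; only the variable name comes from the comment above.

```shell
# Run one command with the AMD switchable graphics layer disabled.
# without_switchable_layer is a hypothetical helper name.
without_switchable_layer() {
    env DISABLE_LAYER_AMD_SWITCHABLE_GRAPHICS_1=1 "$@"
}
```

For example, `without_switchable_layer vulkaninfo --summary`. Whether this actually selects the wanted driver depends on the rest of the setup.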

In case this is not the problem, does vulkaninfo --summary work for you with amdvlk? (Should be in the vulkan-tools package.)

jokeyrhyme commented 1 year ago

@Flakebi

when the switchable graphics layer is active, but the amdvlk driver is not available

But, AMDVLK should be available, I have it installed and it was working with this exact hardware (AMD Ryzen 7950X with integrated AMD GPU + AMD Radeon RX 6800XT) until 2022.Q4.2 (or it's possible there was a kernel update around the same time, I'm currently on kernel 6.0.10-arch2-1)

AMD_VULKAN_ICD=AMDVLK vulkaninfo --summary:

WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 1.  Skipping ICD.
nushell: oops, process 'vulkaninfo' core dumped

AMD_VULKAN_ICD=RADV vulkaninfo --summary

```
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 1. Skipping ICD.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.235

Instance Extensions: count = 20
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6

Instance Layers: count = 1
--------------------------
VK_LAYER_AMD_switchable_graphics_64 AMD switchable graphics layer 1.3.232 version 1

Devices:
========
GPU0:
    apiVersion = 4206816 (1.3.224)
    driverVersion = 92282883 (0x5802003)
    vendorID = 0x1002
    deviceID = 0x73bf
    deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
    deviceName = AMD Radeon RX 6800 XT (RADV NAVI21)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.2.3
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-0300-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU1:
    apiVersion = 4206816 (1.3.224)
    driverVersion = 92282883 (0x5802003)
    vendorID = 0x1002
    deviceID = 0x164e
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = AMD Radeon Graphics (RADV GFX1036)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.2.3
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-6d00-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
```

AMD_VULKAN_ICD=AMDVLK DISABLE_LAYER_AMD_SWITCHABLE_GRAPHICS_1=1 vulkaninfo --summary

```
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 1. Skipping ICD.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.235

Instance Extensions: count = 20
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6

Instance Layers: count = 1
--------------------------
VK_LAYER_AMD_switchable_graphics_64 AMD switchable graphics layer 1.3.232 version 1

Devices:
========
GPU0:
    apiVersion = 4206816 (1.3.224)
    driverVersion = 92282883 (0x5802003)
    vendorID = 0x1002
    deviceID = 0x73bf
    deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
    deviceName = AMD Radeon RX 6800 XT (RADV NAVI21)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.2.3
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-0300-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU1:
    apiVersion = 4206816 (1.3.224)
    driverVersion = 92282883 (0x5802003)
    vendorID = 0x1002
    deviceID = 0x164e
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = AMD Radeon Graphics (RADV GFX1036)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.2.3
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-6d00-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
```

AMD_VULKAN_ICD=RADV DISABLE_LAYER_AMD_SWITCHABLE_GRAPHICS_1=1 vulkaninfo --summary

```
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 1. Skipping ICD.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.235

Instance Extensions: count = 20
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6

Instance Layers: count = 1
--------------------------
VK_LAYER_AMD_switchable_graphics_64 AMD switchable graphics layer 1.3.232 version 1

Devices:
========
GPU0:
    apiVersion = 4206816 (1.3.224)
    driverVersion = 92282883 (0x5802003)
    vendorID = 0x1002
    deviceID = 0x73bf
    deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
    deviceName = AMD Radeon RX 6800 XT (RADV NAVI21)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.2.3
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-0300-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU1:
    apiVersion = 4206816 (1.3.224)
    driverVersion = 92282883 (0x5802003)
    vendorID = 0x1002
    deviceID = 0x164e
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = AMD Radeon Graphics (RADV GFX1036)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.2.3
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-6d00-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
```

Oh, and trying this out with the original test:

I'm still seeing "driverName = radv" even when run with AMDVLK, is that normal? Do we think my system has an issue with AMDVLK, or the switching layer, or a combination of both?
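
To check quickly which driver each GPU ends up on, the summary can be reduced to one line per device. This is a rough sketch that assumes vulkaninfo prints one "key = value" pair per line with deviceName before driverName; list_drivers is a hypothetical helper name.

```shell
# Read `vulkaninfo --summary` output on stdin and print
# "<deviceName> -> <driverName>" for every GPU listed.
list_drivers() {
    awk -F' = ' '
        /deviceName/ { name = $2 }
        /driverName/ { print name " -> " $2 }
    '
}
```

Possible usage: `AMD_VULKAN_ICD=AMDVLK vulkaninfo --summary | list_drivers`. If every line ends in "radv", the AMDVLK ICD was not the one that created the instance.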

jokeyrhyme commented 1 year ago

So, I switched to an older kernel

❯ uname -a
Linux myhnegon 5.15.80-1-lts #1 SMP Sat, 26 Nov 2022 20:23:30 +0000 x86_64 GNU/Linux

AMD_VULKAN_ICD=AMDVLK vulkaninfo --summary

```
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.235

Instance Extensions: count = 20
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6

Instance Layers: count = 1
--------------------------
VK_LAYER_AMD_switchable_graphics_64 AMD switchable graphics layer 1.3.232 version 1

Devices:
========
GPU0:
    apiVersion = 4206824 (1.3.232)
    driverVersion = 8388855 (0x8000f7)
    vendorID = 0x1002
    deviceID = 0x73bf
    deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
    deviceName = AMD Radeon RX 6800 XT
    driverID = DRIVER_ID_AMD_OPEN_SOURCE
    driverName = AMD open-source driver
    driverInfo = 2022.Q4.2 (LLPC)
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-0300-0000-0000-000000000000
    driverUUID = 414d442d-4c49-4e55-582d-445256000000
```

Notice that this doesn't detect the integrated graphics at all, and we see "driverName = AMD open-source driver" (instead of "radv")

jokeyrhyme commented 1 year ago

Okay, updated to AMDVLK 2022.Q4.4: https://github.com/GPUOpen-Drivers/AMDVLK/releases/tag/v-2022.Q4.4

AMD_VULKAN_ICD=RADV vulkaninfo --summary (works)

```
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 0. Skipping ICD.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.235

Instance Extensions: count = 20
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6

Instance Layers: count = 1
--------------------------
VK_LAYER_AMD_switchable_graphics_64 AMD switchable graphics layer 1.3.235 version 1

Devices:
========
GPU0:
    apiVersion = 4206822 (1.3.230)
    driverVersion = 92286977 (0x5803001)
    vendorID = 0x1002
    deviceID = 0x73bf
    deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
    deviceName = AMD Radeon RX 6800 XT (RADV NAVI21)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.3.1
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-0300-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU1:
    apiVersion = 4206822 (1.3.230)
    driverVersion = 92286977 (0x5803001)
    vendorID = 0x1002
    deviceID = 0x164e
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = AMD Radeon Graphics (RADV GFX1036)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.3.1
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-6d00-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
```

AMD_VULKAN_ICD=AMDVLK vulkaninfo --summary (coredump)

```
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 0. Skipping ICD.
help: Child process 'vulkaninfo' core dumped
```

AMD_VULKAN_ICD=AMDVLK DISABLE_LAYER_AMD_SWITCHABLE_GRAPHICS_1=1 vulkaninfo --summary (works)

```
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 0. Skipping ICD.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.235

Instance Extensions: count = 20
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6

Instance Layers: count = 1
--------------------------
VK_LAYER_AMD_switchable_graphics_64 AMD switchable graphics layer 1.3.235 version 1

Devices:
========
GPU0:
    apiVersion = 4206822 (1.3.230)
    driverVersion = 92286977 (0x5803001)
    vendorID = 0x1002
    deviceID = 0x73bf
    deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
    deviceName = AMD Radeon RX 6800 XT (RADV NAVI21)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.3.1
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-0300-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
GPU1:
    apiVersion = 4206822 (1.3.230)
    driverVersion = 92286977 (0x5803001)
    vendorID = 0x1002
    deviceID = 0x164e
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = AMD Radeon Graphics (RADV GFX1036)
    driverID = DRIVER_ID_MESA_RADV
    driverName = radv
    driverInfo = Mesa 22.3.1
    conformanceVersion = 1.3.0.0
    deviceUUID = 00000000-6d00-0000-0000-000000000000
    driverUUID = 414d442d-4d45-5341-2d44-525600000000
```

And separately, I updated my motherboard firmware to the latest version which provides a new option to completely disable the integrated GPU: when the AMD iGPU is disabled everything works perfectly

@Flakebi do you know if there is a better description we could rename this issue to in order to ensure it is properly triaged? Or if there's a separate place this issue needs to be reported?

Elinvention commented 1 year ago

Since I have an AMD Ryzen 5 7600X CPU, I had this problem too. I disabled the integrated GPU and now I can use AMDVLK with my dedicated GPU. Just having a gfx1036 makes the whole driver fail, even if there is a perfectly fine dedicated GPU that can be used. I tested with AMDVLK Q4.4.

jokeyrhyme commented 1 year ago

@Elinvention apparently the 2023 Q1 releases fix this: https://github.com/GPUOpen-Drivers/AMDVLK/releases/tag/v-2023.Q1.1

Add REMBRANDT, Raphael and Mendocino support

However, this hasn't been packaged for Archlinux yet, for some reason, which is still stuck on 2022 Q4

Elinvention commented 1 year ago

Whoops, I misinterpreted the version string. Indeed I was using 2022.Q4.4, but the latest version is 2023.Q1.3. Luckily nixpkgs merged 2023.Q1.3 yesterday (I use NixOS btw :laughing:).

jokeyrhyme commented 1 year ago

I can confirm that this issue is fixed for me after upgrading to 2023.Q2.1 https://github.com/GPUOpen-Drivers/AMDVLK/releases/tag/v-2023.Q2.1

Although, this was likely fixed in 2023.Q1.1 (version was not released for my distribution, so not tested by me, personally)

Given that this was sort of a duplicate, it's probably best to continue any further discussion over in #310