Closed vincentbaeten closed 1 year ago
I am using an AMD Ryzen 7 PRO 6850U with Radeon Graphics, and vulkaninfo gives me the following output: https://gist.github.com/vincentbaeten/61ae0b4662e6d3ec0a13d18ded0d0bb9
Unfortunately, in the current ngscopeclient version, AMD Ryzen integrated GPUs are not correctly supported, for reasons that are not yet known.
Until this is fixed, the recommendation is to use a discrete graphics card that meets the minimum specifications to run correctly, so not something that is more than 10 years old.
That is indeed unfortunate. Hopefully it will get fixed some day. If it is any help, this is the output with --debug:
OMP_WAIT_POLICY not set to PASSIVE
Re-exec'ing with correct environment
Initializing Vulkan
VK_KHR_get_physical_device_properties2: supported
VK_KHR_xcb_surface: supported
VK_KHR_xlib_surface: supported
VK_EXT_debug_utils: supported
Loader/API support available for Vulkan 1.3
Vulkan 1.2 support available, requesting it
Initializing glfw 3.3.6 X11 GLX EGL OSMesa clock_gettime evdev shared
GLFW required extensions:
VK_KHR_surface
VK_KHR_xcb_surface
Physical devices:
Device 0: AMD Unknown (RADV REMBRANDT)
API version: 0x004030e0 (0.1.3.224)
Driver version: 0x05802005 (0.22.2.5)
Vendor ID: 1002
Device ID: 1681
Device type: Integrated GPU
int64: yes
int16: yes (allowed in SSBOs)
int8: yes (allowed in SSBOs)
Max image dim 2D: 16384
Max storage buf range: 4095 MB
Max mem alloc: 4095 MB
Max compute shared mem: 64 KB
Max compute grp count: 65535 x 65535 x 65535
Max compute invocs: 1024
Max compute grp size: 1024 x 1024 x 1024
Memory types:
Type 0
Heap index: 1
Device local
Type 1
Heap index: 1
Device local
Type 2
Heap index: 0
Host visible
Host coherent
Type 3
Heap index: 1
Device local
Host visible
Host coherent
Type 4
Heap index: 0
Host visible
Host coherent
Host cached
Type 5
Heap index: 1
Device local
Device coherent
Device uncached
Type 6
Heap index: 0
Host visible
Host coherent
Device coherent
Device uncached
Type 7
Heap index: 1
Device local
Host visible
Host coherent
Device coherent
Device uncached
Type 8
Heap index: 0
Host visible
Host coherent
Host cached
Device coherent
Device uncached
Memory heaps:
Heap 0
Size: 5 GB
Heap 1
Size: 10 GB
Device local
Device 1: llvmpipe (LLVM 15.0.7, 256 bits)
API version: 0x004030e0 (0.1.3.224)
Driver version: 0x00000001 (0.0.0.1)
Vendor ID: 10005
Device ID: 0000
Device type: CPU
int64: yes
int16: yes (allowed in SSBOs)
int8: yes (allowed in SSBOs)
Max image dim 2D: 16384
Max storage buf range: 128 MB
Max mem alloc: 4095 MB
Max compute shared mem: 32 KB
Max compute grp count: 65535 x 65535 x 65535
Max compute invocs: 1024
Max compute grp size: 1024 x 1024 x 1024
Memory types:
Type 0
Heap index: 0
Device local
Host visible
Host coherent
Host cached
Memory heaps:
Heap 0
Size: 2 GB
Device local
Selected device 0
Queue families (2 total)
Queue type 0
Queue count: 1
Timestamp valid bits: 64
Graphics
Compute
Transfer
Sparse binding
Queue type 1
Queue count: 4
Timestamp valid bits: 64
Compute
Transfer
Sparse binding
Driver: vk::DriverId::eMesaRadv
Enabling 64-bit integer support
Enabling 16-bit integer support
Enabling 16-bit integer support for SSBOs
Enabling 8-bit integer support
Enabling 8-bit integer support for SSBOs
Device has VK_KHR_shader_non_semantic_info, requesting it
Device has VK_EXT_memory_budget, requesting it
Using type 4 for pinned host memory
Using type 0 for card-local memory
Sorted queues:
Family=0 Index=0 Flags=0000000f
Family=1 Index=0 Flags=0000000e
Family=1 Index=1 Flags=0000000e
Family=1 Index=2 Flags=0000000e
Family=1 Index=3 Flags=0000000e
QueueManager creating family=0 index=0 name=g_vkTransferQueue
vkFFT version: 1.2.29
Detecting CPU features...
* AVX2
* FMA
QueueManager creating family=1 index=0 name=g_mainWindow.render
Using ImGui version 1.89.7 WIP
Vulkan driver is Mesa.
Disabling vkSetDebugUtilsObjectNameEXT on VkSurfaceKHR objects to work around driver bug.
QueueManager creating family=1 index=1 name=FilterGraphExecutor[2].queue
QueueManager creating family=1 index=2 name=FilterGraphExecutor[1].queue
QueueManager creating family=1 index=3 name=FilterGraphExecutor[4].queue
Unable to open recently used instruments file
Unable to open recently used files list (bad file)
radv/amdgpu: The CS has been cancelled because the context is lost.
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
Aborted (core dumped)
Notice that the g_mainWindow.render queue is selecting the wrong queue family without the graphics bits. This is likely the cause of rendering issues. I presume something is wrong with QueueManager::GetQueueWithFlags.
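To make the flag values in the "Sorted queues" part of the log easier to read, here is a minimal sketch that decodes a queue flag mask into names. The bit values are the ones the Vulkan specification defines for VkQueueFlagBits; the constant names and the helper DescribeQueueFlags are hypothetical and are not part of scopehal.

```cpp
#include <cstdint>
#include <string>

// Bit values as defined for VkQueueFlagBits in the Vulkan specification.
// The constant names are local to this sketch.
constexpr std::uint32_t kGraphics      = 0x1;
constexpr std::uint32_t kCompute       = 0x2;
constexpr std::uint32_t kTransfer      = 0x4;
constexpr std::uint32_t kSparseBinding = 0x8;

// Hypothetical helper (not a scopehal function): renders a flag mask as text.
std::string DescribeQueueFlags(std::uint32_t flags)
{
    std::string out;

    // Append one name if its bit is set, separating entries with " | ".
    auto append = [&](std::uint32_t bit, const char* name)
    {
        if(flags & bit)
        {
            if(!out.empty())
                out += " | ";
            out += name;
        }
    };

    append(kGraphics, "Graphics");
    append(kCompute, "Compute");
    append(kTransfer, "Transfer");
    append(kSparseBinding, "SparseBinding");

    if(out.empty())
        return "(none)";
    return out;
}
```

Applied to the log, 0x0000000f (family 0) decodes to Graphics | Compute | Transfer | SparseBinding, while 0x0000000e (family 1, where g_mainWindow.render was created) decodes to Compute | Transfer | SparseBinding, with no graphics bit.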
Indeed, if I replace the following (at https://github.com/glscopeclient/scopehal/blob/master/scopehal/QueueManager.h#L139):
{ return GetQueueWithFlags(vk::QueueFlagBits::eGraphics | vk::QueueFlagBits::eTransfer, name); }
with:
{ return GetQueueWithFlags(vk::QueueFlagBits::eGraphics, name); }
I can run ngscopeclient on a laptop with AMD integrated GPU (3750H + Vega 10) without the GPU driver crashing.
Interesting. The graphics queue should also have the transfer flag set (so you can move data around that you're drawing).
I'm pretty sure the QueueManager code was written by @lainy, so maybe she can weigh in on whether this is a bug there or somewhere else?
//Skip if flags don't match
if(!(m_queues[i].Flags & flags))
continue;
We can see that queues whose flags only partially match the request are not skipped. If we instead require every requested flag to be present, the correct queue is selected:
//Skip if flags don't match
if((m_queues[i].Flags & flags) != flags)
continue;
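The difference between the two checks can be sketched in isolation, using the five queues and flag values from the "Sorted queues" section of the log. This is only an illustration under stated assumptions: QueueInfo, LoggedQueues, MatchesAnyBit, MatchesAllBits and CountCandidates are hypothetical names, and the sketch only counts which queues pass each check. It does not reproduce how QueueManager chooses among the queues that pass.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit values as defined for VkQueueFlagBits in the Vulkan specification.
constexpr std::uint32_t kQueueGraphics = 0x1;
constexpr std::uint32_t kQueueTransfer = 0x4;

// Hypothetical stand-in for one queue entry; not the scopehal type.
struct QueueInfo
{
    std::uint32_t family;
    std::uint32_t index;
    std::uint32_t flags;
};

// The five queues listed under "Sorted queues" in the debug log.
std::vector<QueueInfo> LoggedQueues()
{
    return {
        {0, 0, 0x0f},
        {1, 0, 0x0e},
        {1, 1, 0x0e},
        {1, 2, 0x0e},
        {1, 3, 0x0e},
    };
}

// Original check: a queue passes if it shares at least one requested bit.
bool MatchesAnyBit(std::uint32_t have, std::uint32_t want)
{
    return (have & want) != 0;
}

// Proposed check: a queue passes only if every requested bit is present.
bool MatchesAllBits(std::uint32_t have, std::uint32_t want)
{
    return (have & want) == want;
}

// Counts how many queues are accepted by the given check.
std::size_t CountCandidates(
    const std::vector<QueueInfo>& queues,
    std::uint32_t want,
    bool (*check)(std::uint32_t, std::uint32_t))
{
    std::size_t n = 0;
    for(const auto& q : queues)
    {
        if(check(q.flags, want))
            n++;
    }
    return n;
}
```

For a request of kQueueGraphics | kQueueTransfer, the any-bit check accepts all five queues, because the family 1 queues share the transfer bit. The all-bits check accepts only family 0, the one family that has the graphics bit.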
Yeah that looks like a bug. @vincentbaeten @bvernoux can you check if the above change fixes the problem on your systems?
@azonenberg This does seem to solve it using the latest commit https://github.com/glscopeclient/scopehal-apps/commit/37aa92be2a73f916b1446caca705190fb898df2d and applying the above change. I haven't tested it extensively but I can add my scope and trigger it.
That's good enough for me. @hansemro please send a PR and let's get this bug squished.
PR submitted: https://github.com/glscopeclient/scopehal/pull/790
I can run glscopeclient, but ngscopeclient gives me a black screen and is unresponsive. I've tried it with Vulkan 1.3.250 and 1.3.239, and both give me the same output as given below using
export VK_LOADER_DEBUG=error,warn,info,layer,driver
I have not copied the Vulkan SDK files into my /usr as the manual says, but I have set all the environment variables, which I think is enough for testing? This is a bit out of my depth, so I do not really know where to go from here.