KhronosGroup / Vulkan-Loader

Vulkan Loader
https://vulkan.lunarg.com/doc/sdk/latest/linux/LoaderInterfaceArchitecture.html

SIGSEGV in vkEnumeratePhysicalDevices #1134

Closed bl4ckb0ne closed 1 year ago

bl4ckb0ne commented 1 year ago

Describe the bug

I'm encountering a SIGSEGV when calling vkEnumeratePhysicalDevices in renderdoc 1.25.

Environment (please complete the following information):

To Reproduce

Steps to reproduce the behavior:

  1. Compile or get Renderdoc (reproduced on both 1.22 and 1.25)
  2. Launch qrenderdoc

VK_LOADER_DEBUG output

vk_loader_debug.log

Stacktrace

(gdb) bt
#0  0x00007fffe3c7da1c in linux_read_sorted_physical_devices (inst=inst@entry=0x7fffe1eb4560, icd_count=icd_count@entry=4, icd_devices=icd_devices@entry=0x7fffe3feaf40, 
    phys_dev_count=phys_dev_count@entry=4, sorted_device_term=sorted_device_term@entry=0x7fffe0c8e970)
    at /home/simon/src/aports/main/vulkan-loader/src/Vulkan-Loader-1.3.240/loader/loader_linux.c:282
#1  0x00007fffe3c71202 in setup_loader_term_phys_devs (inst=inst@entry=0x7fffe1eb4560)
    at /home/simon/src/aports/main/vulkan-loader/src/Vulkan-Loader-1.3.240/loader/loader.c:6206
#2  0x00007fffe3c71358 in terminator_EnumeratePhysicalDevices (instance=0x7fffe1eb4560, pPhysicalDeviceCount=0x7fffe3feb100, pPhysicalDevices=0x0)
    at /home/simon/src/aports/main/vulkan-loader/src/Vulkan-Loader-1.3.240/loader/loader.c:6362
#3  0x00007fffe3c752b8 in vkEnumeratePhysicalDevices (instance=<optimized out>, pPhysicalDeviceCount=0x7fffe3feb100, pPhysicalDevices=0x0)
    at /home/simon/src/aports/main/vulkan-loader/src/Vulkan-Loader-1.3.240/loader/trampoline.c:731
#4  0x00007ffff32f50ee in WrappedVulkan::Initialise (this=0x7fffe3c2f430, params=..., sectionVersion=21, opts=...)
    at /tmp/renderdoc-1.25/renderdoc/driver/vulkan/wrappers/vk_device_funcs.cpp:518
#5  0x00007ffff2e4914a in Vulkan_CreateReplayDevice (rdc=0x0, opts=..., driver=0x7fffe3feb6c0) at /tmp/renderdoc-1.25/renderdoc/driver/vulkan/vk_replay.cpp:4692
#6  0x00007ffff37d68bb in operator() (__closure=0x7fffe3feb930) at /tmp/renderdoc-1.25/renderdoc/core/core.cpp:644
#7  0x00007ffff37e0e7e in std::__invoke_impl<void, RenderDoc::InitialiseReplay(GlobalEnvironment, const rdcarray<rdcstr>&)::<lambda()>&>(std::__invoke_other, struct {...} &) (__f=...) at /usr/include/c++/12.2.1/bits/invoke.h:61
#8  0x00007ffff37e0843 in std::__invoke_r<void, RenderDoc::InitialiseReplay(GlobalEnvironment, const rdcarray<rdcstr>&)::<lambda()>&>(struct {...} &) (__fn=...)
    at /usr/include/c++/12.2.1/bits/invoke.h:154
#9  0x00007ffff37e0347 in std::_Function_handler<void(), RenderDoc::InitialiseReplay(GlobalEnvironment, const rdcarray<rdcstr>&)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/12.2.1/bits/std_function.h:290
#10 0x00007ffff2cb2a8c in std::function<void ()>::operator()() const (this=0x7fffe3feb930) at /usr/include/c++/12.2.1/bits/std_function.h:591
#11 0x00007ffff3e0b3ef in Threading::sThreadInit (init=0x7fffe45b1da0) at /tmp/renderdoc-1.25/renderdoc/os/posix/posix_threading.cpp:173
#12 0x00007ffff7fba08b in start (p=0x7fffe3feb9a8) at src/thread/pthread_create.c:203
#13 0x00007ffff7fbc38e in __clone () at src/thread/x86_64/clone.s:22

charles-lunarg commented 1 year ago

This is crashing on a call to one of the drivers' vkEnumerateDeviceExtensionProperties. I strongly suspect the Intel driver, but I have nothing to confirm it.

Running renderdoc locally does not produce a crash of this sort for me.

Can you manually select each driver one by one to see if only one of them is provoking a crash? You can use the VK_LOADER_DRIVERS_SELECT environment variable to do so.

For instance, based on your log file these would be the 4 values to set the env-var to:

VK_LOADER_DRIVERS_SELECT=lvp_icd.x86_64.json
VK_LOADER_DRIVERS_SELECT=intel_hasvk_icd.x86_64.json
VK_LOADER_DRIVERS_SELECT=intel_icd.x86_64.json
VK_LOADER_DRIVERS_SELECT=radeon_icd.x86_64.json
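
The one-driver-at-a-time check above can be scripted. The following is a minimal sketch, not part of the loader: the helper name `probe` and the idea of passing the test command on the command line are assumptions made for illustration. It runs a command of your choice once per manifest with VK_LOADER_DRIVERS_SELECT set, and reports whether the process exited normally or was killed by a signal such as SIGSEGV.

```python
import os
import signal
import subprocess
import sys

# Manifest names taken from the comment above; adjust them to match your own loader log.
DRIVERS = [
    "lvp_icd.x86_64.json",
    "intel_hasvk_icd.x86_64.json",
    "intel_icd.x86_64.json",
    "radeon_icd.x86_64.json",
]


def probe(cmd, driver, timeout=60):
    """Run cmd with only `driver` selected and describe how the process ended."""
    env = dict(os.environ, VK_LOADER_DRIVERS_SELECT=driver)
    try:
        proc = subprocess.run(cmd, env=env, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout)
    except FileNotFoundError:
        return "command not found"
    except subprocess.TimeoutExpired:
        return "timed out"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exit code %d" % proc.returncode


if __name__ == "__main__":
    cmd = sys.argv[1:]
    if cmd:
        for driver in DRIVERS:
            print("%-30s %s" % (driver, probe(cmd, driver)))
    else:
        print("usage: pass the command to test as arguments")
```

Any program that reaches vkEnumeratePhysicalDevices can serve as the command. A manifest whose row reads "killed by SIGSEGV" points at the driver to investigate.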

Another possible source of error is the implicit layers on your system: there are a few, yet the stack trace shows none of them intercepting the vkEnumeratePhysicalDevices call. VkLayer_MESA_device_select is a possible candidate, so I would run a similar check with VK_LOADER_LAYERS_DISABLE=VkLayer_MESA_device_select to rule it out.
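
The layer check can be scripted in the same spirit. This is a rough sketch under assumptions: the function name `compare` is hypothetical, and the command to run is whatever reproduces the crash on your machine. It runs the command once with VK_LOADER_LAYERS_DISABLE unset and once with the layer disabled, then returns both exit statuses so they can be compared.

```python
import os
import subprocess
import sys


def compare(cmd, layer="VkLayer_MESA_device_select", timeout=60):
    """Run cmd with and without `layer` disabled; return both exit statuses."""
    # Baseline: make sure the variable is not inherited from the calling shell.
    base_env = {k: v for k, v in os.environ.items()
                if k != "VK_LOADER_LAYERS_DISABLE"}
    filtered_env = dict(base_env, VK_LOADER_LAYERS_DISABLE=layer)
    return {
        "baseline": subprocess.run(cmd, env=base_env, timeout=timeout).returncode,
        "layer disabled": subprocess.run(cmd, env=filtered_env, timeout=timeout).returncode,
    }


if __name__ == "__main__":
    cmd = sys.argv[1:]
    if cmd:
        for label, status in compare(cmd).items():
            # On POSIX, a negative status means the process was killed by that signal number.
            print("%-15s %d" % (label, status))
    else:
        print("usage: pass the command to test as arguments")
```

If the baseline run is killed by a signal while the run with the layer disabled exits cleanly, the layer is implicated. If both runs behave the same, the layer can be ruled out.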

bl4ckb0ne commented 1 year ago

It seems to work with VK_LOADER_DRIVERS_SELECT=radeon_icd.x86_64.json and VK_LOADER_DRIVERS_SELECT=lvp_icd.x86_64.json. I'll report the issue to Mesa.