baldurk / sample_layer

A Sample vulkan layer
BSD 2-Clause "Simplified" License
45 stars 10 forks source link

Segmentation fault in SampleLayer_CreateDevice #4

Open hysw opened 2 years ago

hysw commented 2 years ago

I'm creating this issue since I can't find the source for https://renderdoc.org/vulkan-layer-guide.html and make a pull request.

In layer's CreateDevice function

92  VK_LAYER_EXPORT VkResult VKAPI_CALL SampleLayer_CreateDevice(...)
97  {
      ...
118   PFN_vkCreateDevice createFunc = (PFN_vkCreateDevice)gipa(VK_NULL_HANDLE, "vkCreateDevice");
119
120   VkResult ret = createFunc(physicalDevice, pCreateInfo, pAllocator, pDevice);

Here pfnNextGetInstanceProcAddr is called with VK_NULL_HANDLE and "vkCreateDevice" which the return is undefined according to the Vulkan spec. Most implementations happens to return a valid vkCreateDevice pointer. However renderdoc layer returns nullptr in this case and we got a segfault.

A correct implementation should get the instance from physical device and pass that to pfnNextGetInstanceProcAddr.

$ ENABLE_SAMPLE_LAYER=1 ENABLE_VULKAN_RENDERDOC_CAPTURE=1 vkcube

...

Thread 1 "vkcube" received signal SIGSEGV, Segmentation fault.

#0  ?? ()
#1  SampleLayer_CreateDevice (physicalDevice=0x5555555a67e0, 
    pCreateInfo=0x7fffffffc1c0, pAllocator=0x0, pDevice=0x7fffffffc178) at sample_layer.cpp:120
120       VkResult ret = createFunc(physicalDevice, pCreateInfo, pAllocator, pDevice);
...

(gdb) print createFunc
$1 = (PFN_vkCreateDevice) 0x0
charles-lunarg commented 1 year ago

I can confirm that the query of vkCreateDevice with an invalid instance is incorrect vulkan behavior, however due to the loader having a bug which made that work, it is now a known "bug" in the loader.

PR that documents this behavior (and re-introduces it cause I tried to fix it last week but got pushback as it breaks a bunch of users systems) https://github.com/KhronosGroup/Vulkan-Loader/pull/1183

gregschlom commented 1 year ago

I got hit by this as well, on Android (without RenderDoc).

However the issue is more subtle on Android as (PFN_vkCreateDevice)gipa(VK_NULL_HANDLE, "vkCreateDevice"); returns a non-null function pointer, and everything seems to work fine except the layer behaved incorrectly and many of the intercepted functions didn't get called back by the layer.

EDIT: That wasn't actually the real issue. The real issue was that on Android vkGetDeviceProcAddr() will be called by the loader to resolve the address of itself, and in that case I was forwarding the call to the next layer instead of intercepting it.