charles-lunarg / vk-bootstrap

Vulkan Bootstrapping library
MIT License

Returning invalid VkPhysicalDevice handle #270

Closed chopndolphy closed 7 months ago

chopndolphy commented 7 months ago

Background info: I'm in the process of moving over a Vulkan project (stemming from vkguide-2) from Windows in Visual Studio to Linux with Make. Currently, I'm building vk-bootstrap with CMake and linking it with -L ./lib/vk-bootstrap/build -lvk-bootstrap. I am also linking -ldl and have tried turning on BUILD_SHARED_LIBS.

vk-bootstrap seems to be picking my GPU correctly, as I can see the correct name and info upon inspection with gdb, but I'm pretty sure it's returning an invalid VkPhysicalDevice handle. The program segfaults when I manually call vkGetPhysicalDeviceProperties or when VMA does the same.

vkcube works and so does the vk-bootstrap example, so I'm pretty sure this has something to do with how I'm building my project. Or maybe I need to be using the dispatch table, like the example?

Thanks!

charles-lunarg commented 7 months ago

Can you post your setup code? It is really strange that you are getting invalid VkPhysicalDevices unless the way you 'get' the handles is wrong. They are just uint64_t values, even if they are typed like pointers.

chopndolphy commented 7 months ago

void VulkanEngine::init_vulkan()
{
    vkb::InstanceBuilder builder;

    auto inst_ret = builder
        .set_app_name("Vulkan Application")
        .request_validation_layers(bUseValidationLayers)
        .use_default_debug_messenger()
        .require_api_version(1, 3, 0)
        .build();
    if (!inst_ret) {
        std::cout << inst_ret.error().message() << "\n";
    }
    vkb::Instance vkb_inst = inst_ret.value();

    _instance = vkb_inst.instance;
    _debug_messenger = vkb_inst.debug_messenger;

    if (!SDL_Vulkan_CreateSurface(_window, _instance, &_surface)) {
        std::cout << "failed to create SDL_Vulkan surface" << std::endl;
    }

    VkPhysicalDeviceVulkan13Features features13{};
    features13.dynamicRendering = true;
    features13.synchronization2 = true;

    VkPhysicalDeviceVulkan12Features features12{};
    features12.bufferDeviceAddress = true;
    features12.descriptorIndexing = true;

    VkPhysicalDeviceFeatures features10{};
    features10.samplerAnisotropy = true;
    features10.sampleRateShading = true;

    vkb::PhysicalDeviceSelector selector{ vkb_inst };
    selector.set_minimum_version(1, 3);
    selector.set_required_features_13(features13);
    selector.set_required_features_12(features12);
    selector.set_required_features(features10);
    selector.set_surface(_surface);
    auto phys_device_ret = selector.select();
    if (!phys_device_ret) {
        std::cout << phys_device_ret.error().message() << "\n";
    }
    vkb::PhysicalDevice physicalDevice = phys_device_ret.value();

    vkb::DeviceBuilder deviceBuilder{ physicalDevice };
    auto device_ret = deviceBuilder.build();
    if (!device_ret) {
        std::cout << device_ret.error().message() << "\n";
    }

    vkb::Device vkbDevice = device_ret.value();

    _device = vkbDevice.device;
    _chosenGPU = physicalDevice.physical_device; 

    vkGetPhysicalDeviceProperties(_chosenGPU, &_gpuProperties);

    _graphicsQueue = vkbDevice.get_queue(vkb::QueueType::graphics).value();
    _graphicsQueueFamily = vkbDevice.get_queue_index(vkb::QueueType::graphics).value();

    VmaAllocatorCreateInfo allocatorInfo = {};
    allocatorInfo.physicalDevice = _chosenGPU;
    allocatorInfo.device = _device;
    allocatorInfo.instance = _instance;
    allocatorInfo.flags = VMA_ALLOCATOR_CREATE_BUFFER_DEVICE_ADDRESS_BIT;
    vmaCreateAllocator(&allocatorInfo, &_allocator); 
}

charles-lunarg commented 7 months ago

if (!phys_device_ret) {
    std::cout << phys_device_ret.error().message() << "\n";
}

So do you successfully get a GPU, or are you failing to select a GPU and continuing on? (In debug builds, .value() will assert if the function failed.)

chopndolphy commented 7 months ago

I think I am successfully getting a GPU, though I did just notice that the vk_result value is 23, which doesn't seem to be in the spec. The instance return vk_result was VK_SUCCESS, if that information is of any use. Here is a screenshot of the return value output: [image]

charles-lunarg commented 7 months ago

If the PhysicalDevice was selected successfully, then the m_error is garbage (because it's union'd with the PhysicalDevice type).
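To make the union point concrete, here is a minimal sketch of a result type that stores either a value or an error in the same memory. Everything in it (`MiniResult`, `FakeDevice`, `FakeError`, `handle_or_zero`) is a hypothetical stand-in for illustration and is not vk-bootstrap's actual implementation. It shows why the error field is only meaningful on failure, and why the success flag should be checked before anything else is read.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; NOT vk-bootstrap's real types.
struct FakeError { int32_t code; };
struct FakeDevice { uint64_t handle; };

class MiniResult {
public:
    explicit MiniResult(FakeDevice d) : m_value(d), m_has_value(true) {}
    explicit MiniResult(FakeError e) : m_error(e), m_has_value(false) {}

    bool has_value() const { return m_has_value; }
    explicit operator bool() const { return m_has_value; }

    // Only valid on success; asserts otherwise in debug builds.
    const FakeDevice& value() const {
        assert(m_has_value);
        return m_value;
    }

    // Only valid on failure. On success these bytes belong to m_value,
    // so a debugger showing the error field displays meaningless data.
    const FakeError& error() const {
        assert(!m_has_value);
        return m_error;
    }

private:
    union {
        FakeDevice m_value;  // active member on success
        FakeError m_error;   // active member on failure
    };
    bool m_has_value;
};

// Returns the handle on success and 0 on failure, bailing out early
// instead of continuing with a result that holds no device.
uint64_t handle_or_zero(const MiniResult& r) {
    if (!r) {
        return 0;
    }
    return r.value().handle;
}
```

Under these assumptions, a value such as 23 seen in the error field of a successful result would simply be part of the stored device data viewed through the wrong member, not a real error code.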

charles-lunarg commented 7 months ago

I'm asking all this because I'm unsure of how VkPhysicalDevice would be garbage. There are ways, but most of them aren't due to vk-bootstrap, and making sure the library and Vulkan are being used correctly is the first thing I wanted to rule out.

What layers are present on your system? If you run vulkaninfo --summary you'll see the layers & physical devices. Maybe mesa-device-select is causing issues?

chopndolphy commented 7 months ago

Instance Layers: count = 12
---------------------------
VK_LAYER_INTEL_nullhw             INTEL NULL HW                                      1.1.73   version 1
VK_LAYER_KHRONOS_profiles         Khronos Profiles layer                             1.3.280  version 1
VK_LAYER_KHRONOS_shader_object    Khronos Shader object layer                        1.3.280  version 1
VK_LAYER_KHRONOS_synchronization2 Khronos Synchronization2 layer                     1.3.280  version 1
VK_LAYER_KHRONOS_validation       Khronos Validation Layer                           1.3.280  version 1
VK_LAYER_LUNARG_api_dump          LunarG API dump layer                              1.3.280  version 2
VK_LAYER_LUNARG_gfxreconstruct    GFXReconstruct Capture Layer Version 1.0.3-unknown 1.3.280  version 4194307
VK_LAYER_LUNARG_monitor           Execution Monitoring Layer                         1.3.280  version 1
VK_LAYER_LUNARG_screenshot        LunarG image capture layer                         1.3.280  version 1
VK_LAYER_MESA_device_select       Linux device selection layer                       1.3.211  version 1
VK_LAYER_MESA_overlay             Mesa Overlay layer                                 1.3.211  version 1
VK_LAYER_NV_optimus               NVIDIA Optimus layer                               1.3.277  version 1

charles-lunarg commented 7 months ago

Yeah, VK_LAYER_MESA_device_select is there. Try setting the env-var NODEVICE_SELECT=1 and running your code again.

chopndolphy commented 7 months ago

Hmm still segfaulting at the same place. Is this normal without mesa device select?

[WARNING: General]
terminator_CreateInstance: Received return code -3 from call to vkCreateInstance in ICD /usr/lib/x86_64-linux-gnu/libvulkan_virtio.so. Skipping this driver.

chopndolphy commented 7 months ago

Also, running it with -D_GLIBCXX_DEBUG shows it crashing on the .select() with the error munmap_chunk(): invalid pointer

charles-lunarg commented 7 months ago

I haven't seen mesa device select causing these crashes, I just wanted to remove it as a source of issue.

The virtio driver warning is safe to ignore FYI.

Can you paste the full output of vulkaninfo --summary just so I know what system you are running on?

That sounds like the error is happening inside of select, unless I am misunderstanding what -D_GLIBCXX_DEBUG does to vk-bootstrap.

chopndolphy commented 7 months ago

==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.280

Instance Extensions: count = 23
-------------------------------
VK_EXT_acquire_drm_display             : extension revision 1
VK_EXT_acquire_xlib_display            : extension revision 1
VK_EXT_debug_report                    : extension revision 10
VK_EXT_debug_utils                     : extension revision 2
VK_EXT_direct_mode_display             : extension revision 1
VK_EXT_display_surface_counter         : extension revision 1
VK_EXT_surface_maintenance1            : extension revision 1
VK_EXT_swapchain_colorspace            : extension revision 4
VK_KHR_device_group_creation           : extension revision 1
VK_KHR_display                         : extension revision 23
VK_KHR_external_fence_capabilities     : extension revision 1
VK_KHR_external_memory_capabilities    : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2         : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2       : extension revision 1
VK_KHR_portability_enumeration         : extension revision 1
VK_KHR_surface                         : extension revision 25
VK_KHR_surface_protected_capabilities  : extension revision 1
VK_KHR_wayland_surface                 : extension revision 6
VK_KHR_xcb_surface                     : extension revision 6
VK_KHR_xlib_surface                    : extension revision 6
VK_LUNARG_direct_driver_loading        : extension revision 1

Instance Layers: count = 12
---------------------------
VK_LAYER_INTEL_nullhw             INTEL NULL HW                                      1.1.73   version 1
VK_LAYER_KHRONOS_profiles         Khronos Profiles layer                             1.3.280  version 1
VK_LAYER_KHRONOS_shader_object    Khronos Shader object layer                        1.3.280  version 1
VK_LAYER_KHRONOS_synchronization2 Khronos Synchronization2 layer                     1.3.280  version 1
VK_LAYER_KHRONOS_validation       Khronos Validation Layer                           1.3.280  version 1
VK_LAYER_LUNARG_api_dump          LunarG API dump layer                              1.3.280  version 2
VK_LAYER_LUNARG_gfxreconstruct    GFXReconstruct Capture Layer Version 1.0.3-unknown 1.3.280  version 4194307
VK_LAYER_LUNARG_monitor           Execution Monitoring Layer                         1.3.280  version 1
VK_LAYER_LUNARG_screenshot        LunarG image capture layer                         1.3.280  version 1
VK_LAYER_MESA_device_select       Linux device selection layer                       1.3.211  version 1
VK_LAYER_MESA_overlay             Mesa Overlay layer                                 1.3.211  version 1
VK_LAYER_NV_optimus               NVIDIA Optimus layer                               1.3.277  version 1

Devices:
========
GPU0:
    apiVersion         = 1.3.277
    driverVersion      = 550.67.0.0
    vendorID           = 0x10de
    deviceID           = 0x2684
    deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
    deviceName         = NVIDIA GeForce RTX 4090
    driverID           = DRIVER_ID_NVIDIA_PROPRIETARY
    driverName         = NVIDIA
    driverInfo         = 550.67
    conformanceVersion = 1.3.7.2
    deviceUUID         = b89cf4d9-c700-b7ab-84b9-65e71abda760
    driverUUID         = fe69d96c-763b-5724-837f-bf78daefd9e3
GPU1:
    apiVersion         = 1.3.255
    driverVersion      = 0.0.1
    vendorID           = 0x10005
    deviceID           = 0x0000
    deviceType         = PHYSICAL_DEVICE_TYPE_CPU
    deviceName         = llvmpipe (LLVM 15.0.7, 256 bits)
    driverID           = DRIVER_ID_MESA_LLVMPIPE
    driverName         = llvmpipe
    driverInfo         = Mesa 23.2.1-1ubuntu3.1~22.04.2 (LLVM 15.0.7)
    conformanceVersion = 1.3.1.1
    deviceUUID         = 6d657361-3233-2e32-2e31-2d3175627500
    driverUUID         = 6c6c766d-7069-7065-5555-494400000000

charles-lunarg commented 7 months ago

Looking at vulkan.gpuinfo.org reports for both devices, there is no reason they should fail to be created. So this is a very strange bug indeed.

chopndolphy commented 7 months ago

Hmm, okay, well, thank you for the help. I'm going to set up all of the Vulkan boilerplate manually to rule out anything outside of vk-bootstrap.

charles-lunarg commented 7 months ago

I'm happy to debug this more directly - reach me through the Vulkan discord server.

http://discord.gg/vulkan

charles-lunarg commented 7 months ago

So the ultimate cause of the problem was that vkGetPhysicalDeviceProperties was a nullptr, so of course it crashed. This was caused by volk being linked into the app without any of the setup calls for volk being made. volkInitialize(), volkLoadInstance(vkb_inst.instance), and volkLoadDevice(vkbDevice.device) all need to be called to use volk.
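The failure mode can be modeled in a few lines. The sketch below is a self-contained imitation of a function-pointer loader, not volk's real code: `get_device_id`, `driver_get_device_id`, `load_entry_points`, and `try_get_device_id` are hypothetical names made up for this example. It shows an entry point that stays null until a load step runs, which is the situation the app was in when it called vkGetPhysicalDeviceProperties without loading volk first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a function-pointer loader; NOT volk's real code.
using PFN_get_device_id = uint32_t (*)(uint64_t handle);

// Like the entry points of a meta-loader, this starts out null.
// Calling through it before loading would jump to address 0 and crash.
static PFN_get_device_id get_device_id = nullptr;

// Stand-in for the driver's real implementation.
static uint32_t driver_get_device_id(uint64_t handle) {
    return static_cast<uint32_t>(handle & 0xFFFFu);
}

// Stand-in for the loading step (the role the volk setup calls play).
void load_entry_points() {
    get_device_id = &driver_get_device_id;
    assert(get_device_id != nullptr);  // loading must leave the pointer set
}

// Guarded call: reports failure instead of calling a null pointer.
bool try_get_device_id(uint64_t handle, uint32_t& out) {
    if (get_device_id == nullptr) {
        return false;  // entry points were never loaded
    }
    out = get_device_id(handle);
    return true;
}
```

In the real application the equivalent of `load_entry_points()` is the sequence named above: volkInitialize() first, volkLoadInstance(vkb_inst.instance) once the instance exists, and volkLoadDevice(vkbDevice.device) once the device exists. A null check like the one in `try_get_device_id` is one way to turn a segfault into a diagnosable error while debugging.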

charles-lunarg commented 7 months ago

Going to close this because it is not a vk-bootstrap issue.