get_dedicated_queue_index excludes VK_QUEUE_TRANSFER_BIT

DeltaW0x commented 1 week ago

When searching for a dedicated compute queue, detail::get_dedicated_queue_index will always try to find a queue without VK_QUEUE_TRANSFER_BIT. This contradicts what the Vulkan specification says about VkQueueFlagBits:

All commands that are allowed on a queue that supports transfer operations are also allowed on a queue that supports either graphics or compute operations. Thus, if the capabilities of a queue family include VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT, then reporting the VK_QUEUE_TRANSFER_BIT capability separately for that queue family is optional.

Because all compute queues are transfer queues too, get_dedicated_queue_index will always fail to find a dedicated compute queue whenever the driver reports the transfer bit on its compute families.
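
To make the failure mode concrete, here is a minimal sketch of a strict "dedicated" lookup. It is not vk-bootstrap's source: `find_dedicated_family` and the `k...Bit` constants are hypothetical names, and the constants only mirror the numeric values of the corresponding `VkQueueFlagBits` so the snippet builds without the Vulkan headers.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Local stand-ins for VkQueueFlagBits (same numeric values as the Vulkan headers).
constexpr std::uint32_t kGraphicsBit = 0x1;  // VK_QUEUE_GRAPHICS_BIT
constexpr std::uint32_t kComputeBit  = 0x2;  // VK_QUEUE_COMPUTE_BIT
constexpr std::uint32_t kTransferBit = 0x4;  // VK_QUEUE_TRANSFER_BIT

// Hypothetical helper: returns the first non-graphics family that reports every
// desired flag and none of the undesired ones, or std::nullopt if none matches.
std::optional<std::uint32_t> find_dedicated_family(
    const std::vector<std::uint32_t>& family_flags,
    std::uint32_t desired,
    std::uint32_t undesired) {
    const auto count = static_cast<std::uint32_t>(family_flags.size());
    for (std::uint32_t i = 0; i < count; ++i) {
        const std::uint32_t flags = family_flags[i];
        const bool has_desired   = (flags & desired) == desired;
        const bool has_graphics  = (flags & kGraphicsBit) != 0;
        const bool has_undesired = (flags & undesired) != 0;
        if (has_desired && !has_graphics && !has_undesired) return i;
    }
    return std::nullopt;
}
```

With families `{GRAPHICS|COMPUTE|TRANSFER, COMPUTE|TRANSFER}`, calling `find_dedicated_family(families, kComputeBit, kTransferBit)` finds nothing, because the only non-graphics compute family reports the transfer bit explicitly. If a driver omitted the optional transfer bit on that family, the same call would return index 1.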

charles-lunarg commented 1 week ago

The point of the 'get dedicated queue' function is to try and find a queue (family) that supports the desired operations and is distinct from the other queues. But therein lies a conundrum - to get a dedicated compute queue is impossible as you point out. The best I can do is find a queue that reports "only" compute operations and nothing else, even if transfer ops are required to be supported for that queue.

The problem with the 'get dedicated queue' logic is that not all hardware has 'dedicated' queues. The variability of queue (families) means it's practically impossible to get the "right" queue in all situations. Plus, the whole point of queue families is to abstract the actual execution hardware available. Because each vendor uses a different strategy, it is difficult to come up with a single interface that works on all GPUs.

For example, I wanted there to be a "graphics queue", "compute queue", and "transfer queue" since that is broadly what people need. But whenever there is only a single queue family, it would be bad for users to think they have separate queues when they actually have the same one. This would cause unexpected stalls and deadlocks when submitting complex workloads.
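
The aliasing concern can be sketched roughly as follows. This assumes a hypothetical `pick_queue_roles` helper (not part of vk-bootstrap) that only fills in the compute and transfer roles when a family other than the graphics one is available, so a caller cannot mistake a shared family for separate ones. The `k...Bit` constants are local stand-ins that mirror the `VkQueueFlagBits` values.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

constexpr std::uint32_t kGraphicsBit = 0x1;  // VK_QUEUE_GRAPHICS_BIT
constexpr std::uint32_t kComputeBit  = 0x2;  // VK_QUEUE_COMPUTE_BIT
constexpr std::uint32_t kTransferBit = 0x4;  // VK_QUEUE_TRANSFER_BIT

// Hypothetical result type: a role stays empty if it would alias another role.
struct QueueRoles {
    std::optional<std::uint32_t> graphics;
    std::optional<std::uint32_t> compute;
    std::optional<std::uint32_t> transfer;
};

// Graphics or compute support implies transfer support, even when the
// transfer bit is not reported (see the spec text quoted earlier in the thread).
bool supports(std::uint32_t flags, std::uint32_t desired) {
    if ((flags & (kGraphicsBit | kComputeBit)) != 0) flags |= kTransferBit;
    return (flags & desired) == desired;
}

QueueRoles pick_queue_roles(const std::vector<std::uint32_t>& family_flags) {
    QueueRoles roles;
    const auto count = static_cast<std::uint32_t>(family_flags.size());
    // First family that can do graphics.
    for (std::uint32_t i = 0; i < count && !roles.graphics; ++i)
        if (supports(family_flags[i], kGraphicsBit)) roles.graphics = i;
    // First compute-capable family that is not the graphics family.
    for (std::uint32_t i = 0; i < count && !roles.compute; ++i)
        if (roles.graphics != i && supports(family_flags[i], kComputeBit)) roles.compute = i;
    // First transfer-capable family distinct from both of the above.
    for (std::uint32_t i = 0; i < count && !roles.transfer; ++i)
        if (roles.graphics != i && roles.compute != i && supports(family_flags[i], kTransferBit))
            roles.transfer = i;
    return roles;
}
```

On a device with a single `GRAPHICS|COMPUTE|TRANSFER` family, only `graphics` is set and the other two roles stay empty, which signals that all work will share one family.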

DeltaW0x commented 3 days ago

Sorry for answering so late, but I had issues. I see. I opened this issue because vulkaninfo does report a "dedicated" compute queue for my GPU, but for some reason the library doesn't select it and instead picks one of the more generic ones:

VkQueueFamilyProperties:
========================
        queueProperties[0]:
        -------------------
                minImageTransferGranularity = (1,1,1)
                queueCount                  = 1
                queueFlags                  = QUEUE_GRAPHICS_BIT | QUEUE_COMPUTE_BIT | QUEUE_TRANSFER_BIT | QUEUE_SPARSE_BINDING_BIT
                timestampValidBits          = 64
                present support             = true

        queueProperties[1]:
        -------------------
                minImageTransferGranularity = (1,1,1)
                queueCount                  = 2
                queueFlags                  = QUEUE_COMPUTE_BIT | QUEUE_TRANSFER_BIT | QUEUE_SPARSE_BINDING_BIT
                timestampValidBits          = 64
                present support             = true

        queueProperties[2]:
        -------------------
                minImageTransferGranularity = (16,16,8)
                queueCount                  = 2
                queueFlags                  = QUEUE_TRANSFER_BIT | QUEUE_SPARSE_BINDING_BIT
                timestampValidBits          = 64
                present support             = true

Here vk-bootstrap keeps selecting the first queue family as the dedicated compute queue, while technically the second one would be better? Maybe I'm wrong; I'm still trying to wrap my head around this.
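
One way to express the "second one would be better" intuition is to rank non-graphics families by how few extra capabilities they report. This is only a sketch under that assumption, not what vk-bootstrap does: `find_most_specialized_family` and `kReportedFamilies` are hypothetical names, the `k...Bit` constants mirror the `VkQueueFlagBits` values, and the flags are transcribed from the vulkaninfo output above.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

constexpr std::uint32_t kGraphicsBit = 0x1;  // VK_QUEUE_GRAPHICS_BIT
constexpr std::uint32_t kComputeBit  = 0x2;  // VK_QUEUE_COMPUTE_BIT
constexpr std::uint32_t kTransferBit = 0x4;  // VK_QUEUE_TRANSFER_BIT
constexpr std::uint32_t kSparseBit   = 0x8;  // VK_QUEUE_SPARSE_BINDING_BIT

// Hypothetical helper for non-graphics roles: among families that report every
// desired flag and no graphics bit, pick the one with the fewest other flags.
std::optional<std::uint32_t> find_most_specialized_family(
    const std::vector<std::uint32_t>& family_flags, std::uint32_t desired) {
    std::optional<std::uint32_t> best;
    std::size_t best_extra = 0;
    const auto count = static_cast<std::uint32_t>(family_flags.size());
    for (std::uint32_t i = 0; i < count; ++i) {
        const std::uint32_t flags = family_flags[i];
        if ((flags & desired) != desired) continue;  // missing a desired flag
        if ((flags & kGraphicsBit) != 0) continue;   // shared with graphics
        // Count the capability bits reported beyond the desired ones.
        const std::size_t extra = std::bitset<32>(flags & ~desired).count();
        if (!best || extra < best_extra) {
            best = i;
            best_extra = extra;
        }
    }
    return best;
}

// Queue family flags as listed in the vulkaninfo output above.
const std::vector<std::uint32_t> kReportedFamilies = {
    kGraphicsBit | kComputeBit | kTransferBit | kSparseBit,  // queueProperties[0]
    kComputeBit | kTransferBit | kSparseBit,                 // queueProperties[1]
    kTransferBit | kSparseBit,                               // queueProperties[2]
};
```

Under this ranking, `find_most_specialized_family(kReportedFamilies, kComputeBit)` selects family 1 and the same call with `kTransferBit` selects family 2, which matches what vulkaninfo suggests for this GPU. Whether that ordering is right for other vendors' layouts is a separate question.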

charles-lunarg / vk-bootstrap

get_dedicated_queue_index excludes VK_QUEUE_TRANSFER_BIT #321