baldurk / sample_layer

A Sample vulkan layer
BSD 2-Clause "Simplified" License
45 stars 10 forks source link

Global_lock can be a shared_mutex #1

Open ratchetfreak opened 7 years ago

ratchetfreak commented 7 years ago

Because vulkan requires that mutable objects are usually externally synchronized you can use a shared lock around getting the custom data. It helps that std::map maintains referential consistency when inserting/removing.

So cmdDraw can boil down to:

VK_LAYER_EXPORT void VKAPI_CALL SampleLayer_CmdDraw(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    vertexCount,
    uint32_t                                    instanceCount,
    uint32_t                                    firstVertex,
    uint32_t                                    firstInstance)
{

  CommandStats* state;
  {
    scoped_lock_shared l(global_lock);
    state = &commandbuffer_stats[commandBuffer];
  }

  state->drawCount++;
  state->instanceCount += instanceCount;
  state->vertCount += instanceCount*vertexCount;

  device_dispatch[GetKey(commandBuffer)].CmdDraw(commandBuffer, vertexCount, instanceCount, firstVertex, firstInstance);
}

This will improve threading behavior and stay correct as long as the application obeys the threading requirements.

baldurk commented 7 years ago

You're right that we don't have to worry about concurrent access to the same command buffer on different threads, as you say Vulkan guarantees that applications can't be using the same command buffer on multiple threads.

But because there's only a single global lock for all maps, the lock also needs to protect the lookup to device_dispatch - since in theory someone could create a new device and modify the map while we're drawing in one command buffer on one thread.

We could do as you point out and lock around just the access to the CommandStats and then lock separately around the dispatch lookup. At that point all we're saving is a couple of additions outside the lock before we re-acquire it.

You could also only lock around the actual dispatch lookup and then call CmdDraw while not holding the lock, since we know the dispatch table is invariant as well. The same applies to all the other functions where we call through the dispatch table.

In the end though this code is just intended as an example and optimal locking is not worth complicating the code. In a real-world example you'd either have separate locks, or as I mentioned in the article I strongly favour not having any locks at all in these functions and using object wrapping to store data.

ratchetfreak commented 7 years ago

Oh I missed that device_dispatch should also be protected. However you can do the query before modifying the struct:

VK_LAYER_EXPORT void VKAPI_CALL SampleLayer_CmdDraw(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    vertexCount,
    uint32_t                                    instanceCount,
    uint32_t                                    firstVertex,
    uint32_t                                    firstInstance)
{

  CommandStats* state;
  VkLayerInstanceDispatchTable* dispatch;
  {
    scoped_lock_shared l(global_lock);
    state = &commandbuffer_stats[commandBuffer];
    dispatch = &device_dispatch[GetKey(commandBuffer)];
  }

  state->drawCount++;
  state->instanceCount += instanceCount;
  state->vertCount += instanceCount*vertexCount;

  dispatch->CmdDraw(commandBuffer, vertexCount, instanceCount, firstVertex, firstInstance);
}

The other option is to add a pointer to the dispatch table to the stats struct. So there is only 1 lookup to be done.

Either way having a shared lock (aka a read-write lock) will help contention. Because adding something to the map will be much rarer than the lookup.

For objects that aren't externally synced you can then use a per-object mutex that you look up while under the shared lock.