KhronosGroup / Vulkan-LoaderAndValidationLayers

**Deprecated repository** for Vulkan loader and validation layers
Apache License 2.0

VK_EXT_debug_utils implementation has threading bug(s) #2672

Closed danginsburg closed 6 years ago

danginsburg commented 6 years ago

Using VK_EXT_debug_utils with Dota, I frequently get stuck in an infinite loop inside vkCmdBeginDebugUtilsLabelEXT and vkCmdEndDebugUtilsLabelEXT. The hang is intermittent, and when it happens, the call stack looks something like this:

    VkLayer_core_validation.dll!std::_Hash<std::_Umap_traits<VkCommandBuffer_T * __ptr64,std::vector<_LoggingLabelData,std::allocator<_LoggingLabelData> >,std::_Uhash_compare<VkCommandBuffer_T * __ptr64,std::hash<VkCommandBuffer_T * __ptr64>,std::equal_to<VkCommandBuffer_T * __ptr64> >,std::allocator<std::pair<VkCommandBuffer_T * __ptr64 const,std::vector<_LoggingLabelData,std::allocator<_LoggingLabelData> > > >,0> >::lower_bound(VkCommandBuffer_T * const & _Keyval) Line 596 C++
    VkLayer_core_validation.dll!core_validation::CmdEndDebugUtilsLabelEXT(VkCommandBuffer_T * commandBuffer) Line 11965 C++
    VkLayer_object_tracker.dll!object_tracker::CmdEndDebugUtilsLabelEXT(VkCommandBuffer_T * commandBuffer) Line 728 C++
>   VkLayer_parameter_validation.dll!parameter_validation::vkCmdEndDebugUtilsLabelEXT(VkCommandBuffer_T * commandBuffer) Line 10943 C++
    VkLayer_threading.dll!threading::CmdEndDebugUtilsLabelEXT(VkCommandBuffer_T * commandBuffer) Line 5681  C++
    rendersystemvulkan.dll!CRenderContextVulkan::DetachCommandList() Line 1493  C++
    rendersystemvulkan.dll!CRenderContextBase::Submit() Line 662    C++
    rendersystemvulkan.dll!CRenderContextPtr::Release() Line 843    C++
    rendersystemvulkan.dll!CRenderContextPtr::~CRenderContextPtr() Line 836 C++
    [External Code] 
    rendersystemvulkan.dll!CTextureManagerVulkan::ComputeTextureObject(TextureObjectInfo_t * pInfo, const TextureSpecification_t * pResourceSpec, const void * pTextureBitsData, int nTextureBitsSize, bool bImmutable) Line 1384   C++
    rendersystemvulkan.dll!CTextureManagerVulkan::GenerateTextureObject(TextureObjectInfo_t * pInfo, CTextureBase * pTextureBase, const TextureSpecification_t & texSpec, const void * pTextureBits, int nDataSize, TextureBitsMemoryType_t nMemory) Line 1397  C++
    rendersystemvulkan.dll!CTextureManagerBase::HookUpTextureBits(CWeakHandle<InfoForResourceTypeCTextureBase> hTexture, void * pData, int nDataSize, TextureSpecification_t texSpec, int nStreamingRequestId, TextureOnDiskCompressionType_t compressionType) Line 1238    C++
    rendersystemvulkan.dll!CMemberFuncProxy6<CTextureManagerBase * __ptr64,void (__cdecl CTextureManagerBase::*)(CWeakHandle<InfoForResourceTypeCTextureBase>,void * __ptr64,int,TextureSpecification_t,int,enum TextureOnDiskCompressionType_t) __ptr64,CWeakHandle<InfoForResourceTypeCTextureBase>,void * __ptr64,int,TextureSpecification_t,int,enum TextureOnDiskCompressionType_t,CFuncMemPolicyNone>::operator()(const CWeakHandle<InfoForResourceTypeCTextureBase> & arg1, void * const & arg2, const int & arg3, const TextureSpecification_t & arg4, const int & arg5, const TextureOnDiskCompressionType_t & arg6) Line 519  C++
    rendersystemvulkan.dll!CMemberFunctor6<CTextureManagerBase * __ptr64,void (__cdecl CTextureManagerBase::*)(CWeakHandle<InfoForResourceTypeCTextureBase>,void * __ptr64,int,TextureSpecification_t,int,enum TextureOnDiskCompressionType_t) __ptr64,CWeakHandle<InfoForResourceTypeCTextureBase>,void * __ptr64,int,TextureSpecification_t,int,enum TextureOnDiskCompressionType_t,CRefCounted1<CFunctor,CRefCountServiceBase<1,CRefMT> >,CFuncMemPolicyNone>::operator()() Line 569 C++
    rendersystemvulkan.dll!CFunctorJob::DoExecute() Line 669    C++
    vstdlib.dll!0000000003eb08ad()  Unknown
    vstdlib.dll!0000000003eb153d()  Unknown
    tier0.dll!000000000410f64e()    Unknown
    tier0.dll!000000000410f8cb()    Unknown
    tier0.dll!000000000410f810()    Unknown
    [External Code] 

In object_tracker.cpp, the global lock is released before calling EndCmdDebugUtilsLabel. That function does a find/insert on report_data->debugUtilsCmdBufLabels. That map (a std::unordered_map, per the call stack above) hangs off the dev_data, so it is not thread-safe to access without holding a lock.

I think that object_tracker.cpp has similar issues in CmdBeginDebugUtilsLabelEXT, QueueInsertDebugUtilsLabelEXT, etc. It looks to me like any of these could be accessing global data without a lock.

I think the map is being corrupted by simultaneous edits from multiple threads. I don't have an easy repro case.
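To illustrate the kind of locking I mean, here is a minimal sketch, not the layer's actual code. `LabelTracker`, `LabelData`, `CommandBufferHandle`, and `StressLabels` are hypothetical names made up for this example. The only point is that every find/insert/erase on a shared `std::unordered_map` of per-command-buffer label stacks happens while a mutex is held. The real fix in the layers may well look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for VkCommandBuffer and the layer's label record.
using CommandBufferHandle = std::uintptr_t;
struct LabelData {
    std::string name;
};

// Shared label bookkeeping where every map access is serialized by one mutex.
class LabelTracker {
public:
    // Pushes a label onto the stack for cb, creating the map entry if needed.
    void BeginLabel(CommandBufferHandle cb, std::string name) {
        std::lock_guard<std::mutex> guard(mutex_);
        labels_[cb].push_back(LabelData{std::move(name)});
    }

    // Pops the innermost label for cb; does nothing if none is open.
    void EndLabel(CommandBufferHandle cb) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = labels_.find(cb);
        if (it != labels_.end() && !it->second.empty()) {
            it->second.pop_back();
        }
    }

    // Returns how many labels are currently open on cb.
    std::size_t Depth(CommandBufferHandle cb) const {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = labels_.find(cb);
        return it == labels_.end() ? 0 : it->second.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<CommandBufferHandle, std::vector<LabelData>> labels_;
};

// Drives the tracker from several threads, one command buffer per thread,
// and returns the number of labels left open across all of them.
std::size_t StressLabels(LabelTracker& tracker, unsigned thread_count, int iterations) {
    assert(thread_count > 0);
    std::vector<std::thread> workers;
    for (unsigned t = 1; t <= thread_count; ++t) {
        workers.emplace_back([&tracker, t, iterations] {
            const CommandBufferHandle cb = t;
            for (int i = 0; i < iterations; ++i) {
                tracker.BeginLabel(cb, "pass");
                tracker.EndLabel(cb);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    std::size_t open = 0;
    for (unsigned t = 1; t <= thread_count; ++t) {
        open += tracker.Depth(t);
    }
    return open;
}
```

Each thread only touches its own command buffer, but the map itself is shared, so an insert on one thread can rehash while another thread is walking a bucket. Without the lock, that is undefined behavior, and a lookup that never terminates is one plausible symptom.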

karl-lunarg commented 6 years ago

Closing because migrated issue KhronosGroup/Vulkan-ValidationLayers#103 has been closed.