We are seeing hangs when shutting down a process that uses the OpenUSD SDK on Windows. The process recursively exports from Maya to USD using the Maya-USD plugin and our own translators. It spawns worker processes to run the exports in parallel, so there are many instances using the SDK and many chances to hit these hangs. All of them seem to come down to global destructors calling into TBB while TBB is in an undefined state because the process is exiting. A couple of examples:
The destructor for TfToken calls _RemoveRef, which calls _PossiblyDestroyRep, which calls Tf_TokenRegistry::_PossiblyDestroyRep, which tries to lock a TBB mutex. So every global TfToken, including ones stored in global sets, maps, arrays, etc., can cause this problem. Fixing every instance of a global TfToken wasn't tenable, so I added a boolean that makes _PossiblyDestroyRep return early when it is set, which lets us control when TfTokens are removed from the registry.
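To make the early-out idea concrete, here is a minimal sketch of the pattern. It does not use the real OpenUSD code: ToyRegistry, ToyToken, and SetSkipCleanup are hypothetical names, and std::mutex stands in for the TBB mutex. It only shows the shape of the workaround, assuming the flag is flipped before static destructors start running.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a token registry; NOT the real Tf_TokenRegistry.
class ToyRegistry {
public:
    static ToyRegistry& Get() {
        // Heap-allocated and never freed, so the registry itself is never
        // destroyed while tokens may still reference it.
        static ToyRegistry* instance = new ToyRegistry;
        return *instance;
    }

    void AddRef(const std::string& s) {
        std::lock_guard<std::mutex> lock(_mutex);
        ++_refs[s];
    }

    void RemoveRef(const std::string& s) {
        // Early out: once the flag is set, skip the lock and the cleanup
        // entirely. Entries are intentionally left in the registry.
        if (_skipCleanup.load(std::memory_order_acquire)) {
            return;
        }
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _refs.find(s);
        if (it != _refs.end() && --it->second == 0) {
            _refs.erase(it);
        }
    }

    // Hypothetical switch, meant to be set just before process exit.
    void SetSkipCleanup(bool skip) {
        _skipCleanup.store(skip, std::memory_order_release);
    }

    std::size_t Size() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _refs.size();
    }

private:
    std::mutex _mutex;  // stands in for the TBB mutex
    std::unordered_map<std::string, int> _refs;
    std::atomic<bool> _skipCleanup{false};
};

// Hypothetical stand-in for a ref-counted token.
class ToyToken {
public:
    explicit ToyToken(std::string s) : _s(std::move(s)) {
        ToyRegistry::Get().AddRef(_s);
    }
    ~ToyToken() { ToyRegistry::Get().RemoveRef(_s); }
    ToyToken(const ToyToken&) = delete;
    ToyToken& operator=(const ToyToken&) = delete;

private:
    std::string _s;
};
```

In this sketch a host application would call ToyRegistry::Get().SetSkipCleanup(true) as the last step before returning from main, so that any ToyToken destroyed afterwards never touches the mutex. The cost is that registry entries are leaked at exit, which is usually harmless because the OS reclaims the memory.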
The UsdShadeShaderDefParserPlugin has a static member variable for a stage cache. When the global destructor of the stage cache runs, it closes a stage, and the UsdStage::_Close function does a TBB parallel for-loop. To work around this problem, I changed the member variable into a pointer and allocate the stage cache from the heap and never free it so the destructor doesn't run.
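The second workaround is essentially the "leaked singleton" pattern. A rough sketch, again with hypothetical names (ToyCache and ToyParserPlugin are not OpenUSD types), and using a function-local static pointer rather than the static member pointer I actually used:

```cpp
#include <cassert>

// Hypothetical stand-in for a stage cache. In the real case the destructor
// is where stages get closed and parallel work gets kicked off.
struct ToyCache {
    static int destroyedCount;
    int entries = 0;
    ~ToyCache() { ++destroyedCount; }
};
int ToyCache::destroyedCount = 0;

// Hypothetical stand-in for the plugin that owns the cache.
class ToyParserPlugin {
public:
    static ToyCache& GetCache() {
        // Initialized once (thread-safe since C++11) and never deleted,
        // so ~ToyCache never runs during static destruction.
        static ToyCache* cache = new ToyCache;
        return *cache;
    }
};
```

Every call to ToyParserPlugin::GetCache() returns the same object, and because only the pointer has static storage duration, nothing runs for it at exit. The trade-off is that any cleanup the destructor would normally do is skipped, so this only fits cases where that cleanup is not needed when the process is going away anyway.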
I'm sure my "fixes" are not great, and I'm not suggesting they be implemented in OpenUSD. I'm hoping there is a better systemic fix for global destructors that call into TBB.
Thanks!
Package Versions
23.05