PixarAnimationStudios / OpenUSD

Universal Scene Description
http://www.openusd.org

Plugins don't work if Python is not initialized #1466

Open spitzak opened 3 years ago

spitzak commented 3 years ago

Description of Issue

A plugin's TF_REGISTRY_FUNCTION is not called if Py_Initialize() has not been called. As a related side effect, Hf_PluginEntry::IncRefCount() crashes if there is any problem with the plugin.

Steps to Reproduce

Make a very simple render delegate plugin, i.e. one that does not call Python (there may be other ways to avoid the bug; the prman plugin does not trigger it). Then make a simple "renderer" that just calls pxr::HdRendererPluginRegistry::GetInstance().GetRendererPlugin(rendererId).

Inserting a call to Py_Initialize() before this call makes the program work.

System Information (OS, Hardware)

Linux

Package Versions

20.11

spitzak commented 3 years ago

Explanation: I set a breakpoint in the TF_REGISTRY_FUNCTION in the working version and got the following stack trace:

#0  _INTERNALb219510b::pxrInternal_v0_20__pxrReserved__::_Tf_RegistryFunction40 () at src/RendererPlugin.cc:42
#1  0x00007fffea5d2360 in pxrInternal_v0_20__pxrReserved__::(anonymous namespace)::Tf_RegistryManagerImpl::_RunRegistrationFunctionsNoLock (this=0x6ec470, typeName="TfType")
    at pxr/base/tf/registryManager.cpp:501
#2  0x00007fffea5d1fe0 in pxrInternal_v0_20__pxrReserved__::(anonymous namespace)::Tf_RegistryManagerImpl::_UpdateSubscribersNoLock (this=0x6ec470) at pxr/base/tf/registryManager.cpp:440
#3  0x00007fffea5d1f69 in pxrInternal_v0_20__pxrReserved__::(anonymous namespace)::Tf_RegistryManagerImpl::_ProcessLibraryNoLock (this=0x6ec470) at pxr/base/tf/registryManager.cpp:431
#4  0x00007fffea5d1d52 in pxrInternal_v0_20__pxrReserved__::(anonymous namespace)::Tf_RegistryManagerImpl::SubscribeTo (this=0x6ec470, typeName="TfScriptModuleLoader") at pxr/base/tf/registryManager.cpp:396
#5  0x00007fffea5d289f in pxrInternal_v0_20__pxrReserved__::TfRegistryManager::_SubscribeTo (this=0x7fffea976630 <pxrInternal_v0_20__pxrReserved__::TfRegistryManager::GetInstance()::manager>, ti=...)
    at pxr/base/tf/registryManager.cpp:592
#6  0x00007fffea65a91e in pxrInternal_v0_20__pxrReserved__::TfRegistryManager::SubscribeTo<pxrInternal_v0_20__pxrReserved__::TfScriptModuleLoader> (
    this=0x7fffea976630 <pxrInternal_v0_20__pxrReserved__::TfRegistryManager::GetInstance()::manager>)
    at /hosts/pearlrewind/usr/pic1/rez/usd_core/0.20.11.x.2.2.0.9999/refplat-vfx2019.2/include/pxr/base/tf/registryManager.h:69
#7  0x00007fffea658efb in pxrInternal_v0_20__pxrReserved__::TfScriptModuleLoader::_LoadModulesFor (this=0x2107a30, inName=...) at pxr/base/tf/scriptModuleLoader.cpp:343
#8  0x00007fffea6586d4 in pxrInternal_v0_20__pxrReserved__::TfScriptModuleLoader::LoadModules (this=0x2107a30) at pxr/base/tf/scriptModuleLoader.cpp:195
#9  0x00007fffea579de3 in pxrInternal_v0_20__pxrReserved__::TfDlopen (filename="/usr/home/bspitzak/rez/hdMoonray/0.20.17.0/refplat-vfx2019.2/usd_imaging-0.20.11.x.2.3/opt_level-debug/plugin/hdMoonray.so", 
    flag=2, error=0x7ffffffeb960, loadScriptBindings=true) at pxr/base/tf/dl.cpp:93
#10 0x00007fffeca9b245 in pxrInternal_v0_20__pxrReserved__::PlugPlugin::_Load (this=0x1fef550) at pxr/base/plug/plugin.cpp:254
#11 0x00007fffeca9b9fe in pxrInternal_v0_20__pxrReserved__::PlugPlugin::_LoadWithDependents (this=0x1fef550, seenPlugins=0x7ffffffebd70) at pxr/base/plug/plugin.cpp:332
#12 0x00007fffeca9bb05 in pxrInternal_v0_20__pxrReserved__::PlugPlugin::Load (this=0x1fef550) at pxr/base/plug/plugin.cpp:354
#13 0x00007fffefe8b026 in pxrInternal_v0_20__pxrReserved__::HfPluginRegistry::GetPlugin (this=<optimized out>, pluginId=...) at pxr/imaging/hf/pluginRegistry.cpp:152

If Python is not initialized, pxr::TfScriptModuleLoader::_LoadModulesFor (frame 7) returns immediately and never reaches Tf_RegistryManagerImpl::_UpdateSubscribersNoLock (frame 2), so the plugin's registration function is never run. I suspect the actual fix is to add another call to _UpdateSubscribersNoLock, though I don't know where that call should go.

In addition, if the registry function is not called, Hf_PluginEntry::IncRefCount() gets NULL from _type.GetFactory<_Factory>() and then crashes calling factory->New(). It should at least print a more informative error message.
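
To illustrate the kind of guard meant here, the following is a minimal, self-contained sketch. It is not the OpenUSD code: Plugin, Factory, SimpleFactory, and PluginEntry are hypothetical stand-ins, and the error text is only a suggestion. It shows IncRefCount() checking for a null factory and reporting the plugin id instead of dereferencing the pointer.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; none of these are OpenUSD types.
struct Plugin {
    virtual ~Plugin() = default;
};

struct Factory {
    virtual ~Factory() = default;
    virtual std::unique_ptr<Plugin> New() const = 0;
};

struct SimpleFactory : Factory {
    std::unique_ptr<Plugin> New() const override {
        return std::make_unique<Plugin>();
    }
};

class PluginEntry {
public:
    // factory may be null, which models a plugin whose registration
    // function was never run.
    PluginEntry(std::string id, const Factory* factory)
        : _id(std::move(id)), _factory(factory) {
        assert(!_id.empty() && "plugin id must not be empty");
    }

    // Creates the instance on first use. Returns false, with a diagnostic,
    // if no factory was registered, rather than calling through a null pointer.
    bool IncRefCount() {
        if (_refCount == 0) {
            if (!_factory) {
                std::cerr << "Plugin '" << _id
                          << "' has no registered factory; was its "
                             "registration function run?\n";
                return false;
            }
            _instance = _factory->New();
        }
        ++_refCount;
        return true;
    }

    int RefCount() const { return _refCount; }
    const Plugin* Get() const { return _instance.get(); }

private:
    std::string _id;
    const Factory* _factory;
    std::unique_ptr<Plugin> _instance;
    int _refCount = 0;
};
```

With this shape, a caller that receives false can return a null plugin to its own caller, and the log names the plugin that failed to register.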

jilliene commented 3 years ago

Filed as internal issue #USD-6589

spitzak commented 3 years ago

This may be difficult to trigger. The bug appears to depend on side effects of compilation, including which VFX Reference Platform is being used. That may be because initialization order causes Python to be initialized anyway, or because of other calls to SubscribeTo.

However, from searching the code, it does appear that the registration functions of newly loaded plugins are only called as a side effect of calling SubscribeTo. I suspect you intended to call the already-subscribed ones immediately after the dlopen in the plugin loader.
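
The behavior described above can be modeled with a toy registry. This is a simplified sketch under my own assumptions, not the TfRegistryManager implementation: ToyRegistry, LoadPlugin, and the updateAfterLoad flag are hypothetical names. Registration functions queued by a newly loaded library only run when something calls SubscribeTo, unless the loader explicitly updates subscribers after loading.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only; the names and structure are assumptions, not OpenUSD API.
class ToyRegistry {
public:
    using Fn = std::function<void()>;

    // Models what a library's static initializers do at load time:
    // queue a registration function without running it.
    void AddFunction(const std::string& typeName, Fn fn) {
        assert(fn && "registration function must be callable");
        _pending[typeName].push_back(std::move(fn));
    }

    // Subscribing marks the type as wanted and, as a side effect,
    // runs pending functions for every subscribed type.
    void SubscribeTo(const std::string& typeName) {
        _subscribed.insert(typeName);
        UpdateSubscribers();
    }

    // Runs and clears the pending functions of all subscribed types.
    void UpdateSubscribers() {
        // Iterate over a copy so callbacks may subscribe to more types.
        const std::set<std::string> subscribed = _subscribed;
        for (const std::string& typeName : subscribed) {
            auto it = _pending.find(typeName);
            if (it == _pending.end()) continue;
            std::vector<Fn> fns = std::move(it->second);
            _pending.erase(it);
            for (const Fn& fn : fns) fn();
        }
    }

    std::size_t PendingCount(const std::string& typeName) const {
        auto it = _pending.find(typeName);
        return it == _pending.end() ? 0 : it->second.size();
    }

private:
    std::map<std::string, std::vector<Fn>> _pending;
    std::set<std::string> _subscribed;
};

// Models loading a plugin. With updateAfterLoad == false, the queued
// function waits for some later SubscribeTo call (the reported behavior).
// With true, it runs right after the load (the suggested fix).
void LoadPlugin(ToyRegistry& registry, bool& registered, bool updateAfterLoad) {
    registry.AddFunction("TfType", [&registered] { registered = true; });
    if (updateAfterLoad) registry.UpdateSubscribers();
}
```

In this model, a plugin loaded with updateAfterLoad set to false stays unregistered until an unrelated SubscribeTo call happens to follow, which mirrors the role the TfScriptModuleLoader subscription plays in the stack trace above.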