Closed thewtex closed 2 years ago
I guess we should do this for 5.3. Tom, is this on your radar?
Yes, actively investigating this one. itk::ObjectFactoryBase
seems to make use of the global SingletonIndex
which is already synchronized across ITK Python modules, I'm trying to pin down why this does not translate to synchronized factory lists.
Note that it looks like a66337ec was intended to address factory synchronization across Python modules, but in practice we are not currently seeing successful synchronization.
Some other investigation notes so far:
SingletonIndex::GetInstance()
returns the same address when called within / stepping through ITKCommon, ITKFFT, or ITKConvolution class constructors, which verifies that SingletonIndex
synchronization in <modulename>_ext.i
files with the Python C API is working as expectedGetPimplGlobalsPointer()
returns different addresses for m_PimplGlobals
when called within itk::ObjectFactoryBase
within different modules, indicating that the global value is not being properly synchronizedRegisterOneFactory()
which calls ObjectFactoryBase::RegisterFactoryInternal()
adds the factories to ObjectFactoryBase::m_PimplGlobals->m_InternalFactories
across modules. When itk::ObjectFactoryBase::Initialize
is called, the internal factory list is used to populate ObjectFactoryBase::m_PimplGlobals->m_RegisteredFactories
ObjectFactoryBase::RegisterInternalFactoryOnce
(see https://github.com/InsightSoftwareConsortium/ITK/blob/master/Modules/IO/MeshOBJ/src/itkOBJMeshIOFactory.cxx#L54 for an example) they do not seem to appear in ObjectFactoryBase::m_PimplGlobals->m_InternalFactories
in other Python modules and thus are never added to the list of registered factories.Still actively investigating this one. I originally was having trouble figuring out why SingletonIndex->m_GlobalObjects
was stored at different addresses, but after stepping through with Visual Studio as well as GDB my current hypothesis is that SingletonIndex
instances are initialized for each module before synchronization with ITKCommon -- that is, each module still has a fully initialized SingletonIndex
instance where factories are registered, but these instances are masked and never referenced. This would resolve earlier questions of why m_PimplGlobals
has different addresses for initial factory registrations, but requires further digging into what is referenced with CreateInstance
.
Adding some discussion for reproducibility, my debugging workflow for tracing through code makes use of procedures recently added to the ITK Software Guide:
RelWithDebInfo
configuration to get debug symbolssource/ITK/Wrapping/Generators/Python/itk/ITKFFTPython.py
:
...
import pdb
if package or "." in name: from . import _ITKFFTPython else: import _ITKFFTPython
pdb.set_trace() ...
- Activate the Python virtual environment (if relevant)
- In a Python session, get the process ID and then import ITK. The session will stop on the PDB breakpoint after loading ITKFFT but before modules depending on ITKFFT load their factories.
```py
> import os
> os.getpid()
99999
> import itk
> itk.force_load()
...
(Pdb)
itkSingleton.cpp
for examining SingletonIndex
values at runtime, then let the program continue.
> gdb -p 99999
(gdb) break source/ITK/Modules/Core/Common/src/itkSingleton.cxx:108
(gdb) c
(Pdb) c
(gdb) display m_Instance
(gdb) p m_GlobalObjects
(gdb) bt
(gdb) c
After spending a decent amount of time debugging I have a few more thoughts on what is happening under the hood and a potential resolution. At runtime when we import ITK Python modules (either lazily or with itk.force_load()
), the following steps occur:
ITKCommon
are loaded. ITKCommon
does not define any factories, so its object factory structures (a ObjectFactoryBasePrivate
with m_RegisteredFactories
and m_InitializedFactories
) are not initialized.ITKConvolution
are loaded. For each module,
SingletonIndex->GlobalObjects
map has not yet been synchronized, so a new ObjectFactoryBasePrivate
object is added to the module's map under the "ObjectFactoryBase" entry. pdb
and gdb
breakpoints can be used together to watch as m_InitializedFactories
is allocated and then iteratively filled with each subsequent registration. This structure is cached as the private member m_PimplGlobals
within the itk.ObjectFactoryBase
"singleton" for the moduleSingletonIndex
reference for the given module to the SingletonIndex
in the ITKCommon module. However, ObjectFactoryBase->m_PimplGlobals
remains in the cache for each module.After all modules have been imported the following behaviors all occur:
itk.ObjectFactoryBase.GetRegisteredFactories()
queries the m_PimplGlobals
in ITKCommon (confirmed with pdb
backtrace) and gets 0 factories, because no factories were ever synchronized with ITKCommon's ObjectFactoryBasePrivate
instance.itk.FFTPadImageFilter.values()[-1].New()
succeeds because factories are successfully registered in their own modules.Caching behavior can be removed such that m_PimplGlobals
is always dynamically retrieved from ITKCommon's SingletonIndex
global map. However, since factories are never registered directly against ITKCommon's object factory any attempt at instantiating through the factory fails unless factories are manually registered each time.
The ideal resolution would be to synchronize SingletonIndex
references, then run default factory registrations. However, I'm not sure that this is possible as our synchronization method through the C API relies on both SingletonIndex
instances already existing, which can only occur after the module has been imported, which causes factories to be registered.
The best solution available to us here is likely to add a synchronization step through the C API where the list of internal/registered factories is compared across the ITKCommon and ITKObjectFactoryBasePrivate
instances and any new factories are added manually to ITKCommon.
I'm inclined to believe that the core of this issue lies in planned but possibly unimplemented functionality. a66337e claims to implement module synchronization:
The synchronization function will be called to synchronize these global variables across the different modules.
This is implemented as a requirement that each singleton map member is added with a synchronization function and a destructor function in a tuple, as in itkSingleton.cxxL102
. The member and destructors are referenced in the commit with ::get<0>()
and ::get<2>()
, but the synchronization function never appears with a ::get<1>()
anywhere. Setting breakpoints in the object factory SynchronizeObjectFactoryBase()
function shows that it is never called. It is easy to prove that factories are not synchronized across platforms by querying the ITKCommon factory list:
> import itk
> itk.force_load() # Register default IO and FFT factories
> itk.ObjectFactoryBase.GetRegisteredFactories() # Shows all factories in ITKCommon's factory list
() # No factories in list
I am testing a fix for object factory sync using the existing SynchronizeObjectFactoryBase
method. However, for proper synchronization across objects other than itk.ObjectFactoryBase
it may be necessary to figure out how exactly the synchronization methods included in the SingletonIndex->GlobalObjects
map were intended to be employed.
Similar to the global, cross-binary singleton for the global timestamp, use a pointer for the object factory that is synchronized across Python module binaries on initialization.
After implementation, attempt a revert of https://github.com/InsightSoftwareConsortium/ITK/pull/2864/commits/35df104b3eaf2fbd7c765d1e786db804bdf5fbbd