Closed y-okumura-isp closed 3 years ago
Also, it's been many years but I'll ask. @mirzashah do you recall any of this?
Is it better to update PR?
Please do. Also, check the linter issues reported in the Rpr job.
Hmm, it looks more complicated than I initially thought.
I tried to use a unique pointer and found the following.
Pointers in FactoryMap are not deleted directly; they are inserted into the graveyard and finally purged.
https://github.com/ros/class_loader/blob/melodic-devel/src/class_loader_core.cpp#L374
Additionally, according to the comment class_loader_core.cpp:399, the global factory map map might have multiple pointers to the same address. https://github.com/ros/class_loader/blob/melodic-devel/src/class_loader_core.cpp#L399
I wonder if it is the right way to delete pointers in the destructor. It's going to take a little longer.
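To make the graveyard idea concrete, here is a minimal sketch, not the class_loader code. `Factory`, `retire_all`, and `purge` are hypothetical names used only for illustration. It assumes raw pointers are moved into a graveyard vector and deleted later, and that the same address may appear more than once, so each distinct pointer must be deleted exactly once.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a factory object; not the class_loader type.
struct Factory
{
  static int live;  // number of Factory objects currently alive
  Factory() {++live;}
  ~Factory() {--live;}
};
int Factory::live = 0;

using FactoryMap = std::map<std::string, Factory *>;
using Graveyard = std::vector<Factory *>;

// Move every entry out of the map into the graveyard instead of deleting it.
void retire_all(FactoryMap & factories, Graveyard & graveyard)
{
  for (auto & entry : factories) {
    graveyard.push_back(entry.second);
  }
  factories.clear();
}

// Delete each distinct pointer exactly once, even if it was retired twice.
void purge(Graveyard & graveyard)
{
  std::sort(graveyard.begin(), graveyard.end(), std::less<Factory *>());
  graveyard.erase(std::unique(graveyard.begin(), graveyard.end()), graveyard.end());
  for (Factory * factory : graveyard) {
    delete factory;
  }
  graveyard.clear();
}

// Returns the number of live factories after a retire/purge cycle.
int demo_purge()
{
  FactoryMap factories;
  Graveyard graveyard;
  Factory * shared = new Factory();
  factories["a"] = shared;
  factories["b"] = shared;  // same address registered twice
  factories["c"] = new Factory();
  retire_all(factories, graveyard);
  assert(graveyard.size() == 3);
  purge(graveyard);
  return Factory::live;
}
```

With duplicate addresses in play, a plain `std::unique_ptr` per map entry would delete the same object twice, which is one reason the ownership question is not trivial here.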
I am cancelling this PR. As we discussed above, we have two problems:
(1) Leaks in BaseToFactoryMapMap. I found that Graveyard also has the same leak.
(2) `~AbstractMetaObjectBase` should be virtual (as `AbstractMetaObjectBase` has a vtable).
We may fix (1) with a custom deleter or smart pointers. But I couldn't address (2) because:
- `~AbstractMetaObjectBase`, MetaObject should have a destructor.
- `plugin.so`.

Additionally, I initially referred to the leaks of `composition/test_linktime_composition`, but that may be another problem.
This test is about libraries linked by the linker (rather than loaded with dlopen), as in https://github.com/ros2/demos/blob/master/composition/src/linktime_composition.cpp#L40.
But class_loader looks to be designed for use with dlopen. When we run `test_linktime_composition`, `class_loader` prints the alert below. This is why MetaObject remains in BaseToFactoryMapMap.
# I set console_bridge log level = DEBUG
$ launch_test test_linktime_composition__rmw_cyclonedds_cpp_Debug.py
class_loader.impl: ALERT!!! A metaobject (i.e. factory) exists for desired class, but has no owner. This implies that the library containing the class was dlopen()ed by means other than through the class_loader interface. This can happen if you build plugin libraries that contain more than just plugins (i.e. normal code your app links against) -- that intrinsically will trigger a dlopen() prior to main(). You should isolate your plugins into their own library, otherwise it will not be possible to shutdown the library!
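The situation this alert describes can be illustrated with a minimal sketch. It assumes a registry filled by static registrar objects, which is only an approximation of how plugin registration works. `registry`, `active_loader`, `Registrar`, and `has_owner` are hypothetical names, not class_loader APIs. When the registering translation unit is linked directly into the executable, registration runs before `main()`, when no loader is active, so the entry ends up without an owner.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry entry: owner is null when no loader was active
// at registration time.
struct Entry
{
  const void * owner;
};

// Function-local static avoids static-initialization-order problems.
std::map<std::string, Entry> & registry()
{
  static std::map<std::string, Entry> instance;
  return instance;
}

// Hypothetical "currently active loader"; null outside an explicit load call.
const void *& active_loader()
{
  static const void * loader = nullptr;
  return loader;
}

// Registers a class name, recording whichever loader is active right now.
struct Registrar
{
  explicit Registrar(const std::string & name)
  {
    registry()[name] = Entry{active_loader()};
  }
};

// Linked directly into the executable, so this runs during static
// initialization, before main(), with no active loader.
static Registrar linked_plugin_registrar("LinkedPlugin");

bool has_owner(const std::string & name)
{
  auto it = registry().find(name);
  return it != registry().end() && it->second.owner != nullptr;
}

// Returns whether the link-time registered entry has an owner.
bool demo_linked_plugin_has_owner()
{
  assert(registry().count("LinkedPlugin") == 1);
  return has_owner("LinkedPlugin");
}
```

An entry registered while a loader is active (as would happen during an explicit load call) gets an owner and can be cleaned up with it. The link-time entry cannot, which matches the remaining MetaObject described above.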
Fixes #131.
Sorry that my message is very long. As we don't know the historical reasons well, I apologize if there are mistakes.
where leak happens
It looks like `new_factory` is not deleted when `factoryMap` is deleted.

Additionally, `factoryMap` is ultimately a typedef of `std::map` and is stored in `static BaseToFactoryMapMap instance`, which is also a `std::map`. It looks like `new_factory` is not deleted when `instance` goes out of scope.

We tried to use a custom deleter (please see our draft PR), and then we got a `deleting object of polymorphic class type ‘class_loader::impl::AbstractMetaObjectBase’ which has non-virtual destructor might cause undefined behavior` warning. This may be because the AbstractMetaObjectBase destructor is not virtual. We found "virtual" was kept in PR53, but we could not fully understand why it should not be virtual.

About AbstractMetaObjectBase non-virtual destructor
`new_factory` above is a variable of `impl::AbstractMetaObject`, which has a vtable but a non-virtual destructor. This looks to invoke undefined behavior according to https://en.cppreference.com/w/cpp/language/destructor "Virtual destructors". And this is what the compiler warning says.

Here is the AbstractMetaObjectBase code:
In the destructor comment, we can see `THIS MUST NOT BE VIRTUAL AND OVERIDDEN BY TEMPLATE SUBCLASSES`.

We found this comment was added at https://github.com/ros/class_loader/commit/da86427f75b9362db7b063b482fd0c6c5756496a. The commit message is `Fixed bug with redundant destructor definition being pulled into plugin library for metaobjects instead of being contained with libclass_loader.so`. The problem looks to be that the MetaObject destructor is placed in both `libsome_plugin.so` and `libclass_loader.so`.

As there are three related classes and two runtime libraries (plugin and libclass_loader), let me explain the details.
We describe AbstractMetaObjectBase and its child classes below, in descendant-to-ancestor order:
(1) `MetaObject` (template)
(2) `AbstractMetaObject` (template)
(3) `AbstractMetaObjectBase`
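A simplified sketch of this three-level hierarchy follows. These are stand-ins in an assumed `sketch` namespace, not the actual class_loader declarations, and `Animal`, `Dog`, and `demo_legs` are hypothetical. It shows that the non-template base is polymorphic yet has no virtual destructor, which is the combination the compiler warning is about.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch
{
// (3) Non-template base: has a virtual method (hence a vtable),
// but its destructor is deliberately non-virtual.
class AbstractMetaObjectBase
{
public:
  explicit AbstractMetaObjectBase(const std::string & class_name)
  : class_name_(class_name) {}
  ~AbstractMetaObjectBase() {}
  const std::string & className() const {return class_name_;}

protected:
  virtual void dummyMethod() {}
  std::string class_name_;
};

// (2) Template over the plugin base class B.
template<class B>
class AbstractMetaObject : public AbstractMetaObjectBase
{
public:
  explicit AbstractMetaObject(const std::string & class_name)
  : AbstractMetaObjectBase(class_name) {}
  virtual B * create() const = 0;
};

// (1) Template over the concrete class C and its base B.
template<class C, class B>
class MetaObject : public AbstractMetaObject<B>
{
public:
  explicit MetaObject(const std::string & class_name)
  : AbstractMetaObject<B>(class_name) {}
  B * create() const override {return new C;}
};

// Hypothetical plugin types used only for the demonstration.
struct Animal
{
  virtual ~Animal() = default;
  virtual int legs() const = 0;
};
struct Dog : Animal
{
  int legs() const override {return 4;}
};
}  // namespace sketch

static_assert(
  std::is_polymorphic<sketch::AbstractMetaObjectBase>::value,
  "the base has a vtable");
static_assert(
  !std::has_virtual_destructor<sketch::AbstractMetaObjectBase>::value,
  "the base destructor is not virtual");

// Creates a plugin instance through a stack-allocated factory.
int demo_legs()
{
  sketch::MetaObject<sketch::Dog, sketch::Animal> factory("Dog");
  assert(factory.className() == "Dog");
  sketch::Animal * dog = factory.create();
  const int legs = dog->legs();
  delete dog;  // fine: Animal has a virtual destructor
  return legs;
}
```

The factory in `demo_legs` lives on the stack on purpose: deleting it through an `AbstractMetaObjectBase *` would be exactly the undefined behavior discussed in this thread.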
When we implement a plugin library, we use the `CLASS_LOADER_REGISTER_CLASS` macro in the plugin .cpp file. This macro calls `registerPlugin` and then invokes `new impl::MetaObject<Derived, Base>`. So (1) and (2) are instantiated in plugin .cpp files, and thus their object code is in `libsome_plugin.so`. On the other hand, as `meta_object.cpp` is compiled only into `libclass_loader.so`, the object code of (3) is included only in `libclass_loader.so`.

As template classes are instantiated in the plugin .cpp, the reasonable situation may be the following:
- `~MetaObject` and `~AbstractMetaObject` are only in the plugin library (`libsome_plugin.so`) and not in `libclass_loader.so`
- `~AbstractMetaObjectBase` is only in `libclass_loader.so` and not in `libsome_plugin.so`

`class_loader/src/meta_object.cpp` to compile `libsome_plugin.so`, this also looks always true.

compare the latest code and our code
We built the latest code as `build-asan` and our patched code as `build-asan2`.
Here are the destructor symbols defined in `libclass_loader.so`. `~AbstractMetaObject` and `~MetaObject` are not defined.

And here are the destructor symbols in `libplugin.so`. We chose `composition/libclient_component.so`. Only our version has `~AbstractMetaObjectBase` and `~MetaObject`, but they are only in `libclient_component.so` and not in `libclass_loader.so`. Note that we can find `~AbstractMetaObjectBase()`, but it is an undefined symbol, so it is not redundant either.

Test result
With this modification, we can resolve the ASAN error.
For example, in `composition/test_linktime_composition`, we got the following errors.

With our modification, there are no errors.
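Regarding problem (1), the smart-pointer option mentioned earlier in the thread can be sketched as follows. This is an assumption-laden illustration rather than what the PR does: `Base`, `Derived`, `fill`, and `demo_shared_ptr_cleanup` are hypothetical names. It relies on the fact that `std::make_shared<Derived>()` records how to destroy a `Derived`, so the full destructor chain runs even when the map stores `std::shared_ptr<Base>` and `~Base()` is not virtual.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

namespace leak_sketch
{
int destroyed_base = 0;
int destroyed_derived = 0;

// Polymorphic base with a non-virtual destructor, on purpose.
class Base
{
public:
  ~Base() {++destroyed_base;}
  virtual int id() const {return 0;}
};

class Derived final : public Base
{
public:
  ~Derived() {++destroyed_derived;}
  int id() const override {return 1;}
};

using FactoryMap = std::map<std::string, std::shared_ptr<Base>>;

void fill(FactoryMap & factories)
{
  // The control block created here remembers the Derived type, so
  // ~Derived() runs when the last shared_ptr<Base> goes away.
  factories["derived"] = std::make_shared<Derived>();
}

// Returns how many Derived destructors ran after the map went out of scope.
int demo_shared_ptr_cleanup()
{
  destroyed_base = 0;
  destroyed_derived = 0;
  {
    FactoryMap factories;
    fill(factories);
    assert(factories["derived"]->id() == 1);
  }  // map destroyed here; no explicit delete needed
  return destroyed_derived;
}
}  // namespace leak_sketch
```

This sidesteps the undefined behavior of `delete` through a base pointer, but it does not by itself answer where the destructor code ends up (plugin library or `libclass_loader.so`), which is the concern behind problem (2).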
One more topic: dummyMethod
We also found `AbstractMetaObjectBase::dummyMethod` is in both `libplugin.so` and `libclass_loader.so`. This may be because the implementation of `dummyMethod` is in a header file. Is it better to move it to a .cpp file?
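What moving the definition out of the header would look like, as a minimal sketch: `WidgetBase` and the file names in the comments are hypothetical, and everything is kept in one block only so that it compiles on its own. A virtual method defined inside the class body is implicitly inline and can be emitted in every translation unit that needs it, whereas an out-of-line definition is emitted only in the single source file that contains it.

```cpp
#include <cassert>

// --- would live in the header (hypothetical widget_base.hpp) ---
class WidgetBase
{
public:
  int callDummy() {return dummyMethod();}

protected:
  // Declared only: no inline body in the header.
  virtual int dummyMethod();
};

// --- would live in exactly one source file (hypothetical widget_base.cpp) ---
int WidgetBase::dummyMethod()
{
  return 42;
}

// Calls the out-of-line virtual method through the public wrapper.
int demo_dummy()
{
  WidgetBase widget;
  return widget.callDummy();
}
```

Whether this actually removes the duplicate symbol from the plugin library would still need to be confirmed by inspecting the built libraries, as was done above for the destructors.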