Closed disRecord closed 7 years ago
We already had similar problems with other services, e.g. for the rosservice
service (see https://github.com/orocos/rtt_ros_integration/pull/60). It would be certainly better to use weak pointers for the parent to break the loop, but normally destruction works fine because the TaskContext destructor actively clears the tasks services in TaskContext.cpp:135.
However, there are many reasons why other shared pointers to a service might still exist and the service is not destroyed, e.g. when there is a CORBA TaskContextProxy.
It seems there may be two causes:
Method clear()
in Service.cpp:219 is not recursive. removeService()
is called for each subservice, but it simply reset shared pointers and does not call clear() on corresponding Services. So if any of subservices has its own subservice we have
shared_ptr` loop.
But it does not explain why my example is working properly.
Another possible cause is shared_from_raw()
: Service.hpp:109. It seems there are many pitfalls related with this method: http://stackoverflow.com/questions/22184772/why-is-boostenable-shared-from-raw-so-undocumented
I tried to replace shared_ptr parent
by weak_ptr parent
in Service.hpp
and it did not solve problem. But after I used Service * parent
and removed almost all Service::share_from_this()
calls from 'Service.cpp' the example above worked properly. So I think shared_from_raw()
is cause.
Excuse me, It seems I missed important information in original post: this issue is related with toolchain-2.9
version of OROCOS. Code in ~TaskContext is entirely different here.
Thanks for the detailed analysis!
- Method
clear()
in Service.cpp:219 is not recursive.removeService()
is called for each subservice, but it simply reset shared pointers and does not callclear() on corresponding Services. So if any of subservices has its own subservice we have
shared_ptr` loop.
You are right that this might be an issue in some scenarios, but making clear()
recursive might break existing use cases and is not a good option.
- Another possible cause is
shared_from_raw()
: Service.hpp:109. It seems there are many pitfalls related with this method: http://stackoverflow.com/questions/22184772/why-is-boostenable-shared-from-raw-so-undocumented
The article is right and relying on shared_from_raw()
is probably not the best design, but in this case I do not think it is an issue as the way services are used in Orocos the instances are always stored in a shared_ptr
immediately after construction.
I tried to replace
shared_ptr parent
byweak_ptr parent
inService.hpp
and it did not solve problem. But after I usedService * parent
and removed almost allService::share_from_this()
calls from 'Service.cpp' the example above worked properly. So I thinkshared_from_raw()
is cause.
Introducing weak_ptr
for the parent is a good option and along the lines of https://github.com/orocos-toolchain/rtt/pull/193. There are also other use cases, like master/slave activity relationships (https://github.com/orocos-toolchain/rtt/issues/178). On the other hand quite some code has to be checked against the case where getOwner()
or getParent()
might return null then also considering thread-safety. But that's still better than keep them around or have dangling raw pointers.
Excuse me, It seems I missed important information in original post: this issue is related with toolchain-2.9 version of OROCOS. Code in ~TaskContext is entirely different here.
You are right and the removal of the clear() call from ~TaskContext()
in https://github.com/orocos-toolchain/rtt/commit/42a62e8d7b81cebcf9c34e012b339a302c200033#diff-a739de4d537602c73893d46c20da0a31L139 might be the issue. I will prepare a PR to re-add it and run some tests.
Introducing weak_ptr for the parent is a good option and along the lines of #193.
I tried it. For our project it worked fine. Service destructors are called in all cases: for global service subservice and for nested services of components. So from my viewpoint it solves the case.
Also I bumped in some issues with OCL (incorrect ROS messages decomposition in reporting components, inability to display service interface in rttlua console). If you want I can provide patches with fixes or walkarounds for them.
I merged #204 into the toolchain-2.9 branch. Feel free to open new issues or PRs for the other things you mentioned. Thanks!
This behavior can be reproduced by simple test
and then load
myservice
in any component and shutdown deployer.It seems to be caused by boost::shared_ptr loop in RTT::Service definition (http://github.com/orocos-toolchain/rtt/blob/master/rtt/Service.hpp#L576):