Open quning18 opened 7 years ago
There was a talk about fault tolerance at ROSCon 2016 that relates to your point 1; it uses DMTCP. Check the video: https://vimeo.com/187699448 DMTCP: http://dmtcp.sourceforge.net I have not really dived into it yet.
I also saw that the ROS 2 beta 2 binary release came out a few hours ago: https://github.com/ros2/ros2/releases/tag/release-beta2 I wonder whether someone has already ported it to an ARM platform like the Raspberry Pi 3?
@quning78 The ROS master is a core entity in a ROS system. It is designed as a singleton, and a lot of different APIs are based on that. I don't think it is feasible to replace it with a decentralized solution without refactoring a lot of code. And even then, the existing API would need to be maintained for backward compatibility. That is one of the reasons we decided to make such drastic changes in ROS 2 rather than change ROS 1, where they would be extremely likely to break behavior and introduce severe regressions. See the last paragraph of one of the ROS 2 design docs for more information about the rationale: http://design.ros2.org/articles/why_ros2.html
checkpointing the master so it can be recovered after a crash
An improvement like this, which makes the current master more resilient when problems occur, is much easier to integrate. I don't see any potential for problems (besides the usual regressions). So something like this would absolutely be in scope for ROS 1.
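To make the checkpointing idea concrete, here is a minimal sketch in Python 3. It assumes the master's registration state can be represented as a plain dictionary and saved as JSON; the function names (`save_checkpoint`, `load_checkpoint`, `register_publisher`) and the state layout are hypothetical and are not part of rosmaster. The file is written to a temporary path and then renamed, so a crash in the middle of a write should not leave a corrupt checkpoint behind.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Atomically write the (hypothetical) registry state to `path` as JSON."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the new file in atomically.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the last saved state, or an empty registry if no checkpoint exists."""
    if not os.path.exists(path):
        return {"publishers": {}, "subscribers": {}, "services": {}}
    with open(path) as f:
        return json.load(f)


def register_publisher(state, topic, node_uri, path):
    """Hypothetical registration hook: update the state, then checkpoint it."""
    uris = state["publishers"].setdefault(topic, [])
    if node_uri not in uris:
        uris.append(node_uri)
    save_checkpoint(state, path)
```

A restarted master would call `load_checkpoint` on startup to restore the registrations it had before the crash. Whether the restored entries are still valid (nodes may have died in the meantime) is a separate question this sketch does not address.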
I wonder whether someone has already ported it to an ARM platform like the Raspberry Pi 3?
@YuehChuan ROS 2 can be built for armv8. Beta 2 also has Debian packages for armv8. Please keep this thread on topic and ask unrelated questions in the appropriate places.
Hi all,
As we know, RTPS has been the standard middleware for ROS 2, and it actually helps to reduce the responsibility of the ROS master.
The reason we like this idea is that it removes the single point of failure. In current ROS, the master is the key to the whole system, and if the master crashes, nothing can be done to recover from it.
We have been thinking about different options:
After studying ROS 2, we found that RTPS actually helps a lot in this direction. Again, we know it is a ROS 2 feature; however, given the popularity of current ROS, maybe there is something we can do to remove most of the dependency of nodes on the master, so that the reliability concern is dramatically reduced.
Any input is welcome.