Closed xibeisiber closed 1 year ago
Hi, I think this is not an issue/question of this demo-repository but of https://github.com/ros-controls/ros2_control directly.
My experience on the termination with ctrl+c is that it is not really reproducible what happens (independent of ros2_control) ;)
I tried
diffbot.launch.py
andrrbot_system_position_only.launch.py
. The printing information inon_activate()
is printed out in console, but when I "ros2 control set_controller_state forward_position_controller inactive" and "ros2 control set_controller_state diffbot_base_controller inactive", the printing information in "on_deactivate()" is not printed in console.
Oh, this is very strange. this should definitely not happen. This part of ros2_control is well tested. I would not expect bugs there, nevertheless it is possible. I will test this when testing new restructured examples in #232 to be sure. Can you check the example one from this PR and report if you get the same behavior?
Another question is: will
on_deactivate()
be automatically called when I terminate the startup node (press ctrl+c indiffbot.launch.py
)?
I cannot answer you in detail – I know that we had a discussion about this and about catching events. But for what I know, we are not doing a graceful “shutdown” of the framework. It is definitely something to think about, but until now, it wasn't really an issue in all the robots we have tested this.
This functionality has to be shaped well since it would include all standard and user-components to have pre-defined behavior.
I tried
diffbot.launch.py
andrrbot_system_position_only.launch.py
. The printing information inon_activate()
is printed out in console, but when I "ros2 control set_controller_state forward_position_controller inactive" and "ros2 control set_controller_state diffbot_base_controller inactive", the printing information in "on_deactivate()" is not printed in console.Oh, this is very strange. this should definitely not happen. This part of ros2_control is well tested. I would not expect bugs there, nevertheless it is possible. I will test this when testing new restructured examples in #232 to be sure. Can you check the example one from this PR and report if you get the same behavior?
Another question is: will
on_deactivate()
be automatically called when I terminate the startup node (press ctrl+c indiffbot.launch.py
)?I cannot answer you in detail – I know that we had a discussion about this and about catching events. But for what I know, we are not doing a graceful “shutdown” of the framework. It is definitely something to think about, but until now, it wasn't really an issue in all the robots we have tested this.
This functionality has to be shaped well since it would include all standard and user-components to have pre-defined behavior.
Yes. I can reproduce the behavior and find a new problem.. Environment: ros2 humble, ubuntu 22.04, x86_64. The ros2_control_demo is in commit 1007b1dfe44f4146e64b04666a1164adeaf8ffa1.
I comment the print information of read()&write() in diffbot_system.cpp. When launching diffbot.launch.py
, the info in on_activate()
is printed out.
When ros2 control set_controller_state diffbot_base_controller inactive
, the info in on_deactivate()
is not printed out. But this controller is successfully disactived by "ros2 control list_controllers".
When ros2 control set_controller_state diffbot_base_controller active
again, the node dies. error message is:
[INFO] [spawner-4]: process has finished cleanly [pid 17612]
[ros2_control_node-1] Stack trace (most recent call last) in thread 17583:
[ros2_control_node-1] #6 Object "", at 0xffffffffffffffff, in
[ros2_control_node-1] #5 Source "../sysdeps/unix/sysv/linux/x86_64/clone3.S", line 81, in __clone3 [0x7f62867269ff]
[ros2_control_node-1] #4 Source "./nptl/pthread_create.c", line 442, in start_thread [0x7f6286694b42]
[ros2_control_node-1] #3 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f6286adc2b2, in
[ros2_control_node-1] #2 Object "/opt/ros/humble/lib/controller_manager/ros2_control_node", at 0x555b4ec873db, in
[ros2_control_node-1] #1 Object "/opt/ros/humble/lib/libcontroller_manager.so", at 0x7f6286efe7a7, in controller_manager::ControllerManager::update(rclcpp::Time const&, rclcpp::Duration const&)
[ros2_control_node-1] #0 Object "/opt/ros/humble/lib/libdiff_drive_controller.so", at 0x7f627c3baf89, in diff_drive_controller::DiffDriveController::update(rclcpp::Time const&, rclcpp::Duration const&)
[ros2_control_node-1] Segmentation fault (Signal sent by the kernel [(nil)])
[ERROR] [ros2_control_node-1]: process has died [pid 17555, exit code -11, cmd '/opt/ros/humble/lib/controller_manager/ros2_control_node --ros-args --params-file /tmp/launch_params__3imcxhq --params-file /home/jia/Workspace/ros2_ws/install/ros2_control_demo_bringup/share/ros2_control_demo_bringup/config/diffbot_controllers.yaml'].
I can reproduce your behavior.
Honestly I'm not sure about the lifecycle of the hardware interface, should it be activated/deactivated automatically if doing so with the controllers? @destogl?
The segmentation fault happens only with the diffbot
example. I do not see a difference from rrbot
with diffbot
hardware from this repository, according to the stack trace (backward_ros
ftw) the diff_drive_controller
throws the error caused from a nullptr? I'll have a look if there is a difference in the lifecycle of the different controllers.
[ros2_control_node-1] Stack trace (most recent call last) in thread 16525:
[ros2_control_node-1] #6 Object "", at 0xffffffffffffffff, in
[ros2_control_node-1] #5 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f662cee09ff, in
[ros2_control_node-1] #4 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f662ce4eb42, in
[ros2_control_node-1] #3 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f662d0de2b2, in
[ros2_control_node-1] #2 Source "/workspaces/ros2_rolling_ws/src/ros2_control/controller_manager/src/ros2_control_node.cpp", line 81, in operator() [0x55f1261e40d0]
[ros2_control_node-1] 79: // execute update loop
[ros2_control_node-1] 80: cm->read(cm->now(), measured_period);
[ros2_control_node-1] > 81: cm->update(cm->now(), measured_period);
[ros2_control_node-1] 82: cm->write(cm->now(), measured_period);
[ros2_control_node-1] 83:
[ros2_control_node-1] 84: // wait until we hit the end of the period
[ros2_control_node-1] #1 Source "/workspaces/ros2_rolling_ws/src/ros2_control/controller_manager/src/controller_manager.cpp", line 1741, in update [0x7f662d4f3957]
[ros2_control_node-1] 1739: if (controller_go)
[ros2_control_node-1] 1740: {
[ros2_control_node-1] >1741: auto controller_ret = loaded_controller.c->update(
[ros2_control_node-1] 1742: time, (controller_update_rate != update_rate_ && controller_update_rate != 0)
[ros2_control_node-1] 1743: ? rclcpp::Duration::from_seconds(1.0 / controller_update_rate)
[ros2_control_node-1] 1744: : period);
[ros2_control_node-1] #0 | Source "/workspaces/ros2_rolling_ws/src/ros2_controllers/diff_drive_controller/src/diff_drive_controller.cpp", line 249, in set_value
[ros2_control_node-1] | 247: for (size_t index = 0; index < static_cast<size_t>(params_.wheels_per_side); ++index)
[ros2_control_node-1] | 248: {
[ros2_control_node-1] | > 249: registered_left_wheel_handles_[index].velocity.get().set_value(velocity_left);
[ros2_control_node-1] | 250: registered_right_wheel_handles_[index].velocity.get().set_value(velocity_right);
[ros2_control_node-1] | 251: }
[ros2_control_node-1] | Source "/workspaces/ros2_rolling_ws/src/ros2_control/hardware_interface/include/hardware_interface/loaned_command_interface.hpp", line 66, in set_value
[ros2_control_node-1] | 64: const std::string & get_prefix_name() const { return command_interface_.get_prefix_name(); }
[ros2_control_node-1] | 65:
[ros2_control_node-1] | > 66: void set_value(double val) { command_interface_.set_value(val); }
[ros2_control_node-1] | 67:
[ros2_control_node-1] | 68: double get_value() const { return command_interface_.get_value(); }
[ros2_control_node-1] Source "/workspaces/ros2_rolling_ws/src/ros2_control/hardware_interface/include/hardware_interface/handle.hpp", line 111, in update [0x7f662417a0f1]
[ros2_control_node-1] 109: void set_value(double value)
[ros2_control_node-1] 110: {
[ros2_control_node-1] > 111: THROW_ON_NULLPTR(this->value_ptr_);
[ros2_control_node-1] 112: *this->value_ptr_ = value;
[ros2_control_node-1] 113: }
[ros2_control_node-1] 114: };
[ros2_control_node-1] Segmentation fault (Signal sent by the kernel [(nil)])
I used this commit https://github.com/christophfroehlich/ros2_control_demos/tree/issue_239 and the current rolling branches of ros2_control and ros2_controllers
I tried to investigate the segmentation fault, but did not find the source. What I did: I extended the diff_drive_controller test, but this one does not throw any errors.
state = controller_->get_node()->deactivate();
ASSERT_EQ(state.id(), State::PRIMARY_STATE_INACTIVE);
ASSERT_EQ(
controller_->update(rclcpp::Time(0, 0, RCL_ROS_TIME), rclcpp::Duration::from_seconds(0.01)),
controller_interface::return_type::OK);
// activate again
state = controller_->get_node()->activate();
ASSERT_EQ(State::PRIMARY_STATE_ACTIVE, state.id());
publish(linear, angular);
// wait for msg is be published to the system
ASSERT_TRUE(controller_->wait_for_twist(executor));
ASSERT_EQ(
controller_->update(rclcpp::Time(0, 0, RCL_ROS_TIME), rclcpp::Duration::from_seconds(0.01)),
controller_interface::return_type::OK);
How can the set_value be called but the this-pointer be obsolete? The on_activate
/on_deactivate
methods of the diffbot_system do not change the pointers. Anyhow, the problem seems to be with the handles, and those are not managed by the hardware-interface of ros2_control_demo_example_2
.
@destogl or @bmagyar: Any other suggestions how to narrow that down?
@christophfroehlich
We have here potentially multiple issues.
nullptr
– it seems that an interface or its value get invalidated in on_activate()
method. (I would bet this is the issue) - in your test you are missing deactivation of controller.SystemInterface::on_deactivate
is actually called?How can the set_value be called but the this-pointer be obsolete? The
on_activate
/on_deactivate
methods of the diffbot_system do not change the pointers. Anyhow, the problem seems to be with the handles, and those are not managed by the hardware-interface ofros2_control_demo_example_2
.
This sounds like quite a deep problem. Maybe something in the Resource Manager when we change interface flags is not right. But we are not moving them or invalidating them.
Do we have a test for this in Resource Manager?
Hi, I've created that PR for ros2_controllers
that solves the issue for me, please check it out. :)
- DiffDriveController is not deactivated, but the hardware is deactivated. This causes
nullptr
– it seems that an interface or its value get invalidated inon_activate()
method. (I would bet this is the issue) - in your test you are missing deactivation of controller.
@destogl to clarify this: the segmentation fault happens when the controller is deactivated while the hw component being still active (there are two topics mixed in this issue). I'll have a look on the PR of @Noel215, I fear I still don't understand the handle registration sufficiently.
- Automatic deactivation of controller is not implemented when hardware gets deactivated. Presently, this will not even be detected since the interface is still there is hardware is inactive.
ok
- The lifecycle of hardware should work as expected. One should check if override of the methods is done correctly. Nevertheless, we might be missing something here… can you check the tests and see if they should be extended to detect if an implementation of
SystemInterface::on_deactivate
is actually called?
We don't have any tests for the hardware components in this repo yet, see #46! do you have a hint to examples of tests we should add here?
I'll close this issue now, summary from the initial post of @xibeisiber :
lets discuss the other points in the linked issues
Hi, I tried
diffbot.launch.py
andrrbot_system_position_only.launch.py
. The printing information inon_activate()
is printed out in console, but when I "ros2 control set_controller_state forward_position_controller inactive" and "ros2 control set_controller_state diffbot_base_controller inactive", the printing information in "on_deactivate()" is not printed in console.Another question is: will
on_deactivate()
be automatically called when I terminate the startup node (press ctrl+c indiffbot.launch.py
)?Thanks! Jia