Open fmauch opened 4 months ago
I can't reproduce it with a clean build of colcon build --packages-up-to ros2_control_demo_example_14
Thanks for checking. Then I'll look around locally.
I can't reproduce it either. It works for me :)
OK, the problem does indeed seem to come from my change in #456. It looks like, when using the description topic, commands are sent to the hardware before it is activated.
[ros2_control_node-1] [INFO] [1709235162.311241783] [RRBotActuatorWithoutFeedback]: Activating ...please wait...
[ros2_control_node-1] [INFO] [1709235162.311684397] [RRBotSensorPositionFeedback]: Receiving data
[ros2_control_node-1] [INFO] [1709235162.312043305] [RRBotSensorPositionFeedback]: Receiving data
[ros2_control_node-1] [INFO] [1709235162.317357018] [RRBotActuatorWithoutFeedback]: Writing command: nan
[ros2_control_node-1] [INFO] [1709235162.317383404] [RRBotActuatorWithoutFeedback]: Sending data command: nan
[ros2_control_node-1] [INFO] [1709235162.317411609] [RRBotActuatorWithoutFeedback]: Joints successfully written!
vs. using the description parameter:
[ros2_control_node-1] [INFO] [1709406595.828694734] [RRBotActuatorWithoutFeedback]: Activating ...please wait...
[spawner-3] [INFO] [1709406596.132883688] [spawner_joint_state_broadcaster]: Waiting for '/controller_manager' services to be available
[ros2_control_node-1] [INFO] [1709406596.828821648] [RRBotActuatorWithoutFeedback]: 2.0 seconds left...
[ros2_control_node-1] [INFO] [1709406597.829122935] [RRBotActuatorWithoutFeedback]: 1.0 seconds left...
[ros2_control_node-1] [INFO] [1709406597.829185501] [RRBotActuatorWithoutFeedback]: Successfully activated!
[ros2_control_node-1] [INFO] [1709406597.829215398] [resource_manager]: Successful 'activate' of hardware 'RRBotModularJoint1'
[ros2_control_node-1] [INFO] [1709406597.829412333] [resource_manager]: 'configure' hardware 'RRBotModularPositionSensorJoint2'
[ros2_control_node-1] [INFO] [1709406597.829475210] [RRBotSensorPositionFeedback]: Configuration successful.
[ros2_control_node-1] [INFO] [1709406597.829491110] [resource_manager]: Successful 'configure' of hardware 'RRBotModularPositionSensorJoint2'
[ros2_control_node-1] [INFO] [1709406597.829526538] [resource_manager]: 'activate' hardware 'RRBotModularPositionSensorJoint2'
[ros2_control_node-1] [INFO] [1709406597.829540464] [RRBotSensorPositionFeedback]: Activating ...please wait...
[spawner-3] [INFO] [1709406598.160180152] [spawner_joint_state_broadcaster]: Waiting for '/controller_manager' services to be available
[ros2_control_node-1] [INFO] [1709406598.829744724] [RRBotSensorPositionFeedback]: 1.0 seconds left...
[ros2_control_node-1] [INFO] [1709406598.829808242] [RRBotSensorPositionFeedback]: Successfully activated!
[ros2_control_node-1] [INFO] [1709406598.829836718] [resource_manager]: Successful 'activate' of hardware 'RRBotModularPositionSensorJoint2'
[ros2_control_node-1] [INFO] [1709406598.829935389] [resource_manager]: 'configure' hardware 'RRBotModularPositionSensorJoint1'
[ros2_control_node-1] [INFO] [1709406598.829959936] [RRBotSensorPositionFeedback]: Configuration successful.
[ros2_control_node-1] [INFO] [1709406598.829973430] [resource_manager]: Successful 'configure' of hardware 'RRBotModularPositionSensorJoint1'
[ros2_control_node-1] [INFO] [1709406598.830003075] [resource_manager]: 'activate' hardware 'RRBotModularPositionSensorJoint1'
[ros2_control_node-1] [INFO] [1709406598.830015991] [RRBotSensorPositionFeedback]: Activating ...please wait...
[ros2_control_node-1] [INFO] [1709406599.830308578] [RRBotSensorPositionFeedback]: 1.0 seconds left...
[ros2_control_node-1] [INFO] [1709406599.830371781] [RRBotSensorPositionFeedback]: Successfully activated!
[ros2_control_node-1] [INFO] [1709406599.830399869] [resource_manager]: Successful 'activate' of hardware 'RRBotModularPositionSensorJoint1'
[ros2_control_node-1] [INFO] [1709406599.842852174] [controller_manager]: update rate is 100 Hz
[ros2_control_node-1] [WARN] [1709406599.842956793] [controller_manager]: Could not enable FIFO RT scheduling policy. Consider setting up your user to do FIFO RT scheduling. See [https://control.ros.org/master/doc/ros2_control/controller_manager/doc/userdoc.html] for details.
[ros2_control_node-1] [INFO] [1709406599.843048536] [RRBotSensorPositionFeedback]: Reading...
[ros2_control_node-1] [INFO] [1709406599.843066684] [RRBotSensorPositionFeedback]: Got measured velocity nan
[ros2_control_node-1] [INFO] [1709406599.843072846] [RRBotSensorPositionFeedback]: Got state 0.00000 for joint 'joint1'!
[ros2_control_node-1] [INFO] [1709406599.843078447] [RRBotSensorPositionFeedback]: Joints successfully read!
[ros2_control_node-1] [INFO] [1709406599.843083413] [RRBotSensorPositionFeedback]: Reading...
[ros2_control_node-1] [INFO] [1709406599.843086367] [RRBotSensorPositionFeedback]: Got measured velocity nan
[ros2_control_node-1] [INFO] [1709406599.843089180] [RRBotSensorPositionFeedback]: Got state 0.00000 for joint 'joint2'!
[ros2_control_node-1] [INFO] [1709406599.843092310] [RRBotSensorPositionFeedback]: Joints successfully read!
[ros2_control_node-1] [INFO] [1709406599.843112064] [RRBotActuatorWithoutFeedback]: Writing command: 0.000000
[ros2_control_node-1] [INFO] [1709406599.843128883] [RRBotActuatorWithoutFeedback]: Sending data command: 0
[ros2_control_node-1] [INFO] [1709406599.843169503] [RRBotActuatorWithoutFeedback]: Joints successfully written!
I didn't consider this at first, as I merely changed the way the description was fed and the node ended up with the correct description. However, using the description from the topic completely changes initialization. I'll update the title accordingly.
However, the question is: where's the fault? It seems weird that write() is being called before the hardware is actually active.
This demo example is an interesting special case, since its on_activate() artificially takes a couple of seconds, which is probably not the case for most hardware. Or is on_activate() required to be real-time safe and to finish within one control cycle? Now that I'm aware of the reason, I remember seeing something similar on the UR driver before we restructured things in on_configure and on_activate correctly.
So my main question would be: is this behavior to be expected because the requirements for on_activate() aren't fulfilled, or do we have to update something in the controller_manager / resource_manager?
Edit: I think I understand now why this happens. During initialization, the hw interface sets the buffer command to NaN.
Then, in on_activate(), there is an artificial sleep to "simulate" processes taking some time. After that, the command buffer is set to 0.0.
Since loading the urdf and activating the hardware don't block the update() loop, and the resource manager calls write() for all command interfaces independent of their state, write() runs with the NaN command while on_activate() is still sleeping.
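As a rough illustration of that ordering, here is a minimal, ROS-free sketch. `FakeActuator`, `begin_activate()`, `finish_activate()` and `replay_topic_startup()` are hypothetical names made up for this example, not the demo's classes. The slow on_activate() is modeled as two steps so that a write() can land in between without needing threads.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for the demo actuator; not the real ros2_control class.
struct FakeActuator
{
  // The command buffer starts as NaN, as described for initialization.
  double command = std::numeric_limits<double>::quiet_NaN();
  std::vector<double> sent;  // every value write() "sent" to the hardware

  // on_activate() is split in two so a write() can be interleaved deterministically.
  void begin_activate() {}                   // the artificial sleep would start here
  void finish_activate() { command = 0.0; }  // buffer is reset only after the sleep

  void write() { sent.push_back(command); }  // no state check and no NaN check
};

// Replays the order seen in the topic-based log: write() runs mid-activation.
inline std::vector<double> replay_topic_startup()
{
  FakeActuator hw;
  hw.begin_activate();
  hw.write();  // the update loop is already running, so NaN is sent
  hw.finish_activate();
  hw.write();  // from here on the command is 0.0
  return hw.sent;
}
```

The first recorded value is NaN and the second is 0.0, which matches the "Writing command: nan" line in the topic-based log and the "Writing command: 0.000000" line in the parameter-based one.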
So, one way of "fixing" this in the short term would be a check in the write() method for whether the command is NaN. The long-term fix would be to finally implement command interfaces that are not available for writing while the hardware is not active, as discussed in https://github.com/ros-controls/ros2_control/pull/884
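A minimal sketch of what such a short-term check could look like, written without ROS dependencies so it stands alone. `GuardedActuator` and `frames_sent` are hypothetical names for illustration, and the `bool` return value is an assumption of this sketch; the actual change in #456 may look different.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for the demo actuator; names are illustrative only.
struct GuardedActuator
{
  // Initialized to NaN, as described for the hw interface.
  double command = std::numeric_limits<double>::quiet_NaN();
  int frames_sent = 0;  // counts commands that actually reached the "hardware"

  // Returns true if a command was sent, false if the cycle was skipped.
  bool write()
  {
    if (std::isnan(command)) {
      return false;  // buffer not yet initialized by on_activate(): send nothing
    }
    ++frames_sent;  // placeholder for the real send
    return true;
  }
};
```

With this guard, cycles that run before activation has finished are skipped instead of forwarding NaN to the hardware. It only hides the symptom inside one hardware component, which is why the thread treats it as a stopgap.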
Thanks for analyzing this.
> This demo example is an interesting special case since the on_activate() artificially takes a couple of seconds which is usually probably not the case. Or is it a requirement that on_activate() is required to be real-time-safe and run in one control cycle? Now that I'm aware of the reason I remember I've seen something similar on the UR driver before we restructured things in on_configure and on_activate correctly.
> So my main question would be: Is this behavior to be expected and the requirements for the on_activate() aren't fulfilled or do we have to update something in the controller_manager / resource_manager?
I just checked the docs about that (here or here). It seems that this should be real-time safe:
on_configure() - reads parameters and configures controller.
on_activate() - called when controller is activated (started) (real-time)
on_deactivate() - called when controller is deactivated (stopped) (real-time)
Is there a difference between controllers and hardware components regarding this? We should check whether this is really the case in the CM (@saikishor maybe?), clarify it in the docs (here, for example), and fix the demo examples.
> Since urdf loading and activating the hardware isn't blocking the update() loop and the resource manager calls write() for all command interface independent of their state.
> So, one way of "fixing" this short term would be a check in the write() method whether the command is NaN and long term to finally implement command interfaces that are not available for writing when not being active as discussed in ros-controls/ros2_control#884
I'm fine with the fix, and I agree on the need for the "final" solution upstream.
Short term solution now included in #456
Hello!
I'd rather go with a proper fix: if this happened in the demos, it might happen on real hardware too. We would need to check why it is happening and find a proper solution upstream. What do you guys think?
Yes, I think a proper fix would be better. That's why I didn't mark this issue to be resolved by #456. I would just like this issue not to block migrating the demos towards the description topic.
After looking at the code, I believe this is happening because we also call the write method for inactive components:
https://github.com/ros-controls/ros2_control/blob/master/hardware_interface%2Fsrc%2Fsystem.cpp#L247 and https://github.com/ros-controls/ros2_control/blob/master/hardware_interface%2Fsrc%2Factuator.cpp#L251
The only fix I can think of is how @fmauch has handled it in the hardware itself. I'm wondering why this is not happening with the parameter. I would like to check this part.
If we need to fix it properly, I think we might need a new method, called dynamic_configure, that handles all the gpio interfaces, while the main hardware command interfaces stay in the write method. That way we can call the new dynamic_configure method and avoid calling the write method in the inactive state.
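To make the proposal easier to discuss, here is a hedged, ROS-free sketch of the split. Everything in it is hypothetical: dynamic_configure does not exist in ros2_control, and `SketchComponent`, `State` and `write_cycle()` are invented for this illustration only.

```cpp
#include <string>
#include <vector>

// Simplified lifecycle states for the sketch; the real components have more.
enum class State { Inactive, Active };

// Hypothetical component interface, not part of ros2_control.
struct SketchComponent
{
  State state = State::Inactive;
  std::vector<std::string> calls;  // records which methods were invoked

  // Would handle the gpio-style interfaces that must work while inactive.
  void dynamic_configure() { calls.push_back("dynamic_configure"); }

  // Would handle the main hardware command interfaces.
  void write() { calls.push_back("write"); }
};

// One cycle of a hypothetical manager loop implementing the proposed split.
inline void write_cycle(SketchComponent & component)
{
  component.dynamic_configure();  // allowed in both inactive and active states
  if (component.state == State::Active) {
    component.write();  // main commands are only written once active
  }
}
```

In this sketch an inactive component only ever sees dynamic_configure, so a long-running on_activate() could no longer be overtaken by a write() call. Whether gpio interfaces are the right things to move is exactly the open question of the proposal.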
Describe the bug: Without having investigated this further, it seems that demo14 is currently broken, at least on the master branch.
To Reproduce: Run demo 14 as described in the docs.
Update: To reproduce, use the version from #456.
Output:
Expected behavior: I would expect things not to crash ;-)
Environment (please complete the following information): ROS 2 rolling with a ros2_control workspace with current master everywhere:
Additional context: I didn't investigate this very much, but did not want it to get lost along the way.