Open tylerjw opened 2 years ago
As someone who has written many of these kinds of loops: whether to execute read(..)
before write(..)
(with some update(..)
in the middle) or the other way around is largely a choice.
Your second diagram orders operations such that the exact point in time where write(..)
is called will depend on how long update(..)
takes. That's not ideal. Systems which do not have lower-level (sub)systems with fixed sampling periods (which your diagram sort-of suggests you have) could see large variance in when they get their control inputs.
I've seen write(..); update(..); read(..); sleep(..)
and write(..); read(..); update(..); sleep(..)
to mitigate this.
There is no single correct order.
I'm happy to re-order them in my PR. You will note though that the current design also has the issue you point out along with the other issues I pointed out. In the current design, write is always called after sleep/read/update.
You will note though that the current design also has the issue you point out along with the other issues I pointed it.
I wasn't claiming it doesn't.
I merely wanted to make sure we agree here that while it seems write(..)
should follow read(..)
and update(..)
, that's not necessarily the case, and depends on the structure and configuration of the rest of the (underlying) system.
I pushed a change to make it write, update, read, sleep
. I hadn't thought of what you pointed out but you are right that the hardware I'm interacting with is sensitive to write being called at a steady rate.
Now we wait until someone comes along with a good argument for why it shouldn't be the order you just made the default ..
(such as when using the external motion interface of an industrial robot, where you have to make sure you're sending new references before the expiration of the current period and you're using the external interface to pace your own control loop. If you wait with write(..)
till the "next period", you're too late)
The only reason I can think of is if there are some assumptions about the order of read
and update
. There might be a controller that depends on the fact that the values it is reading from the robot are the most recent ones (and not the ones from the last iteration of the loop).
My primary concern here was to fix what I saw as broken logic in how the loop was written with respect to the time and duration values being passed to the controllers. I'm happy to also fix other issues here while I'm at it though.
When I asked for feedback on my issue I posted here one thing I heard internally here at PickNik my generic design for ControlLoop feels too complex when I could just fix the logic in the main function. My counter to that is fixing the logic in the main function alone makes it much much harder for me to test the change I want to make here in any sort of robust or minimal way.
Another thing to point out is I'm using templates for the function arguments instead of std::function. That is a deliberate choice with the only pitfall being potentially harder-to-read compile errors if this function was called with arguments that couldn't be made to compile.
The upside is that having the function arguments be templates is much lighter weight than std::function. Ideally, I'd have C++20 concepts and I could constrain these template arguments in a way that would produce nice error messages.
My primary concern here was to fix what I saw as broken logic in how the loop was written with respect to the time and duration values being passed to the controllers.
I'm not a maintainer nor developer here, but then my suggestion would be to make this PR about those fixes.
Don't change anything else.
When I asked for feedback on my issue I posted here one thing I heard internally here at PickNik my generic design for ControlLoop feels too complex when I could just fix the logic in the main function. My counter to that is fixing the logic in the main function alone makes it much much harder for me to test the change I want to make here in any sort of robust or minimal way.
ask 100 developers their opinion on some implementation and you'll get a 1000.
Whether something is complex depends on level of experience and skill.
Personally I don't believe it's too complex, but it's undeniable it's more complex than what used to be there.
However, limitations of the implementation language and the additional requirements you've stated (dependency injection design to improve testability) will require some 'tricks', which unfortunately would lead to more perceived complexity.
I'm not sure but I think it is clear that there are problems in the cm_thread codes in ros2_control_node.cpp.
while (rclcpp::ok())
{
// wait until we hit the end of the period
end_period += period;
std::this_thread::sleep_for(std::chrono::nanoseconds((end_period - cm->now()).nanoseconds()));
// execute update loop
auto period = current_time - previous_time; // (Problem here : current_time and previous_time are same => period is always zero !!!)
cm->read(current_time, period);
current_time = cm->now();
cm->update(current_time, period);
previous_time = current_time;
cm->write(current_time, period);
}
The problems are : current_time and previous_time are same so, period is always zero....;;
I'm not sure whether it is a proper solution or not, but I just modified it as follows and it looks ok maybe.
while (rclcpp::ok())
{
// wait until we hit the end of the period
end_period += period; // cur_time + interval
std::this_thread::sleep_for(
std::chrono::nanoseconds((end_period - cm->now()).nanoseconds()));
// execute update loop
current_time = cm->now(); // read cur. time (Modify here : current_time must be fixed to avoid zero period value !!!!!)
auto period = current_time - previous_time;
cm->read(current_time, period);
//current_time = cm->now(); // read cur. time (comment !!!)
cm->update(current_time, period);
//previous_time = current_time; (comment !!!)
cm->write(current_time, period);
previous_time = current_time; (modifiy here but it is not important!!!!)
}
Result : Example 2: "Robots with multiple interfaces"
Since the update rate is 100 Hz, (which I added for debug) period_seconds : about 0.01 [s] seems to be ok.
[ros2_control_node-1] [INFO]: period_seconds: 0.01001, hw_slowdown: 50.00000! [ros2_control_node-1] [INFO]: Got the commands pos: 0.00000, vel: 5.00000, acc: 0.00000 for joint 0, control_lvl:2 [ros2_control_node-1] [INFO]: Got the commands pos: 0.00000, vel: 5.00000, acc: 0.00000 for joint 1, control_lvl:2 [ros2_control_node-1] [INFO]: Got pos: 23.46400, vel: 5.00000, acc: 0.00000 for joint 0! [ros2_control_node-1] [INFO]: period_seconds: 0.00999, hw_slowdown: 50.00000! [ros2_control_node-1] [INFO]: Got pos: 23.46400, vel: 5.00000, acc: 0.00000 for joint 1! [ros2_control_node-1] [INFO]: period_seconds: 0.00999, hw_slowdown: 50.00000! [ros2_control_node-1] [INFO]: Got the commands pos: 0.00000, vel: 5.00000, acc: 0.00000 for joint 0, control_lvl:2 [ros2_control_node-1] [INFO]: Got the commands pos: 0.00000, vel: 5.00000, acc: 0.00000 for joint 1, control_lvl:2 [ros2_control_node-1] [INFO]: Got pos: 23.46500, vel: 5.00000, acc: 0.00000 for joint 0!
Studying ROS 2 is not easy to me. Anyway, the while loop of cm_thead in ros2_control_node.cpp looks like it needs to be fixed. I hope it helps to improve the Ros 2 control (Normal & stable verion corresponding to the Humble is helpful for ROS beginner :) )
@kshfire can you try my pr here to see if it fixes those issues you observed? https://github.com/ros-controls/ros2_control/pull/728
Here is my current understanding of how the logic in ros2_control_node works:
There are several issues I see with this current design:
current_time
andprevious_time
values are always the same.time
value sent to read/update/write is the time between the call toread
but beforeupdate
.period
variable is shadowed inside the loop which makes it hard to understand at first where the value is coming from.While the current design does not cause drift, the logic around these time and duration values is really confusing and hard to understand.
Proposed changes
I propose to refactor the logic of this loop into a higher-order function (a function that takes functions as arguments) that can be tested. I will work out a diagram of how this new loop works and include unit tests of it.
Here is my proposed change to the loop structure:
Here is some pseudocode of how it will work: