ros2 / demos

Apache License 2.0
493 stars 329 forks source link

fix(action_tutorials_cpp): Do not drop future returned by async_send_… #649

Closed jmachowinski closed 5 months ago

jmachowinski commented 1 year ago

…goal

This is part of https://github.com/ros2/rclcpp/pull/2296

fujitatomoya commented 12 months ago

CI:

clalancette commented 11 months ago

So we talked about this in the Waffle meeting, and I also played around a bit with this.

When we discussed it, it was brought up that holding onto the future this way is kind of odd. The future is usually used to wait for something, not keep something alive. Indeed, the important thing here is that we are keeping the handle alive, not the share_future. So we talked about using some pseudo-code that looked like this:

void send_goal()
{
  ...

  std::shared_future<GoalHandleFibonacci::SharedPtr> future = this->client_ptr_->async_send_goal(goal_msg, send_goal_options);
  future.wait();
  goal_handle_ = future.get();
}

And then have a class member variable:

GoalHandleFibonacci::SharedPtr goal_handle_;

However, this has two problems:

  1. You can't block like that in send_goal, since it is a timer callback. The reason is that this is a single-threaded executor by default, so the executor can never run to complete your future. This is possible to workaround by using a state machine and checking valid instead, though.
  2. If you change this to a state machine, then things work. However, it is kind of odd that we can get the GoalHandle::SharedPtr in 2 different ways; either by waiting for the future, or by waiting for the goal_response_callback. Theoretically, it looks like you could hold onto the goal handle via either mechanism. The question is whether the rclcpp action client machinery keeps the GoalHandle::SharedPtr valid long enough to make it to the goal_response_callback. If it does, then the easier solution is to just save the handle during that callback. If it doesn't, then we should consider what is happening in there and what we want the API to look like.
jmachowinski commented 11 months ago

So we talked about this in the Waffle meeting, and I also played around a bit with this.

Uhm, I must say, that this feels odd on my end. I am the person pushing to fix this, and now I get the information that there was a meeting behind closed doors, were you discussed something and now expect me, to implement this ? Just saying, an invitation to the meeting would have been nice...

When we discussed it, it was brought up that holding onto the future this way is kind of odd. The future is usually used to wait for something, not keep something alive. Indeed, the important thing here is that we are keeping the handle alive, not the share_future.

I agree, the API is odd. In general, I would like to point out that the whole async API is a pain to use, and one should think about a redesign.

2. If you change this to a state machine, then things work.  However, it is kind of odd that we can get the `GoalHandle::SharedPtr` in 2 different ways; either by waiting for the future, or by waiting for the `goal_response_callback`.  Theoretically, it looks like you could hold onto the goal handle via either mechanism.  The question is whether the rclcpp action client machinery keeps the `GoalHandle::SharedPtr` valid long enough to make it to the `goal_response_callback`.  If it does, then the easier solution is to just save the handle during that callback.  If it doesn't, then we should consider what is happening in there and what we want the API to look like.

Without holding the future, you won't make it to the response callback.

What about inverting the whole thing ?

Return the GoalHandle::SharedPtr directly and make the reqest future a member of the goal handle.

clalancette commented 11 months ago

Uhm, I must say, that this feels odd on my end. I am the person pushing to fix this, and now I get the information that there was a meeting behind closed doors, were you discussed something and now expect me, to implement this ?

I don't think anyone asked you to implement this. I was merely providing information.

fujitatomoya commented 11 months ago

@jmachowinski @clalancette

You can't block like that in send_goal, since it is a timer callback.

do we really need the timer here as demonstration? it shuts down when it gets result anyway.

If it does, then the easier solution is to just save the handle during that callback. Without holding the future, you won't make it to the response callback.

i think we can do that. it works with current demo fibonacci_action_client. (meaning w/o holding future, we can still see goal handle alive in response callback.) @jmachowinski am i mistaken here?

changing the demo to block through the future is also fine. but i think originally this demo wants to show how to use callbacks for action.

jmachowinski commented 11 months ago

Blocking would open a whole new can of worms. You need to put the action into a second callback group or the future will never return an result. Therefore you would need to explain / introduce the concept of callback groups first.

This would also open the gates of multithreading hell for anyone not using a single threaded executor. Keep in mind, this tutorial is the primary source of how the use the API and will be copy and pasted around.

jmachowinski commented 11 months ago

@fujitatomoya

i think we can do that. it works with current demo fibonacci_action_client. (meaning w/o holding future, we can still see goal handle alive in response callback.) @jmachowinski am i mistaken here?

It will stop to work the second we patch the bug, that we kept an internal copy of the goal handle until the result CB was called.

fujitatomoya commented 11 months ago

@jmachowinski thanks!

It will stop to work the second we patch the bug,

yeah, originally this was brought up from that. just checking if i am not missing anything here.

that we kept an internal copy of the goal handle until the result CB was called.

how about this? (this seems to be good example, save the goal handle when it get accepted, instead of saving future that is never waited on?)

diff --git a/action_tutorials/action_tutorials_cpp/src/fibonacci_action_client.cpp b/action_tutorials/action_tutorials_cpp/src/fibonacci_action_client.cpp
index c4cf996..ad630d3 100644
--- a/action_tutorials/action_tutorials_cpp/src/fibonacci_action_client.cpp
+++ b/action_tutorials/action_tutorials_cpp/src/fibonacci_action_client.cpp
@@ -79,6 +79,7 @@ public:
 private:
   rclcpp_action::Client<Fibonacci>::SharedPtr client_ptr_;
   rclcpp::TimerBase::SharedPtr timer_;
+  GoalHandleFibonacci::SharedPtr goal_handle_;

   ACTION_TUTORIALS_CPP_LOCAL
   void goal_response_callback(GoalHandleFibonacci::SharedPtr goal_handle)
@@ -87,6 +88,7 @@ private:
       RCLCPP_ERROR(this->get_logger(), "Goal was rejected by server");
       rclcpp::shutdown();
     } else {
+      goal_handle_ = goal_handle;
       RCLCPP_INFO(this->get_logger(), "Goal accepted by server, waiting for result");
     }
   }

this tutorial is the primary source of how the use the API and will be copy and pasted around.

totally agree.

jmachowinski commented 11 months ago

I don't know, the more I think about this API the less I like it. The problem with this future is, that you also can not cancel it, while it has not reported back yet.

We needed to workaround this problem in our code base by adding a ref counting self deleting object, that introduces a potential memory leak, in case we never get the callback.

As @clalancette pointed out, using a future here is kind of odd in the first place...

Is it an option to just deprecate the current API and make it more handle centric, without the shared pointers and the futures ?

fujitatomoya commented 5 months ago

@jmachowinski so as we talked on the call, i did revisit this and go through all comments.

IMO, we should take this fix (or https://github.com/ros2/demos/pull/649#issuecomment-1743304558 patch) although there are some areas we need to address.

https://github.com/ros2/rclcpp/blob/b50f9ab41840dfec121b7d12118da6e6593d21ae/rclcpp_action/include/rclcpp_action/client.hpp#L412-L413

this can fix the inconsistency with above, and user would copy&paste the sample, and led to unexpected behavior.

note,

The future is usually used to wait for something, not keep something alive. using a future here is kind of odd in the first place...

agree.

The problem with this future is, that you also can not cancel it, while it has not reported back yet.

agree.

CC: @clalancette @mjcarroll @alsora

jmachowinski commented 5 months ago

@fujitatomoya I updated https://github.com/ros2/rclcpp/pull/2281 to contain the nodiscard we talked about

jmachowinski commented 5 months ago

closed, as we will leave the API as it is for now.