ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0
514 stars 410 forks source link

Release ownership of entities after spinning cancelled #2556

Open Barry-Xu-2018 opened 3 weeks ago

Barry-Xu-2018 commented 3 weeks ago

This issue was found in ros2/rmw_fastrtps#761.
In code of MoveIt, it used static executor. When the static executor is destructed (The executor is in a cancel state.), and it's time to free up the ownership of entity resources, Fast DDS resources have already been released.

After spinning is cancelled, the Executor should release ownership of entities at once.

alsora commented 2 weeks ago

can we deprecate / remove the static executor rather than keep trying to fix it? i don't see any reason why users should keep using it

wjwwood commented 2 weeks ago

Might be a misunderstanding, I thought they meant the StaticSingleThreadedExecutor too, but they mean they had a static-global executor, so it was destructed after the (also static-global) domain participant factory in fastrtps. At least that's my understanding.

So this affects all of the executors that use this part of the executor base class.

wjwwood commented 2 weeks ago

But yes, we can deprecate it in rolling at least, though that should be a separate pr.

Barry-Xu-2018 commented 2 weeks ago

Might be a misunderstanding, I thought they meant the StaticSingleThreadedExecutor too, but they mean they had a static-global executor, so it was destructed after the (also static-global) domain participant factory in fastrtps. At least that's my understanding.

Your understanding are right. Sorry, my description static executor caused a misunderstanding.

Barry-Xu-2018 commented 2 weeks ago

@wjwwood

How about moving release action to every exit point in different spin functions ? https://github.com/ros2/rclcpp/pull/2556/commits/0b4ab7eb6ae278030c4e4b0b98438933a272a2b0

fujitatomoya commented 2 weeks ago

Might be a misunderstanding, I thought they meant the StaticSingleThreadedExecutor too, but they mean they had a static-global executor, so it was destructed after the (also static-global) domain participant factory in fastrtps. At least that's my understanding.

i thought that too until looking into the reproducible program.

So this affects all of the executors that use this part of the executor base class.

true. actually they uses SingleThreadedExecutor, https://github.com/moveit/moveit_task_constructor/blob/bb047c894cd2a395a672abdd47ec76d709d1cb1e/core/src/introspection.cpp#L160

wjwwood commented 2 weeks ago

@Barry-Xu-2018 that was kind of what I was thinking, but we should take care to clear it before setting the spinning to false, so that a new spin doesn't enter before it can be reset.

Barry-Xu-2018 commented 2 weeks ago

The failure of Rprrclcppubuntu_noble_amd64 is related to test create_two_content_filters_with_same_topic_name_and_destroy. rmw_fastrtps in CI environment doesn't include https://github.com/ros2/rmw_fastrtps/pull/762.

Barry-Xu-2018 commented 2 weeks ago

@wjwwood @alsora I have updated code. Please review again.

Barry-Xu-2018 commented 6 hours ago

Rebase was done.

Barry-Xu-2018 commented 53 minutes ago

@alsora I updated code. Please help to review again.