ros2 / design

Design documentation for ROS 2.0 effort
http://design.ros2.org/
Apache License 2.0
215 stars 194 forks source link

Events Executor design #305

Closed irobot-ros closed 6 months ago

irobot-ros commented 3 years ago

This PR presents a design for a new type of executor, with the purpose of addressing some of the long-standing issues of the current implementation. This follows the idea presented in the discourse post https://discourse.ros.org/t/ros2-middleware-change-proposal/15863

The new executor uses an events queue and a timers manager as opposed to waitsets, to efficiently execute entities with work to do.

Developed by iRobot Mauro Passerino Lenny Story Alberto Soragna


Connects to:

ros-discourse commented 3 years ago

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros2-middleware-change-proposal/15863/20

alsora commented 3 years ago

@ivanpauno thank you for taking the time to evaluate our proposal.

Here some replies to your comment.

I have left some comments about the event based approach, but my overall feedback is that we should figure out what's a good default executor and only maintain one in rclcpp. This doesn't mean that the one in rclcpp should be wait set based, I only mean that I would rather have one single threaded executor in rclcpp (which could be event queue based) instead of three.

If we're going to switch to a event based executor, it would be great to have more information to understand what's the real limit of the current wait set based approach (see my other comments).

Yes, the ROS client library should really provide only 1 executor that works well. The points that we mentioned in the Requirements section, should be the guideline for how a ROS 2 executor should be., Our idea is that the EventsExecutor is a step forward in that direction. The PR that we opened that implements this new executor (i.e. https://github.com/ros2/rclcpp/pull/1416) on purpose does not modify any of the existing code, but just adds new APIs. We think that this will allow to more easily merge the changes and at the same time start the process to carefully evaluating what we have with the goal to have a single, good, executor that work in all use cases for the Galactic release.

I also think that rmw/rcl API should be flexible enough so you can implement an event queue/work queue/wait set based executor out of tree, and I think that would be possible. Based on your experience, do you think it's possible to implement an event based executor out of tree? What additions would be needed on rmw/rcl to make that possible? Off the top of my head, I think that if rmw API supported attaching listeners to entities, that would make this possible.

In order to implement the EventsExecutor, we had to only add some boilerplate code in rmw/rcl layers. You can have a look at the specific PRs rcl https://github.com/ros2/rcl/pull/839 rmw https://github.com/ros2/rmw/pull/286 rmw_fastrtps https://github.com/ros2/rmw_fastrtps/pull/468

It's just the same API copy-pasted for all the entities.

  • AFAIR the StaticSingleThreadExecutor, isn't actually static. It rebuilds the wait set when an entity is added, but it uses a synchronization mechanism instead of always rebuiling it (which is much more efficient). Based on that, it sounds like the StaticSingleThreadExecutor is always a better choice than the SingleThreadExecutor so I think that replacing the later should be completely removed.

  • We also have a rclcpp::WaitSet that we don't use in any of the wait set based executors.

Yes, the StaticSingleThreadedExecutor is not really static and from our experience it always performs better than the SingleThreadedExecutor. It has some limitations though, as it does not fully support all the executor APIs and the concept of an ExecutableList which keeps ownership of the entities may cause some issues.

From a discussion I had with @wjwwood I remember that the rclcpp::WaitSet can be configured such that it will become completely static. That should in theory allow to match the performance of the StaticSingleThreadedExecutor, but it will not allow to add new entities at all.

However, from our analysis, both these solutions still cause a considerable CPU overhead with respect to running your pub/sub application without ROS and directly using your middleware of choice. On the other hand, the EventsExecutor allows to get very close to the performance of the middleware.

gbiggs commented 3 years ago

I have left some comments about the event based approach, but my overall feedback is that we should figure out what's a good default executor and only maintain one in rclcpp. This doesn't mean that the one in rclcpp should be wait set based, I only mean that I would rather have one single threaded executor in rclcpp (which could be event queue based) instead of three.

I strongly disagree with this. There is no reason that a client library should be limited to one executor. We should figure out what our use cases are and ensure that each is covered completely by one executor. If the collection of use cases requires more than one executor to completely cover, then so be it. There is also the possibility that a particular language may provide multiple ways to handle event queues and multi-threading, so one executor for each might make sense in that language's client library.

There should be one default executor, of course. When you call spin() without directly creating an executor, what executor is used inside that is an important choice we must make if we have multiple available.

wjwwood commented 3 years ago

Ok, I've been making progress on the rmw changes related to this. One big thing I did was remove the term listener from these pull requests. My reasoning being that we don't have any concept like "listener" in ros yet, and we might want it in the future, but for now, how the callbacks are called isn't really important. So I removed those in favor of names like rmw_subscription_set_on_new_message_callback rather than rmw_subscription_set_listener_callback. I'm open to feedback on this, but I think it was a good change.

I also improved the documentation and implementation for rcl.

I did a major refactor to rclcpp, making the callbacks type safe and exception safe. I did some other things too, like dropping the setter function for waitables, but I'll add more notes about that on the pr itself.

I still think we need some tests for rcl and/or rclcpp, and we need a implementation (even a dummy one) for rmw_connextdds. I'm not sure how much more I can get done tonight, but I'll at least run CI on it.

ros-discourse commented 3 years ago

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros2-speed/20162/9

adamdbrw commented 3 years ago

@wjwwood is this work still ongoing?

wjwwood commented 3 years ago

@adamdbrw Yes, but slowly. There are pull requests implementing needed features for some of what is included here, but they didn't make it into galactic, and have been mostly ideal since, which is my fault. I expect we'll start making progress again soon.

alsora commented 3 years ago

@adamdbrw FYI we have branches where we implemented the whole design and that we are using for our testing/development.

The PRs mentioned in the first post implement the first part of the design (i.e. RMW listener APIs). The executor itself can be found in this branch https://github.com/irobot-ros/rclcpp/tree/galactic-events-executor This branch already includes the rclcpp PR mentioned in the first post and adds the events executor on top of it

alsora commented 2 years ago

Hi, we updated the design with some of the latest changes originated from the middelware WG discussions.

timonegk commented 2 years ago

Hi, we recently ran into the performance issues in ROS2 and tried the EventsExecutor from the iRobot forks and it worked phenomenally. @wjwwood @alsora What is currently blocking the progress of merging the EventsExecutor, especially to rclcpp? As far as I can see, part of the implementation in https://github.com/ros2/rclcpp/pull/1416 was merged in https://github.com/ros2/rclcpp/pull/1579, but the executor itself was not merged and there is not even a pull request for it. As the current state of the ROS2 executors is not really usable for high frequency systems, this would be a great improvement and I / we would be glad to help move this forward wherever we can.

timonegk commented 2 years ago

FYI, before someone else also does the same thing: We really needed the executor, so I rebased the work by @irobot-ros on the current master. This was mostly rclcpp, because most of the other stuff was already merged. I also needed a small number of additional commits for rmw_fastrtps and rcl. The repositories can be found at

In addition, I created packages for ubuntu 22.04. They are available at the following release repositories:

A strange bug I experienced was that I had to revert https://github.com/ros2/rmw_fastrtps/pull/583 because an invalid qos type was falsely triggered. I don't know why that happened but will investigate.

I hope this helps someone who also needs the executor, or it helps you when it will be merged.

alsora commented 2 years ago

We plan to create soon an "events executor" repository that only contains the executor and it's compatible with standard rclcpp library.

This will allow ROS 2 Humble users to use the events executor in an easy way.

In the meantime, we will also open new PRs to rclcpp to merge this executor in the mainline.

Flova commented 2 years ago

@alsora any updates on the executor repo? :)

mauropasse commented 2 years ago

@timonegk this PR https://github.com/ros2/rmw_fastrtps/pull/625 fixes the issue you got about an invalid qos type falsely triggered. So you don't have to revert the FastRTPS PR https://github.com/ros2/rmw_fastrtps/pull/583

timonegk commented 2 years ago

@mauropasse Thanks, works great!

Aposhian commented 1 year ago

What are the next steps for this? Is this something that needs to be brought up in the middleware WG?

alsora commented 1 year ago

Hi, thank you everyone for your interest in the events-executor. You can find here a Github repository containing an implementation of the executor for ROS 2 Humble: https://github.com/irobot-ros/events-executor

We plan to release this as a debian package as soon as the known limitations and bugs mentioned in the readme are addressed. Contributions and feedbacks are highly appreciated!

ijnek commented 1 year ago

@alsora Thanks for sharing you and your teams amazing work! Is the plan to get this into rclcpp, or to keep it separate in this repo?

Once the debians are out, I can imagine people getting confused about having to add a dependency on irobot_events_executor even though the way the events_executor should be included:

#include "rclcpp/executors/events_executor/events_executor.hpp"

makes it look like its from rclcpp.

alsora commented 1 year ago

Yes the plan is definitely to merge this into rclcpp, the separate repo and the debian package are just intermediate solutions to let people play with it in the meantime.

It's now almost 2 years that we have a working implementation of the events executor (we first opened a rclcpp PR with it in 2020), but it may take a while before all the pieces and requirements are merged.

The reason for having the same paths as rclcpp was to simplify the transition, i.e. we could copy-paste the files into a PR for rclcpp and people that were using the executor from the separate repo wouldn't have to make any change to their source code. However I see the point for the potential confusion.

We can probably just change the top-level directory from rclcpp to irobot_rclcpp.

P.S. if you have additional feedbacks and suggestions, feel free to open a ticket on the events executor repository or code PRs

alsora commented 1 year ago

@wjwwood @clalancette @ivanpauno this PR is open since almost 2 years. Anything to review/modify before merging it?

sandeepsuresh commented 1 year ago

As i understand this Event Executor is a Single threaded executor. Is there an update to this to support multithreaded EventExecutor with Callback Groups? Please revert back.

alsora commented 1 year ago

The events executor by design can either be single thread or multi-thread.

The implementation currently available at https://github.com/irobot-ros/events-executor support two modes:

Contributions are welcome to support an arbitrary number of threads.

ros-discourse commented 1 year ago

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/questions-about-the-intrinsic-acquisition-of-osrc/28763/15

russkel commented 1 year ago

So... LGTM?

wjwwood commented 1 year ago

We've recently had a discussion about what to do with this pr, and it's possible we can merge it and immediately mark it as historical. There are a few issues at play:

alsora commented 5 months ago

Hi, something strange happened to this PR. It was merged last month, but apparently it doesn't include any changes