ros2-rust / ros2_rust

Rust bindings for ROS 2
Apache License 2.0
961 stars 129 forks source link

Remove the Mutex around the node handle? #144

Open nnmm opened 2 years ago

nnmm commented 2 years ago

The only functions that take a mutable pointer, i.e. rcl_node_t *, are rcl_publisher_fini and rcl_subscription_fini. And if you look at them, they don't really need a mutable pointer, since they only call the const functions rcl_node_is_valid_except_context() and rcl_node_get_rmw_handle(). We should find out why they, apparently unnecessarily, take a mutable pointer.

So I think (though I'm only 95% sure) that we could get by with casting a const pointer to a mut pointer just before we call these functions, and remove the Mutex around the node handle everywhere. As far as I can tell, rclcpp also doesn't have a lock for accessing the rcl node struct.

The other issue with that is that the functions are not all thread-safe. We could add back more fine-grained locks to solve this, which would still be a performance improvement.

What's not clear to me is, for instance, do we only need to make sure rcl_publisher_init is not called concurrently with itself, or also some or all other rcl functions that take a node and are marked as not thread-safe?

nnmm commented 2 years ago

What might be interesting too is that Node implements !Sync and !Send, so Rust might not even allow you to do anything concurrent with a Node. Not super sure, I haven't yet needed to interact with these traits directly. We should also find out why that is.

nnmm commented 2 years ago

The Sync/Send issue has been resolved in https://github.com/ros2-rust/ros2_rust/pull/171.

In the chat, @mxgrey said about the issue of mutexes

[I heard that] an expectation for RMW implementations is that all the typical node-related operations (e.g. publishing, subscribing, creating timers, etc) are supposed to be threadsafe, so the RMW implementation should be using a mutex internally anyway. I don't know if we'd want to rely on that to squeeze out just a tiny bit more performance or if we'd rather rely on the Rust layer to ensure as much safety as possible. I've definitely come across implementation errors in some RMWs when multi-threading is involved.

So, just keeping the mutex to be safe is probably fine for now. Though, rclcpp apparently doesn't protect the node with a mutex.