ros2 / rmw_zenoh

RMW for ROS 2 using Zenoh as the middleware
Apache License 2.0
184 stars 34 forks source link

Switch to custom logging macros to avoid deadlocks #223

Closed Yadunund closed 3 months ago

Yadunund commented 3 months ago

Address logging related deadlock discovered in https://github.com/ros2/rmw_zenoh/issues/182.

@clalancette I went with a more C++ implementation here by relying on the popular fmt library which is apparently more performant than printf and ostringstream. Let me know if you prefer following rcutils/logging.h type of implementation with a specifiable allocator. We could also directly invoke rcutils_logging_console_output_handler.

clalancette commented 3 months ago

@clalancette I went with a more C++ implementation here by relying on the popular fmt library which is apparently more performant than printf and ostringstream

The problem with bringing in this dependency is that we'll have to deal with it on Windows. Also, in general, the more dependencies this has, the harder it will be to get rmw_zenoh into the core. Finally, my understanding is that either C++-20 or C++-23 will have this directly, so we can get this "for free" when we upgrade in the future. For all of those reasons, I think we should stick with printf-style logging for now.

We could also directly invoke rcutils_logging_console_output_handler.

That's a really good idea, I'd prefer that. While it won't get us all of the features of the rcutils logger, it will at least move us in that direction.

Yadunund commented 3 months ago

Thanks for the quick feedback! I'll update to rely on rcutils functions.

Yadunund commented 3 months ago

@clalancette I've updated accordingly.

I've ran the problematic case from #182 several times and it works most of the time*.

# terminal 1
ros2 run rclcpp_components component_container
# terminal 2
for i in {1..100}; do echo "ros2 component load --node-namespace /talker${i} /ComponentManager composition composition::Talker"; done | parallel

* once in a while, the command in terminal 2 does not exit cleanly. However running ros2 node list | wc -l returns 101 indicating that 100 nodes were indeed injected into the component container. All topics are also available. So perhaps the issue there is related to shutdown?