ros2 / rmw_zenoh

RMW for ROS 2 using Zenoh as the middleware
Apache License 2.0

Timer callback stops firing when no data is moving through the system #158

Open AlexDayCRL opened 2 months ago

AlexDayCRL commented 2 months ago

Hi,

My timer callback stops being called after a few seconds when I run my application with the Zenoh middleware (rmw_zenoh). I do not see this issue with CycloneDDS. I have tried and failed to reproduce it with a small example. Is there any way to introspect the timer queue within Zenoh?

clalancette commented 2 months ago

This is tricky because rmw_zenoh doesn't deal with timer callbacks directly. Those are handled at the rcl/rclcpp layer. However, if the rmw_wait implemented in rmw_zenoh gets hung up for some reason, it can delay the timer callbacks. They should still fire eventually (once rmw_wait times out), just later than scheduled.

It's hard to say here without an example, but there have been some other issues we've been trying to track down with rmw_wait. #153 should have solved some of it, but seems to have caused a regression. If you apply #157, does the problem go away?
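To illustrate the delay mechanism described above, here is a minimal sketch in plain C++ with simulated time. It is not the rcl/rclcpp or rmw_zenoh implementation: `SimTimer`, `WaitFn`, and `spin_until` are hypothetical names, and the loop is a simplified model of an executor that asks its wait call to block no longer than the nearest timer deadline. If the wait call returns later than requested, every due timer fires late.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Simulated time in milliseconds; no real clock or ROS API is involved.
using Millis = std::int64_t;

// Hypothetical timer model: fires every `period`, starting at t = period.
struct SimTimer {
  explicit SimTimer(Millis p) : period(p), next_due(p) {}
  Millis period;
  Millis next_due;
  std::vector<Millis> fired_at;  // simulated times at which the callback ran
  Millis max_lateness = 0;       // worst observed (fire time - due time)
};

// Hypothetical stand-in for a blocking wait call: receives the current time
// and the requested timeout, returns how long the call actually blocked.
using WaitFn = std::function<Millis(Millis now, Millis timeout)>;

// Simplified executor loop: request a wait bounded by the nearest timer
// deadline, then run every timer that is due once the wait returns.
void spin_until(std::vector<SimTimer>& timers, const WaitFn& wait, Millis end_time) {
  if (timers.empty()) return;
  Millis now = 0;
  while (now < end_time) {
    Millis nearest = timers.front().next_due;
    for (const auto& t : timers) nearest = std::min(nearest, t.next_due);
    const Millis timeout = std::max<Millis>(0, nearest - now);
    now += wait(now, timeout);  // a wait that overshoots delays all timers
    for (auto& t : timers) {
      if (t.next_due <= now) {
        t.fired_at.push_back(now);
        t.max_lateness = std::max(t.max_lateness, now - t.next_due);
        // Skip any periods that were missed while the wait was blocked.
        while (t.next_due <= now) t.next_due += t.period;
      }
    }
  }
}
```

With a wait function that returns exactly the requested timeout, `max_lateness` stays at zero. With one that blocks for a fixed 400 ms regardless of the request, a 100 ms timer fires only every 400 ms, which from the application's point of view can look like the timer has stalled.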

AlexDayCRL commented 2 months ago

@clalancette I still see the same issue with #157 applied. From your comments, it sounds like I should instrument the rmw_wait function if I want to drill down into what is happening.

MichaelOrlov commented 2 months ago

Moving to the backlog for now since other relevant issues #153 and #157 haven't been resolved yet.

AlexDayCRL commented 2 months ago

This issue also only happens when my node has multiple timers.

Yadunund commented 1 month ago

Hey @AlexDayCRL, could you try a clean build with the latest rolling branch and confirm whether the problem still exists?