ros / roscpp_core

ros distribution sandbox
88 stars 116 forks source link

Resolution of ros::Duration::sleep() limited by implicit sleep in nanoseconds #133

Closed YoshuaNava closed 1 year ago

YoshuaNava commented 2 years ago

Hi, By reading through the code of the rostime package I found that ros::Duration::sleep() has an internal sleep in wall time with a length of 1,000,000 nanoseconds = 1 millisecond. Code -> https://github.com/ros/roscpp_core/blob/noetic-devel/rostime/src/time.cpp#L402

As far as I understand, the internal call to ros_wallsleep(0, 1000000); is limiting the maximum resolution of ros::Duration::sleep() to 1 millisecond in wall time.

I intend to use ros::Time inside a node that spins at a rate of 400Hz (every 2.5ms), and this aspect of the current implementation would not allow me to sleep() with a fine-enough resolution. Due to the 1ms resolution, sleep would wake up at t=2ms, or t=3ms, but never t=2.5ms.

Therefore, I have two related questions:

  1. Can one of the authors/maintainers/users confirm whether my observation is true/valid?
  2. If 1 is true, would you be open to a fix/extension to allow more fine-grained sleep durations? For example, through an optional parameter on the sleep function:
bool Duration::sleep(const uint32_t min_wall_sleep_time_ns = 1000000) const;

Thank you very much in advance.

Best regards, Yoshua Nava

MatthijsBurgh commented 1 year ago

When running at such a high frequency of 400Hz, I would think WallDuration/WallTime is better.

YoshuaNava commented 1 year ago

Hi @MatthijsBurgh,

I have tried your suggestion. However, running in WallTime means that any high-rate module will not listen to the updates of ROS Time. Thus, if you slow down, speed up, or pause

The nodes' notion of time will be different from that of the sensor and actuator clocks, leading to ambiguous results (e.g. if simulation is running slower, should a filter propagate as usual, or wait for the next clock tick?). Furthermore, if you have Time-based events, they will be triggered at the wrong instant. Thu, Breaking away into a different time domain can increase the number of issues, but not solve the problem at the root.

Practical example Imagine that you have a swarm-drone system that relies on accurate, time-sensitive motion planning. If you want to test this system in simulation, with WallTime, you need to ensure that your simulator will run at exactly the same rate as your host PC Clock (1.0x, uniform, constant), otherwise your whole motion planning algorithm will have to 'fight' uncertain timing (which is a problem that you can solve in the real world with a GPS)

HandmadeColossalDrongo-size_restricted

(Image source: https://gfycat.com/handmadecolossaldrongo-hotcoffee9520)


Hence, that's why my suggestion goes into the direction of making the minimum wait time an optional argument.

WDYT?

Best, Yoshua

MatthijsBurgh commented 1 year ago

I agree with you. The WallTime doesn't slow down. I expected your process to be time critical, because of the high frequency.

I think your case with such a high frequency, but want to sync with the ROS time, is very rare. And as ROS1 is going to the end of its life, I don't expect the maintainers to change it.

YoshuaNava commented 1 year ago

What you mention makes sense :+1:. I hope that this issue's info can be useful for future travelers of the ROS1 code, or authors of ROS2.

Thank you for the feedback @MatthijsBurgh :muscle:

peci1 commented 1 year ago

I can imagine a PR along these lines could get merged: compare the intended sleep duration with the 1 ms sleep, and if one tenth of the intended duration is shorter than the 1 ms, shorten the wallsleep to this one tenth. This should not affect most normal use-cases, but should help your cause.

peci1 commented 1 year ago

@YoshuaNava #137 should help with this. Could you verify?