IntelRealSense / realsense-ros

ROS Wrapper for Intel(R) RealSense(TM) Cameras
http://wiki.ros.org/RealSense
Apache License 2.0
2.51k stars 1.74k forks source link

Clarification of Timestamp Domain Mismatch in BaseRealSenseNode::imu_callback_sync and Possible Remaining Bug #2048

Closed drewjel closed 2 years ago

drewjel commented 3 years ago

Hi,

Background: I'm a PhD student in computer science at the University of Virginia working on preventing and detecting bugs in cyber-physical systems code, and in robotics code, in particular. We've been developing some new technique and seek confirmation from the RealSense developer community that we've found a previously unrecognized bug in code that was heavily scrutinized previously. See commit #1684 and issues #1370, #1665, #1581. In the process of studying the bug discussed there, we believe we found another, which we describe next.

The issue addresses the way IMU message timestamps are adjusted in the client from their original rs2::frame values in BaseRealSenseNode::imu_callback_sync.

Firstly, I wanted to confirm the following subtract ion is where the timestamp domains can get "mixed", as discussed in the issue postings. image Specifically, although the #1684 references the "addition" on the second line, we believe that "frame.get_timestamp()" and "_camera_time_base" CAN come from different timestamp domains. For example, if frame.get_timestamp() has timestamp domain "RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME" and _camera_time_base has timestamp domain "RS2_TIMESTAMP_DOMAIN_HARDWARE_CLOCK" (which starts at 0 at device start up time) , then _ros_time_base.toSec() + elapsed_camera_ms will produce a timestamp roughly 2x of actual (as reported in #1665). Similarly, if one timestamp was in RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME and the other was in RS2_TIMESTAMP_DOMAIN_GLOBAL_TIME, then the timestamp might be ~a few hundred milliseconds off. Can you confirm that we're correct that the subtraction itself is not consistent with respect to frames of reference.

Secondly, I noticed the user in #1665 mentioned that they were getting tf::ExtrapolationError's due to timestamps in the future when global_time was set to false - so presumably the timestamps would be coming with a HARDWARE or SYSTEM domain. This was not resolved, but our best guess is that the camera clock was running slightly faster than the ROS client, leading to elapsed_camera_ms being calculated in terms of the camera's clock and ending up slightly larger than the time that would have passed according to the ROS client. Is this a correct interpretation?

RealSenseSupport commented 3 years ago

@drewjel Do you still see this issue with https://github.com/IntelRealSense/realsense-ros/pull/1684 implemented? Looking forward to your update. Thanks!

RealSenseSupport commented 2 years ago

@drewjel Any update from your side? Please note that the issue will be closed if we still don't get update from you for another 7 days. Thanks!

drewjel commented 2 years ago

Hi, apologies for the late response and thank you for yours.

The residual issue I was concerned about is described below (I did notice a few other users complain):

image

Two of the lines below concerned me: image Looking at lines 1730 and 1731, it looks like elapsed_camera_ms comes from a different clock than ros_time_base, which means it could be running at a different rate. elapsed_camera_ms is the difference between two time points on the hardware clock of the camera (RS2_TIMESTAMP_DOMAIN_HARDWARE_CLOCK), whereas ros_time_base is a reading of the local UTC time on whatever machine is actually running the code.

To give a concrete example of where this can go wrong, suppose of the hardware clock is running slightly faster than the ROS clock, then elapsed_camera_ms might evaluate to ~101 ms, but only ~100 ms have passed on the local ROS node's UTC clock, so adding elapsed_camera_ms to ros_time_base could yield a value in the future.

I think a fix would involve doing something like what global time does, you'd need to map this difference back into the local UTC clock.

Next, looking at line 1735, I was wondering what would happen if the timestamp domain is "system", i.e., the camera getting a UTC timestamp from its OS. It looks like there's a risk of there being some drift between the camera's timestamps and the ROS client's timestamps (assuming they're not using chrony or a similar tool). If the camera's UTC clock is ahead of the ROS client at any time, the calculation on 1735 could produce a timestamp that's in the future to the ROS client.

This general fault, assuming I am correct, existed in the original code, so I don't mean to imply it was introduced in #1684.

Hopefully my understanding is correct, but if I'm missing anything major, please let me know. If I am not, I hope this may be of some use!

RealSenseSupport commented 2 years ago

@drewjel I think your understanding is correct. However we don't have better solution yet. Did you get chance to try global timestamp enabled to see if this works better for you? Looking forward to your update. Thanks!

RealSenseSupport commented 2 years ago

@drewjel Any update from your side? Please note that the ticket will be closed if we don't hear from you for another 7 days. Thanks!

RealSenseSupport commented 2 years ago

@drewjel Sorry we don't hear from you for weeks. This issue will be closed. Please feel free to create new ticket if you still have any other questions or issues. Thanks!