ros2 / rcutils

Common C functions and data structures used in ROS 2
Apache License 2.0
58 stars 100 forks source link

Creating More Accuracy in NS_TO_S #460

Open CursedRock17 opened 6 months ago

CursedRock17 commented 6 months ago

This Pull Request is done in preparation for this geometry2 issue which reutilizes logic increasing accuracy in the conversion from nanoseconds to seconds. I assumed we wanted to keep these as Macros opposed to creating functions, but this would mean the Macros continue to take multiple "types" when we just want nanoseconds (int64_t). So should they stay as macros and we just remove all tests which don't take an int64 type and begin enforcing this behavior?

clalancette commented 6 months ago

I think that before we merge this, we should have a companion PR in geometry2 (or somewhere else) that actually uses it. Otherwise, we can run into the situation where we have an API that nothing uses, which isn't great.

ahcorde commented 6 months ago
CursedRock17 commented 6 months ago

I'm going to roll out a fix soon, I assume the MACRO is returning long instead of a double, but it's weird that it's not throwing an error on my local tests, which require doubles as results. Also, is there something I missing with the CI having semi-random K's in the error message?

[K/home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/qos.cpp:145:erroinvalid operands of typedou’ anlong ’ to binaroperat’

Edit: It seems like the error is not on our part, but rather rmw_fastrtps. This call to RCUTILS_NS_TO_S is causing the problem where it calls incorrect types between double to long, because the nanoseconds that are being passed, are in the form of a double already. The line before is:

double period_in_ns = entity_qos.liveliness().lease_duration.to_ns() * 2.0 / 3.0;

Which converts lease_duration to a nanosecond which is, throughout the entire ecosystem, given to be an int64_t, seen here. I guess now it can't handle an operation between double and long, but if the user uses a long long (int64_t), the compiler can handle the implicit conversion. So should this be brought up to rmw_fastrtps, because they may be doing excess conversion on their qos values? Also, this shouldn't cause any other issues on a widespread basis, because the user is returning a double from this function, converting a double to another double doesn't make logistical sense IMO.

clalancette commented 6 months ago

It seems like the error is not on our part, but rather rmw_fastrtps. This call to RCUTILS_NS_TO_S is causing the problem where it calls incorrect types between double to long, because the nanoseconds that are being passed, are in the form of a double already. The line before is:

double period_in_ns = entity_qos.liveliness().lease_duration.to_ns() * 2.0 / 3.0;

Agreed with your analysis; that line of code should probably be:

int64_t period_in_ns = entity_qos.liveliness().lease_duration.to_ns() * 2 / 3;

(though there is some risk of losing a bit of precision there).

If you'd like to open a PR over at rmw_fastrtps to change that, we can review it over there and unblock this.

CursedRock17 commented 3 months ago

In response to this comment I went with the first option of keeping everything within rcutils as to broaden these new functions to as many repositories as possible as and continue to solve the original issue. I went ahead and made functions of all the macros that magnify the time as that's where most of our lost precision comes from, I felt the other 3 macros don't necessarily need functions of the their own (more bloat in the file).

As I was breaking down the tests I would run into oddball tests in which a value would be too large for the original uint32_t conversion, but after removing the casts to said type, all of my tests were passing with the correct values to pass. After digging the original logic comes from here with no way to find out why said conversions were used. Just wanted to know if my analysis is wrong here