ros2 / rmw_connextdds

ROS 2 RMW layer for RTI Connext DDS Professional and RTI Connext DDS Micro.
Apache License 2.0
48 stars 33 forks source link

Add ros2_tracing tracepoints #120

Closed cwecht closed 8 months ago

cwecht commented 1 year ago

Depends on https://github.com/ros2/ros2_tracing/pull/74; implements: https://github.com/ros2/ros2_tracing/issues/73

This PR introduces ros2_tracing tracepoints which are really helpful for examining system performance. The dependency to the ros2_tracing PR stems from the fact, that one of the original tracepoints had to be extended in order to get the timestamp during writing to DDS.

In order to do that, DDS_DataWriter_write_untypedI had to be replaced with DDS_DataWriter_write_w_timestamp_untypedI. At first it seemed reasonable to use` in order to get the respective timestamp.

This seems to work, but DDS_DomainParticipant_get_current_time introduces a massive performance penality. The flamegraph below shows the result of profiling it using perf. grafik

From https://community.rti.com/kb/how-clocks-are-used-connext-dds I'd conclude, that we need the "external clock" used by Connext here. Looking at the perf profile, it seems that there is a method/function DDS_DomainParticipant_get_external_clockI which would provide to us said clock, but it seems that this function is not part of Connexts public API right? Is there any way to get this clock or the timestamp which Connext would use if DDS_DataWriter_write_untypedI is used without paying such a massive performance penalty?

asorbini commented 1 year ago

From https://community.rti.com/kb/how-clocks-are-used-connext-dds I'd conclude, that we need the "external clock" used by Connext here. Looking at the perf profile, it seems that there is a method/function DDS_DomainParticipant_get_external_clockI which would provide to us said clock, but it seems that this function is not part of Connexts public API right? Is there any way to get this clock or the timestamp which Connext would use if DDS_DataWriter_write_untypedI is used without paying such a massive performance penalty?

I'm not very familiar with the clock configuration in Connext, and I'm looking into possible alternative solutions. I'll get back after I have a better idea of available options.

cwecht commented 1 year ago

@asorbini Thank you for the review! I just applied the changes you requested and (forced) pushed them.

asorbini commented 1 year ago

@cwecht are you sure you pushed the change? The code looks unchanged.

In general, I would prefer if you didn't force push but rather pushed an incremental commit so that the additional changes can be reviewed without having to re-scan through all of the code.

One reason to force push in this case would be to fix the commit message, which should include a DCO (e.g. "Signed-off-by: Your Name your@email"). I guess we are not strictly enforcing this rule and we are missing a CONTRIBUTING document in this repository (like this one), but the convention has been followed by all contributions, as far as I'm aware.

cwecht commented 1 year ago

@asorbini now it's actually pushed including the signoff.

cwecht commented 1 year ago

I've got some news regarding DDS_DomainParticipant_get_current_time.

I was able to test this PR with connext micro and using DDS_DomainParticipant_get_current_time in this case is just fine; there is basically no significant performance overhead.

For Connext PRO it is a different story:

  1. The traces seem to indicate that the majority of the time in DDS_DomainParticipant_get_current_time is spend looking up the clock which is configured in the respective QoS settings. The documentation seems to indicate that this has to be done since these QoS settings may change at any time which makes it reasonable to look up the clock which is supposed to be used every time.
  2. However, if I understand the current implementation in rmw_connextdds correctly, the clock in use will always be the default clock, as PROPERY attribute of DDS_DomainParticipantQos is never changed. From my understanding this can't even be changed via e.g. XML.
  3. If the above is correct, we should be fine with using the system clock here directly as this should be always the clock which is used by DDS_DomainParticipant_get_current_time by default.

@asorbini could you double check this conclusion?

cwecht commented 1 year ago

@asorbini I've added now rmw_connextdds_get_current_time which encapsulates the distinction between Pro and Micro regarding time retrieval. The performance difference between Pro's DDS_DomainParticipant_get_current_time and this one based on RTIOsapiUtility_getTime is quite impressive: RTIOsapiUtility_getTime takes usually a few hunders of nanoseconds where DDS_DomainParticipant_get_current_time takes usually more like 6-20 microseconds. Do you have any objections regarding this approach?

cwecht commented 10 months ago

@asorbini May I ask you kindly to have another look at this PR? Thank you very much!

asorbini commented 9 months ago

Running full CI with repos for this PR, ros2/rmw_cyclonedds#454, ros2/rmw_fastrtps#694, ros2/ros2_tracing#74:

ETA: Build failed because of typo in dds_api.hpp.

asorbini commented 9 months ago

Running CI again:

cwecht commented 9 months ago

I'm note quite sure why the latest build failed. Looks more like a jenkins hiccup to me.

asorbini commented 9 months ago

@cwecht agreed, seems like an unrelated failure. Running Linux again to check: Build Status.

The other two builds are yellow because of a formatting error unrelated to this PR.

clalancette commented 9 months ago

The other two builds are yellow because of a formatting error unrelated to this PR.

I will point out that the nightlies have no issues: https://ci.ros2.org/view/nightly/job/nightly_linux-aarch64_release/2582/ . So it is worth investigating those before merging.

asorbini commented 9 months ago

@clalancette thank you for the clarification. I totally agree that those issues should be resolved, but they are C++ linter errors and they stem from a different PR (ros2/rmw_cyclonedds#454), hence why I didn't think they'd affect this PR.