ros2 / rmw_cyclonedds

ROS 2 RMW layer for Eclipse Cyclone DDS
Apache License 2.0
111 stars 91 forks source link

arm64-specific bug with invalid node handles #273

Open JWhitleyWork opened 3 years ago

JWhitleyWork commented 3 years ago

Bug report

Required Info:

Steps to reproduce issue

$ export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
$ git clone https://gitlab.com/autowarefoundation/autoware.auto/AutowareAuto.git
$ cd AutowareAuto
$ colcon build --packages-up-to state_estimation_node
$ colcon test --packages-select state_estimation_node
$ colcon test-result --verbose

Expected behavior

Tests pass.

Actual behavior

First test fails with the following error on Dashing:

- test_state_estimation_node_bad.test.py
  <<< failure message
    -- run_test.py: invoking following command in '/home/whitleysoftwareservices/autoware.auto.dashing/src/prediction/state_estimation_node':
     - ros2 test /home/whitleysoftwareservices/autoware.auto.dashing/src/prediction/state_estimation_node/test/state_estimation_node_bad.test.py --junit-xml=/home/whitleysoftwareservices/autoware.auto.dashing/build/state_estimation_node/test_results/state_estimation_node/test_state_estimation_node_bad.test.py.xunit.xml --package-name=state_estimation_node
    [INFO] [launch]: All log files can be found below /home/whitleysoftwareservices/.ros/log/2020-12-18-12-30-02-235681-ade-12015
    [INFO] [launch]: Default logging verbosity is set to INFO
    1608316203.982884 [1]       ros2: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter
    [ERROR] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create DDS participant

    >>> [rcutils|error_handling.c:106] rcutils_set_error_state()
    This error state is being overwritten:

      'error not set, at /tmp/binarydeb/ros-dashing-rcl-0.7.9/src/rcl/node.c:326'

    with this new error message:

      'rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-dashing-rcl-0.7.9/src/rcl/node.c:464'

    rcutils_reset_error() should be called after error handling to avoid this.
    <<<
    [ERROR] [rcl]: Failed to fini publisher for node: 1
    [ERROR] [launch]: Caught exception in launch (see debug for traceback): Unknown error creating node: rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-dashing-rcl-0.7.9/src/rcl/node.c:464
    Processes under test stopped before tests completed
    -- run_test.py: return code 1
    -- run_test.py: verify result file '/home/whitleysoftwareservices/autoware.auto.dashing/build/state_estimation_node/test_results/state_estimation_node/test_state_estimation_node_bad.test.py.xunit.xml'
  >>>

And the following on Foxy:

- test_state_estimation_node_bad.test.py
  <<< failure message
    -- run_test.py: invoking following command in '/home/whitleysoftwareservices/autoware.auto.foxy/build/state_estimation_node':
     - ros2 test /home/whitleysoftwareservices/autoware.auto.foxy/src/prediction/state_estimation_node/test/state_estimation_node_bad.test.py --junit-xml=/home/whitleysoftwareservices/autoware.auto.foxy/build/state_estimation_node/test_results/state_estimation_node/test_state_estimation_node_bad.test.py.xunit.xml --package-name=state_estimation_node
    Running with ROS_DOMAIN_ID 1
    ROS_DOMAIN_ID 1
    [INFO] [launch]: All log files can be found below /home/whitleysoftwareservices/.ros/log/2020-12-18-12-49-08-678370-ade-4228
    [INFO] [launch]: Default logging verbosity is set to INFO
    Warning: Passing ready_fn as an argument to generate_test_description will be removed in a future release.  Include a launch_testing.actions.ReadyToTest action in the LaunchDescription instead.
    Warning: The parameter 'node_executable' is deprecated, use 'executable' instead
    Warning: The parameter 'node_name' is deprecated, use 'name' instead
    Warning: The parameter 'node_namespace' is deprecated, use 'namespace' instead
    test_wait_for_exit (state_estimation_node.TestWaitForShutdown) ... [INFO] [state_estimation_node_exe-1]: process started with pid [4235]
    [INFO] [python3-2]: process started with pid [4239]
    [state_estimation_node_exe-1] 1608317348.993236 [1] state_esti: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter
    [state_estimation_node_exe-1] [ERROR] [1608317348.996267111] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error
    [state_estimation_node_exe-1] 
    [state_estimation_node_exe-1] >>> [rcutils|error_handling.c:108] rcutils_set_error_state()
    [state_estimation_node_exe-1] This error state is being overwritten:
    [state_estimation_node_exe-1] 
    [state_estimation_node_exe-1]   'error not set, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:276'
    [state_estimation_node_exe-1] 
    [state_estimation_node_exe-1] with this new error message:
    [state_estimation_node_exe-1] 
    [state_estimation_node_exe-1]   'rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:428'
    [state_estimation_node_exe-1] 
    [state_estimation_node_exe-1] rcutils_reset_error() should be called after error handling to avoid this.
    [state_estimation_node_exe-1] <<<
    [state_estimation_node_exe-1] [ERROR] [1608317348.998946065] [rcl]: Failed to fini publisher for node: 1
    [state_estimation_node_exe-1] terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
    [state_estimation_node_exe-1]   what():  failed to initialize rcl node: rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:428
    [state_estimation_node_exe-1] qemu: uncaught target signal 6 (Aborted) - core dumped
    [ERROR] [state_estimation_node_exe-1]: process has died [pid 4235, exit code -6, cmd '/home/whitleysoftwareservices/autoware.auto.foxy/install/state_estimation_node/lib/state_estimation_node/state_estimation_node_exe --ros-args -r __node:=state_estimation_node -r __node:=state_estimation_node -r __ns:=/state_estimation_namespace --params-file /home/whitleysoftwareservices/autoware.auto.foxy/install/state_estimation_node/share/state_estimation_node/param/state_estimation_node_test.param.yaml --params-file /tmp/launch_params_i3q9vv5z'].
    [INFO] [python3-2]: sending signal 'SIGINT' to process[python3-2]
    [INFO] [python3-2]: process has finished cleanly [pid 4239]
    ok

    ----------------------------------------------------------------------
    Ran 1 test in 0.254s

    OK
    test_exit_code (state_estimation_node.TestProcessOutput) ... FAIL

    ======================================================================
    FAIL: test_exit_code (state_estimation_node.TestProcessOutput)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/whitleysoftwareservices/autoware.auto.foxy/src/prediction/state_estimation_node/test/state_estimation_node_bad.test.py", line 76, in test_exit_code
        self.assertIn(ref_stderr, full_stderr)
    AssertionError: "what():  Please provide either 'output_frequency' setting or 'data_driven' one, not both." not found in "1608317348.993236 [1] state_esti: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter\n[ERROR] [1608317348.996267111] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error\n\n>>> [rcutils|error_handling.c:108] rcutils_set_error_state()\nThis error state is being overwritten:\n\n  'error not set, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:276'\n\nwith this new error message:\n\n  'rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:428'\n\nrcutils_reset_error() should be called after error handling to avoid this.\n<<<\n[ERROR] [1608317348.998946065] [rcl]: Failed to fini publisher for node: 1\nterminate called after throwing an instance of 'rclcpp::exceptions::RCLError'\n  what():  failed to initialize rcl node: rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:428\nqemu: uncaught target signal 6 (Aborted) - core dumped\n"

    ----------------------------------------------------------------------
    Ran 1 test in 0.007s

    FAILED (failures=1)
    -- run_test.py: return code 1
    -- run_test.py: verify result file '/home/whitleysoftwareservices/autoware.auto.foxy/build/state_estimation_node/test_results/state_estimation_node/test_state_estimation_node_bad.test.py.xunit.xml'
  >>>
build/state_estimation_node/test_results/state_estimation_node/test_state_estimation_node_bad.test.py.xunit.xml: 2 tests, 0 errors, 1 failure, 0 skipped
- state_estimation_node.TestProcessOutput test_exit_code
  <<< failure message
    Traceback (most recent call last):
      File "/home/whitleysoftwareservices/autoware.auto.foxy/src/prediction/state_estimation_node/test/state_estimation_node_bad.test.py", line 76, in test_exit_code
        self.assertIn(ref_stderr, full_stderr)
    AssertionError: "what():  Please provide either 'output_frequency' setting or 'data_driven' one, not both." not found in "1608317348.993236 [1] state_esti: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter\n[ERROR] [1608317348.996267111] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error\n\n>>> [rcutils|error_handling.c:108] rcutils_set_error_state()\nThis error state is being overwritten:\n\n  'error not set, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:276'\n\nwith this new error message:\n\n  'rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:428'\n\nrcutils_reset_error() should be called after error handling to avoid this.\n<<<\n[ERROR] [1608317348.998946065] [rcl]: Failed to fini publisher for node: 1\nterminate called after throwing an instance of 'rclcpp::exceptions::RCLError'\n  what():  failed to initialize rcl node: rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-foxy-rcl-1.1.9/src/rcl/node.c:428\nqemu: uncaught target signal 6 (Aborted) - core dumped\n"
  >>>

Additional information

This has been tested both on arm64-native hardware and using arm64 binary translation with QEMU. The test fails consistently in both Dashing and Foxy when using QEMU and occasionally on arm64-native hardware depending on the system load indicating that this is timing-dependent. Using rmw_fastrtps_cpp, this failure does not occur. We have also only seen this error on this one test indicating it's a pretty corner-case scenario.

For some context to aide in troubleshooting, this test intentionally provides the state_estimation_node with conflicting parameters data_driven: true and output_frequency: 30.0. Using the free function time_between_publish_requests in src/prediction/state_estimation_node/src/state_estimation_node.cpp with both of these parameters should throw a std::logic_error on line 74 before any publishers/subscribers are created but after the node has been constructed/initialized. In Dashing, there is no indication that this exception is ever thrown but Foxy does show the output with the correct exception, only after the unexpected rcl error.

eboasson commented 3 years ago

I'd say this:

    [state_estimation_node_exe-1] 1608317348.993236 [1] state_esti: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter

is the weird bit: then the DDS domain doesn't start, the participant can't be created, nor can the node be created, &c. So one would thing everything else is collateral damage. That error really means:

  return ddsrt_setsockopt (sock, IPPROTO_IP, IP_MULTICAST_IF, gv->ownloc.address + 12, 4);

failed, or really

  if (setsockopt(sock, level, optname, optval, optlen) == -1) {
    goto err_setsockopt;
  }

Unfortunately, many errors get mapped to BAD_PARAMETER, so we don't know exactly what it is. However, sock must be ok or it wouldn't have made it to this point. So it is most likely that the kernel is unhappy with the address.

There are some things that should be fairly simple to do and that ought to provide a bit more information:

I suspect that will show it always tries to set the multicast interface to use for outgoing packets, also when the socket isn't used for transmitting and even when the interface doesn't support multicast. Perhaps that is what the kernel doesn't like ... The good news is that the fix will be pretty straightforward if this is the case.

eboasson commented 3 years ago

You might want to give this:

https://github.com/eboasson/cyclonedds/tree/mc-options-only-when-needed

a try. It is based on the possibility that you are not actually using multicast and that there is therefore no need to set this particular option. Otherwise, I really need to know a bit more about what it actually passes into the kernel and which network interfaces exist (like I mentioned). I have no reason to believe the interface numbers Cyclone uses are incorrect, but I've been caught by stupid bugs before.

As another data point, it doesn't happen on Ubuntu 20.04 on qemu on macOS on an M1 ... that's not emulation, of course, but still emulated devices.

JWhitleyWork commented 3 years ago

@eboasson Do you know how to easily replace the installed CycloneDDS installation with this one, built from source? Just adding the repository from my workspace, colcon doesn't seem to recognize that it's a deep-down dependency when building with --packages-up-to.

eboasson commented 3 years ago

@JWhitleyWork I used to do --packages-up-to rmw_cyclonedds_cpp but for rolling/galatic that's no longer needed. The easiest should be to just do colcon build --packages-select cyclonedds, all you need is the new libddsc.so.0.7.0 (or whatever the number is).

What version of rmw_cyclonedds_cpp are you using, by the way? Because the RMW layer uses some things that aren't part of the stable API to paper over differences between ROS2 and DDS and some changes happened that are binary compatible but not quite source compatible. The patch I did should cherry-pick cleanly to just about any version of Cyclone if needed.

jpace121 commented 2 years ago

I'm seeing a similar error message (ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter) when trying to use a publisher in a arm64 Ubuntu container that's running on a 64-bit Debian 11 host using qemu-user-static for the emulation.

I was able to replicate the error with a DDS publisher without ros 2 (using cyclonedds-cxx). I tried applying the commit @eboasson linked to in his branch above to cyclonedds, but the commit didn't apply cleanly to the latest version of cyclonedds.

eboasson commented 2 years ago

@jpace121, I rebased https://github.com/eboasson/cyclonedds/tree/mc-options-only-when-needed and would appreciate it very much if you gave it another go.

jpace121 commented 2 years ago

That didn't seem to fix it. With that being said, I noticed tonight that I'm also getting the following warning on stdout:

qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted (core dumped)

so I'm guessing my bug may be caused by something going on in qemu, not cyclone.

Thanks for the assistance.

eboasson commented 2 years ago

@jpace121 I imagine any qemu crash would be rather serious, I'm sure they'd like to hear of it.

Anyway, the point of that branch was to avoid doing setsockopt(... IP_MULTICAST_IF ...) when multicast is not used. One reason was that it seemed plausible (however remotely) that it might give an error if multicast wasn't enabled somewhere in the kernel or below; the other reason is that it should at least give a workaround by disabling multicast in Cyclone.

One thing I have wondered about is what it is actually passing into the kernel. As these calls happen really early on in the process and it sounds like it is fairly easily reproduced (especially now you don't even need to involve ROS), perhaps it would be possible to log the system calls with strace -v (I think that's the option that gives all the gory details; possibly -e network would be sensible too). That would at least give us some more understanding of what goes on when it fails.

tfoote commented 2 years ago

I can reproduce this in a much shorter invocation using the example rclcpp executables. Below is on Galactic, I saw it first on foxy.

$ docker run --platform=linux/aarch64 -ti ros bash
Unable to find image 'ros:latest' locally
latest: Pulling from library/ros
Digest: sha256:dec83e0668b43f0f7734935d91a1c9c7a89fba6dd878cd2bd6aedffd196132c3
Status: Downloaded newer image for ros:latest
root@66357d5ba0db:/# apt-get update -qq && apt-get install -qqy ros-galactic-examples-rclcpp-minimal-subscriber ros-galactic-examples-rclcpp-minimal-publisher
## INSTALL LOGS TRUNCATED
root@66357d5ba0db:/# . /opt/ros/galactic/setup.bash
root@66357d5ba0db:/# export ROS_DOMAIN_ID=84
root@66357d5ba0db:/# ros2 run examples_rclcpp_minimal_publisher publisher_member_function
Unsupported setsockopt level=0 optname=32
1643936991.678175 [84] publisher_: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter
[ERROR] [1643936991.681629437] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error

>>> [rcutils|error_handling.c:108] rcutils_set_error_state()
This error state is being overwritten:

  'error not set, at /tmp/binarydeb/ros-galactic-rcl-3.1.2/src/rcl/node.c:261'

with this new error message:

  'rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-galactic-rcl-3.1.2/src/rcl/node.c:413'

rcutils_reset_error() should be called after error handling to avoid this.
<<<
[ERROR] [1643936991.685166614] [rcl]: Failed to fini publisher for node: 1
terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
  what():  failed to initialize rcl node: rcl node's rmw handle is invalid, at /tmp/binarydeb/ros-galactic-rcl-3.1.2/src/rcl/node.c:413
qemu: uncaught target signal 6 (Aborted) - core dumped
eboasson commented 2 years ago

Hi @tfoote, one would think the patch in linked to in https://github.com/ros2/rmw_cyclonedds/issues/273#issuecomment-930915050 will fix it by not doing IP_MULTICAST_IF when multicast is not enabled. It is not in the 0.8.x because I didn't want to change that branch without having confirmed that it solved the problem, but if it works I'll merge it.

But I'd still like to know what's going on, and so if you have a chance to run reproduce with Cyclone tracing enabled (i.e., export CYCLONEDDS_URI='<Tr><C>trace</C><Out>cdds.log.${CYCLONEDDS_PID}</></>' and with strace to get the exact parameters passed to the system call, I'd be grateful. The latter is the only way I can think of to be absolutely certain that it gets passed the right parameters (I think that's likely, I just want to be sure), and the former has everything to tell me if it ends up with multicast enabled or disabled internally.

tfoote commented 2 years ago

Here's an example of the cdds.log generated.

``` root@caa9f7853d37:/# cat cdds.log.2608 1649459589.202664 [84] publisher_: config: Domain/General/NetworkInterfaceAddress/#text: auto {} 1649459589.204019 [84] publisher_: config: Domain/General/MulticastRecvNetworkInterfaceAddresses/#text: preferred {} 1649459589.204106 [84] publisher_: config: Domain/General/ExternalNetworkAddress/#text: auto {} 1649459589.204146 [84] publisher_: config: Domain/General/ExternalNetworkMask/#text: 0.0.0.0 {} 1649459589.204184 [84] publisher_: config: Domain/General/AllowMulticast/#text: default {} 1649459589.204226 [84] publisher_: config: Domain/General/PreferMulticast/#text: false {} 1649459589.204257 [84] publisher_: config: Domain/General/MulticastTimeToLive/#text: 32 {} 1649459589.204276 [84] publisher_: config: Domain/General/DontRoute/#text: false {} 1649459589.204336 [84] publisher_: config: Domain/General/Transport/#text: udp {} 1649459589.204367 [84] publisher_: config: Domain/General/EnableMulticastLoopback/#text: true {} 1649459589.204495 [84] publisher_: config: Domain/General/MaxMessageSize/#text: 14720 B {} 1649459589.204516 [84] publisher_: config: Domain/General/MaxRexmitMessageSize/#text: 1456 B {} 1649459589.204549 [84] publisher_: config: Domain/General/FragmentSize/#text: 1344 B {} 1649459589.204568 [84] publisher_: config: Domain/General/RedundantNetworking/#text: false {} 1649459589.204626 [84] publisher_: config: Domain/Sizing/ReceiveBufferSize/#text: 1 MiB {} 1649459589.204646 [84] publisher_: config: Domain/Sizing/ReceiveBufferChunkSize/#text: 128 KiB {} 1649459589.204703 [84] publisher_: config: Domain/Compatibility/StandardsConformance/#text: lax {} 1649459589.204727 [84] publisher_: config: Domain/Compatibility/ExplicitlyPublishQosSetToDefault/#text: false {} 1649459589.204772 [84] publisher_: config: Domain/Compatibility/ManySocketsMode/#text: single {} 1649459589.204791 [84] publisher_: config: Domain/Compatibility/AssumeRtiHasPmdEndpoints/#text: false {} 1649459589.204809 [84] publisher_: config: Domain/Discovery/Tag/#text: {} 1649459589.204846 [84] publisher_: config: Domain/Discovery/ExternalDomainId/#text: 84 {} 1649459589.204885 [84] publisher_: config: Domain/Discovery/DSGracePeriod/#text: 30 s {} 1649459589.204924 [84] publisher_: config: Domain/Discovery/ParticipantIndex/#text: none {} 1649459589.204943 [84] publisher_: config: Domain/Discovery/MaxAutoParticipantIndex/#text: 9 {} 1649459589.204960 [84] publisher_: config: Domain/Discovery/SPDPMulticastAddress/#text: 239.255.0.1 {} 1649459589.204979 [84] publisher_: config: Domain/Discovery/SPDPInterval/#text: 30 s {} 1649459589.204997 [84] publisher_: config: Domain/Discovery/DefaultMulticastAddress/#text: auto {} 1649459589.205030 [84] publisher_: config: Domain/Discovery/Ports/Base/#text: 7400 {} 1649459589.205050 [84] publisher_: config: Domain/Discovery/Ports/DomainGain/#text: 250 {} 1649459589.205069 [84] publisher_: config: Domain/Discovery/Ports/ParticipantGain/#text: 2 {} 1649459589.205089 [84] publisher_: config: Domain/Discovery/Ports/MulticastMetaOffset/#text: 0 {} 1649459589.205108 [84] publisher_: config: Domain/Discovery/Ports/UnicastMetaOffset/#text: 10 {} 1649459589.205134 [84] publisher_: config: Domain/Discovery/Ports/MulticastDataOffset/#text: 1 {} 1649459589.205154 [84] publisher_: config: Domain/Discovery/Ports/UnicastDataOffset/#text: 11 {} 1649459589.205699 [84] publisher_: config: Domain/Tracing/Category/#text: trace {0} 1649459589.205760 [84] publisher_: config: Domain/Tracing/OutputFile/#text: cdds.log.2608 {0} 1649459589.205780 [84] publisher_: config: Domain/Tracing/AppendToFile/#text: false {} 1649459589.205799 [84] publisher_: config: Domain/Tracing/PacketCaptureFile/#text: {} 1649459589.205818 [84] publisher_: config: Domain/Internal/DeliveryQueueMaxSamples/#text: 256 {} 1649459589.205836 [84] publisher_: config: Domain/Internal/PrimaryReorderMaxSamples/#text: 128 {} 1649459589.205854 [84] publisher_: config: Domain/Internal/SecondaryReorderMaxSamples/#text: 128 {} 1649459589.205873 [84] publisher_: config: Domain/Internal/DefragUnreliableMaxSamples/#text: 4 {} 1649459589.205890 [84] publisher_: config: Domain/Internal/DefragReliableMaxSamples/#text: 16 {} 1649459589.205960 [84] publisher_: config: Domain/Internal/BuiltinEndpointSet/#text: writers {} 1649459589.205980 [84] publisher_: config: Domain/Internal/MeasureHbToAckLatency/#text: false {} 1649459589.205998 [84] publisher_: config: Domain/Internal/UnicastResponseToSPDPMessages/#text: true {} 1649459589.206016 [84] publisher_: config: Domain/Internal/SynchronousDeliveryPriorityThreshold/#text: 0 {} 1649459589.206038 [84] publisher_: config: Domain/Internal/SynchronousDeliveryLatencyBound/#text: inf {} 1649459589.206057 [84] publisher_: config: Domain/Internal/MaxParticipants/#text: 0 {} 1649459589.206075 [84] publisher_: config: Domain/Internal/AccelerateRexmitBlockSize/#text: 0 {} 1649459589.206120 [84] publisher_: config: Domain/Internal/RetransmitMerging/#text: never {} 1649459589.206140 [84] publisher_: config: Domain/Internal/RetransmitMergingPeriod/#text: 5 ms {} 1649459589.206158 [84] publisher_: config: Domain/Internal/HeartbeatInterval/#text: 100 ms {} 1649459589.206198 [84] publisher_: config: Domain/Internal/HeartbeatInterval[@min]: 5 ms {} 1649459589.206218 [84] publisher_: config: Domain/Internal/HeartbeatInterval[@minsched]: 20 ms {} 1649459589.206236 [84] publisher_: config: Domain/Internal/HeartbeatInterval[@max]: 8 s {} 1649459589.206255 [84] publisher_: config: Domain/Internal/MaxQueuedRexmitBytes/#text: 512 KiB {} 1649459589.206273 [84] publisher_: config: Domain/Internal/MaxQueuedRexmitMessages/#text: 200 {} 1649459589.206291 [84] publisher_: config: Domain/Internal/LeaseDuration/#text: 10 s {} 1649459589.206309 [84] publisher_: config: Domain/Internal/WriterLingerDuration/#text: 1 s {} 1649459589.206346 [84] publisher_: config: Domain/Internal/MinimumSocketReceiveBufferSize/#text: default {} 1649459589.206365 [84] publisher_: config: Domain/Internal/MinimumSocketSendBufferSize/#text: 64 KiB {} 1649459589.206383 [84] publisher_: config: Domain/Internal/NackDelay/#text: 100 ms {} 1649459589.206401 [84] publisher_: config: Domain/Internal/AckDelay/#text: 10 ms {} 1649459589.206419 [84] publisher_: config: Domain/Internal/AutoReschedNackDelay/#text: 1 s {} 1649459589.206437 [84] publisher_: config: Domain/Internal/PreEmptiveAckDelay/#text: 10 ms {} 1649459589.206459 [84] publisher_: config: Domain/Internal/ScheduleTimeRounding/#text: 0 s {} 1649459589.206478 [84] publisher_: config: Domain/Internal/DDSI2DirectMaxThreads/#text: 1 {} 1649459589.206496 [84] publisher_: config: Domain/Internal/SquashParticipants/#text: false {} 1649459589.206513 [84] publisher_: config: Domain/Internal/SPDPResponseMaxDelay/#text: 0 s {} 1649459589.206530 [84] publisher_: config: Domain/Internal/LateAckMode/#text: false {} 1649459589.206548 [84] publisher_: config: Domain/Internal/RetryOnRejectBestEffort/#text: false {} 1649459589.206565 [84] publisher_: config: Domain/Internal/GenerateKeyhash/#text: false {} 1649459589.206584 [84] publisher_: config: Domain/Internal/MaxSampleSize/#text: 2147483647 B {} 1649459589.206601 [84] publisher_: config: Domain/Internal/WriteBatch/#text: false {} 1649459589.206619 [84] publisher_: config: Domain/Internal/LivelinessMonitoring/#text: false {} 1649459589.206637 [84] publisher_: config: Domain/Internal/LivelinessMonitoring[@StackTraces]: true {} 1649459589.206655 [84] publisher_: config: Domain/Internal/LivelinessMonitoring[@Interval]: 1 s {} 1649459589.206718 [84] publisher_: config: Domain/Internal/MonitorPort/#text: -1 {} 1649459589.206737 [84] publisher_: config: Domain/Internal/AssumeMulticastCapable/#text: {} 1649459589.206755 [84] publisher_: config: Domain/Internal/PrioritizeRetransmit/#text: true {} 1649459589.206772 [84] publisher_: config: Domain/Internal/UseMulticastIfMreqn/#text: 0 {} 1649459589.206790 [84] publisher_: config: Domain/Internal/RediscoveryBlacklistDuration/#text: 0 s {} 1649459589.206809 [84] publisher_: config: Domain/Internal/RediscoveryBlacklistDuration[@enforce]: false {} 1649459589.206855 [84] publisher_: config: Domain/Internal/MultipleReceiveThreads/#text: default {} 1649459589.206875 [84] publisher_: config: Domain/Internal/MultipleReceiveThreads[@maxretries]: 4294967295 {} 1649459589.206899 [84] publisher_: config: Domain/Internal/Test/XmitLossiness/#text: 0 {} 1649459589.206920 [84] publisher_: config: Domain/Internal/Watermarks/WhcLow/#text: 1 KiB {} 1649459589.206940 [84] publisher_: config: Domain/Internal/Watermarks/WhcHigh/#text: 500 KiB {} 1649459589.206967 [84] publisher_: config: Domain/Internal/Watermarks/WhcHighInit/#text: 30 KiB {} 1649459589.206987 [84] publisher_: config: Domain/Internal/Watermarks/WhcAdaptive/#text: true {} 1649459589.207007 [84] publisher_: config: Domain/Internal/BurstSize/MaxRexmit/#text: 1 MiB {} 1649459589.207027 [84] publisher_: config: Domain/Internal/BurstSize/MaxInitTransmit/#text: 4294967295 {} 1649459589.207060 [84] publisher_: config: Domain/Internal/EnableExpensiveChecks/#text: {} 1649459589.207080 [84] publisher_: config: Domain/TCP/NoDelay/#text: true {} 1649459589.207098 [84] publisher_: config: Domain/TCP/Port/#text: -1 {} 1649459589.207117 [84] publisher_: config: Domain/TCP/ReadTimeout/#text: 2 s {} 1649459589.207135 [84] publisher_: config: Domain/TCP/WriteTimeout/#text: 2 s {} 1649459589.207153 [84] publisher_: config: Domain/TCP/AlwaysUsePeeraddrForUnicast/#text: false {} 1649459589.207171 [84] publisher_: config: Domain/SSL/Enable/#text: false {} 1649459589.207188 [84] publisher_: config: Domain/SSL/CertificateVerification/#text: true {} 1649459589.207206 [84] publisher_: config: Domain/SSL/VerifyClient/#text: true {} 1649459589.207224 [84] publisher_: config: Domain/SSL/SelfSignedCertificates/#text: false {} 1649459589.207242 [84] publisher_: config: Domain/SSL/KeystoreFile/#text: keystore {} 1649459589.207259 [84] publisher_: config: Domain/SSL/KeyPassphrase/#text: secret {} 1649459589.207277 [84] publisher_: config: Domain/SSL/Ciphers/#text: ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH {} 1649459589.207294 [84] publisher_: config: Domain/SSL/EntropyFile/#text: {} 1649459589.207328 [84] publisher_: config: Domain/SSL/MinimumTLSVersion/#text: 1.3 {} 1649459589.207347 [84] publisher_: config: Domain/SharedMemory/Enable/#text: false {} 1649459589.207364 [84] publisher_: config: Domain/SharedMemory/Locator/#text: {} 1649459589.207381 [84] publisher_: config: Domain/SharedMemory/Prefix/#text: DDS_CYCLONE {} 1649459589.207438 [84] publisher_: config: Domain/SharedMemory/LogLevel/#text: info {} 1649459589.207457 [84] publisher_: config: Domain/SharedMemory/SubQueueCapacity/#text: 256 {} 1649459589.207475 [84] publisher_: config: Domain/SharedMemory/SubHistoryRequest/#text: 16 {} 1649459589.207492 [84] publisher_: config: Domain/SharedMemory/PubHistoryCapacity/#text: 16 {} 1649459589.207526 [84] publisher_: config: Domain[@Id]: 84 {} 1649459589.211903 [84] publisher_: started at 1649459589.06207764 -- 2022-04-08 23:13:09+00:00 1649459589.212057 [84] publisher_: udp initialized 1649459589.215910 [84] publisher_: interfaces: lo udp/127.0.0.1(q1) eth0 udp/172.17.0.3(q9) 1649459589.216189 [84] publisher_: selected interfaces: eth0 (index 21) 1649459589.216276 [84] publisher_: presumed robust multicast support, use for everything 1649459589.217911 [84] publisher_: ownip: udp/172.17.0.3 1649459589.217953 [84] publisher_: extmask: invalid/0 (not applicable) 1649459589.217979 [84] publisher_: SPDP MC: udp/239.255.0.1 1649459589.218003 [84] publisher_: default MC: udp/239.255.0.1 1649459589.218019 [84] publisher_: SSM support included 1649459589.221106 [84] publisher_: failed to increase socket receive buffer size to 1048576 bytes, continuing with 425984 bytes 1649459589.221357 [84] publisher_: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter 1649459589.223184 [84] publisher_: udp finalized root@caa9f7853d37:/# cat cdds.log.2519 1649459526.660830 [84] publisher_: config: Domain/General/NetworkInterfaceAddress/#text: auto {} 1649459526.662203 [84] publisher_: config: Domain/General/MulticastRecvNetworkInterfaceAddresses/#text: preferred {} 1649459526.662290 [84] publisher_: config: Domain/General/ExternalNetworkAddress/#text: auto {} 1649459526.662330 [84] publisher_: config: Domain/General/ExternalNetworkMask/#text: 0.0.0.0 {} 1649459526.662380 [84] publisher_: config: Domain/General/AllowMulticast/#text: default {} 1649459526.662424 [84] publisher_: config: Domain/General/PreferMulticast/#text: false {} 1649459526.662461 [84] publisher_: config: Domain/General/MulticastTimeToLive/#text: 32 {} 1649459526.662490 [84] publisher_: config: Domain/General/DontRoute/#text: false {} 1649459526.662556 [84] publisher_: config: Domain/General/Transport/#text: udp {} 1649459526.662588 [84] publisher_: config: Domain/General/EnableMulticastLoopback/#text: true {} 1649459526.662718 [84] publisher_: config: Domain/General/MaxMessageSize/#text: 14720 B {} 1649459526.662739 [84] publisher_: config: Domain/General/MaxRexmitMessageSize/#text: 1456 B {} 1649459526.662772 [84] publisher_: config: Domain/General/FragmentSize/#text: 1344 B {} 1649459526.662790 [84] publisher_: config: Domain/General/RedundantNetworking/#text: false {} 1649459526.662850 [84] publisher_: config: Domain/Sizing/ReceiveBufferSize/#text: 1 MiB {} 1649459526.662869 [84] publisher_: config: Domain/Sizing/ReceiveBufferChunkSize/#text: 128 KiB {} 1649459526.662929 [84] publisher_: config: Domain/Compatibility/StandardsConformance/#text: lax {} 1649459526.662963 [84] publisher_: config: Domain/Compatibility/ExplicitlyPublishQosSetToDefault/#text: false {} 1649459526.663012 [84] publisher_: config: Domain/Compatibility/ManySocketsMode/#text: single {} 1649459526.663031 [84] publisher_: config: Domain/Compatibility/AssumeRtiHasPmdEndpoints/#text: false {} 1649459526.663049 [84] publisher_: config: Domain/Discovery/Tag/#text: {} 1649459526.663086 [84] publisher_: config: Domain/Discovery/ExternalDomainId/#text: 84 {} 1649459526.663126 [84] publisher_: config: Domain/Discovery/DSGracePeriod/#text: 30 s {} 1649459526.663166 [84] publisher_: config: Domain/Discovery/ParticipantIndex/#text: none {} 1649459526.663184 [84] publisher_: config: Domain/Discovery/MaxAutoParticipantIndex/#text: 9 {} 1649459526.663202 [84] publisher_: config: Domain/Discovery/SPDPMulticastAddress/#text: 239.255.0.1 {} 1649459526.663220 [84] publisher_: config: Domain/Discovery/SPDPInterval/#text: 30 s {} 1649459526.663238 [84] publisher_: config: Domain/Discovery/DefaultMulticastAddress/#text: auto {} 1649459526.663271 [84] publisher_: config: Domain/Discovery/Ports/Base/#text: 7400 {} 1649459526.663292 [84] publisher_: config: Domain/Discovery/Ports/DomainGain/#text: 250 {} 1649459526.663311 [84] publisher_: config: Domain/Discovery/Ports/ParticipantGain/#text: 2 {} 1649459526.663330 [84] publisher_: config: Domain/Discovery/Ports/MulticastMetaOffset/#text: 0 {} 1649459526.663350 [84] publisher_: config: Domain/Discovery/Ports/UnicastMetaOffset/#text: 10 {} 1649459526.663369 [84] publisher_: config: Domain/Discovery/Ports/MulticastDataOffset/#text: 1 {} 1649459526.663388 [84] publisher_: config: Domain/Discovery/Ports/UnicastDataOffset/#text: 11 {} 1649459526.663926 [84] publisher_: config: Domain/Tracing/Category/#text: trace {0} 1649459526.663985 [84] publisher_: config: Domain/Tracing/OutputFile/#text: cdds.log.2519 {0} 1649459526.664005 [84] publisher_: config: Domain/Tracing/AppendToFile/#text: false {} 1649459526.664023 [84] publisher_: config: Domain/Tracing/PacketCaptureFile/#text: {} 1649459526.664042 [84] publisher_: config: Domain/Internal/DeliveryQueueMaxSamples/#text: 256 {} 1649459526.664060 [84] publisher_: config: Domain/Internal/PrimaryReorderMaxSamples/#text: 128 {} 1649459526.664077 [84] publisher_: config: Domain/Internal/SecondaryReorderMaxSamples/#text: 128 {} 1649459526.664095 [84] publisher_: config: Domain/Internal/DefragUnreliableMaxSamples/#text: 4 {} 1649459526.664113 [84] publisher_: config: Domain/Internal/DefragReliableMaxSamples/#text: 16 {} 1649459526.664181 [84] publisher_: config: Domain/Internal/BuiltinEndpointSet/#text: writers {} 1649459526.664201 [84] publisher_: config: Domain/Internal/MeasureHbToAckLatency/#text: false {} 1649459526.664219 [84] publisher_: config: Domain/Internal/UnicastResponseToSPDPMessages/#text: true {} 1649459526.664247 [84] publisher_: config: Domain/Internal/SynchronousDeliveryPriorityThreshold/#text: 0 {} 1649459526.664275 [84] publisher_: config: Domain/Internal/SynchronousDeliveryLatencyBound/#text: inf {} 1649459526.664294 [84] publisher_: config: Domain/Internal/MaxParticipants/#text: 0 {} 1649459526.664312 [84] publisher_: config: Domain/Internal/AccelerateRexmitBlockSize/#text: 0 {} 1649459526.664358 [84] publisher_: config: Domain/Internal/RetransmitMerging/#text: never {} 1649459526.664377 [84] publisher_: config: Domain/Internal/RetransmitMergingPeriod/#text: 5 ms {} 1649459526.664395 [84] publisher_: config: Domain/Internal/HeartbeatInterval/#text: 100 ms {} 1649459526.664435 [84] publisher_: config: Domain/Internal/HeartbeatInterval[@min]: 5 ms {} 1649459526.664454 [84] publisher_: config: Domain/Internal/HeartbeatInterval[@minsched]: 20 ms {} 1649459526.664473 [84] publisher_: config: Domain/Internal/HeartbeatInterval[@max]: 8 s {} 1649459526.664491 [84] publisher_: config: Domain/Internal/MaxQueuedRexmitBytes/#text: 512 KiB {} 1649459526.664509 [84] publisher_: config: Domain/Internal/MaxQueuedRexmitMessages/#text: 200 {} 1649459526.664527 [84] publisher_: config: Domain/Internal/LeaseDuration/#text: 10 s {} 1649459526.664545 [84] publisher_: config: Domain/Internal/WriterLingerDuration/#text: 1 s {} 1649459526.664581 [84] publisher_: config: Domain/Internal/MinimumSocketReceiveBufferSize/#text: default {} 1649459526.664600 [84] publisher_: config: Domain/Internal/MinimumSocketSendBufferSize/#text: 64 KiB {} 1649459526.664619 [84] publisher_: config: Domain/Internal/NackDelay/#text: 100 ms {} 1649459526.664636 [84] publisher_: config: Domain/Internal/AckDelay/#text: 10 ms {} 1649459526.664654 [84] publisher_: config: Domain/Internal/AutoReschedNackDelay/#text: 1 s {} 1649459526.664672 [84] publisher_: config: Domain/Internal/PreEmptiveAckDelay/#text: 10 ms {} 1649459526.664694 [84] publisher_: config: Domain/Internal/ScheduleTimeRounding/#text: 0 s {} 1649459526.664712 [84] publisher_: config: Domain/Internal/DDSI2DirectMaxThreads/#text: 1 {} 1649459526.664730 [84] publisher_: config: Domain/Internal/SquashParticipants/#text: false {} 1649459526.664747 [84] publisher_: config: Domain/Internal/SPDPResponseMaxDelay/#text: 0 s {} 1649459526.664764 [84] publisher_: config: Domain/Internal/LateAckMode/#text: false {} 1649459526.664782 [84] publisher_: config: Domain/Internal/RetryOnRejectBestEffort/#text: false {} 1649459526.664799 [84] publisher_: config: Domain/Internal/GenerateKeyhash/#text: false {} 1649459526.664817 [84] publisher_: config: Domain/Internal/MaxSampleSize/#text: 2147483647 B {} 1649459526.664835 [84] publisher_: config: Domain/Internal/WriteBatch/#text: false {} 1649459526.664852 [84] publisher_: config: Domain/Internal/LivelinessMonitoring/#text: false {} 1649459526.664870 [84] publisher_: config: Domain/Internal/LivelinessMonitoring[@StackTraces]: true {} 1649459526.664889 [84] publisher_: config: Domain/Internal/LivelinessMonitoring[@Interval]: 1 s {} 1649459526.664953 [84] publisher_: config: Domain/Internal/MonitorPort/#text: -1 {} 1649459526.664972 [84] publisher_: config: Domain/Internal/AssumeMulticastCapable/#text: {} 1649459526.664989 [84] publisher_: config: Domain/Internal/PrioritizeRetransmit/#text: true {} 1649459526.665007 [84] publisher_: config: Domain/Internal/UseMulticastIfMreqn/#text: 0 {} 1649459526.665025 [84] publisher_: config: Domain/Internal/RediscoveryBlacklistDuration/#text: 0 s {} 1649459526.665043 [84] publisher_: config: Domain/Internal/RediscoveryBlacklistDuration[@enforce]: false {} 1649459526.665088 [84] publisher_: config: Domain/Internal/MultipleReceiveThreads/#text: default {} 1649459526.665108 [84] publisher_: config: Domain/Internal/MultipleReceiveThreads[@maxretries]: 4294967295 {} 1649459526.665142 [84] publisher_: config: Domain/Internal/Test/XmitLossiness/#text: 0 {} 1649459526.665164 [84] publisher_: config: Domain/Internal/Watermarks/WhcLow/#text: 1 KiB {} 1649459526.665183 [84] publisher_: config: Domain/Internal/Watermarks/WhcHigh/#text: 500 KiB {} 1649459526.665210 [84] publisher_: config: Domain/Internal/Watermarks/WhcHighInit/#text: 30 KiB {} 1649459526.665230 [84] publisher_: config: Domain/Internal/Watermarks/WhcAdaptive/#text: true {} 1649459526.665250 [84] publisher_: config: Domain/Internal/BurstSize/MaxRexmit/#text: 1 MiB {} 1649459526.665269 [84] publisher_: config: Domain/Internal/BurstSize/MaxInitTransmit/#text: 4294967295 {} 1649459526.665303 [84] publisher_: config: Domain/Internal/EnableExpensiveChecks/#text: {} 1649459526.665323 [84] publisher_: config: Domain/TCP/NoDelay/#text: true {} 1649459526.665340 [84] publisher_: config: Domain/TCP/Port/#text: -1 {} 1649459526.665359 [84] publisher_: config: Domain/TCP/ReadTimeout/#text: 2 s {} 1649459526.665377 [84] publisher_: config: Domain/TCP/WriteTimeout/#text: 2 s {} 1649459526.665394 [84] publisher_: config: Domain/TCP/AlwaysUsePeeraddrForUnicast/#text: false {} 1649459526.665412 [84] publisher_: config: Domain/SSL/Enable/#text: false {} 1649459526.665430 [84] publisher_: config: Domain/SSL/CertificateVerification/#text: true {} 1649459526.665447 [84] publisher_: config: Domain/SSL/VerifyClient/#text: true {} 1649459526.665465 [84] publisher_: config: Domain/SSL/SelfSignedCertificates/#text: false {} 1649459526.665482 [84] publisher_: config: Domain/SSL/KeystoreFile/#text: keystore {} 1649459526.665499 [84] publisher_: config: Domain/SSL/KeyPassphrase/#text: secret {} 1649459526.665517 [84] publisher_: config: Domain/SSL/Ciphers/#text: ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH {} 1649459526.665533 [84] publisher_: config: Domain/SSL/EntropyFile/#text: {} 1649459526.665567 [84] publisher_: config: Domain/SSL/MinimumTLSVersion/#text: 1.3 {} 1649459526.665585 [84] publisher_: config: Domain/SharedMemory/Enable/#text: false {} 1649459526.665603 [84] publisher_: config: Domain/SharedMemory/Locator/#text: {} 1649459526.665620 [84] publisher_: config: Domain/SharedMemory/Prefix/#text: DDS_CYCLONE {} 1649459526.665675 [84] publisher_: config: Domain/SharedMemory/LogLevel/#text: info {} 1649459526.665694 [84] publisher_: config: Domain/SharedMemory/SubQueueCapacity/#text: 256 {} 1649459526.665712 [84] publisher_: config: Domain/SharedMemory/SubHistoryRequest/#text: 16 {} 1649459526.665730 [84] publisher_: config: Domain/SharedMemory/PubHistoryCapacity/#text: 16 {} 1649459526.665763 [84] publisher_: config: Domain[@Id]: 84 {} 1649459526.670149 [84] publisher_: started at 1649459526.06665991 -- 2022-04-08 23:12:06+00:00 1649459526.670307 [84] publisher_: udp initialized 1649459526.674196 [84] publisher_: interfaces: lo udp/127.0.0.1(q1) eth0 udp/172.17.0.3(q9) 1649459526.674568 [84] publisher_: selected interfaces: eth0 (index 21) 1649459526.674678 [84] publisher_: presumed robust multicast support, use for everything 1649459526.676403 [84] publisher_: ownip: udp/172.17.0.3 1649459526.676448 [84] publisher_: extmask: invalid/0 (not applicable) 1649459526.676473 [84] publisher_: SPDP MC: udp/239.255.0.1 1649459526.676496 [84] publisher_: default MC: udp/239.255.0.1 1649459526.676513 [84] publisher_: SSM support included 1649459526.679645 [84] publisher_: failed to increase socket receive buffer size to 1048576 bytes, continuing with 425984 bytes 1649459526.679913 [84] publisher_: ddsi_udp_create_conn: set IP_MULTICAST_IF failed: Bad Parameter 1649459526.681850 [84] publisher_: udp finalized root@caa9f7853d37:/# ```

To reproduce this:

docker run --platform=linux/aarch64 -ti ros bash

And inside:

export CYCLONEDDS_URI='<Tr><C>trace</C><Out>cdds.log.${CYCLONEDDS_PID}</></>'
apt-get install -qqy ros-galactic-examples-rclcpp-minimal-subscriber ros-galactic-examples-rclcpp-minimal-publisher
. /opt/ros/galactic/setup.bash
export ROS_DOMAIN_ID=84
apt-get install strace
strace /opt/ros/galactic/lib/examples_rclcpp_minimal_publisher/publisher_member_function
cat cdds.log.*