eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds

ros2 commands fail on ROS 2 Jazzy when RMW_IMPLEMENTATION=rmw_cyclonedds_cpp #2043

Closed by rubenanapu 5 months ago

rubenanapu commented 5 months ago

Description

I have installed ROS 2 Jazzy on Ubuntu 22.04.

If I just run ros2 topic list and ros2 run demo_nodes_cpp talker on the same PC, everything works.

But if I want to see topics from another PC, it fails after setting the variables below:

export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
export CYCLONEDDS_URI=/path/to/my/cyclonedds.xml

After exporting this, any ros2 commands fail.

$ ros2 run demo_nodes_cpp talker
*** buffer overflow detected ***: terminated
$ ros2 topic list
*** buffer overflow detected ***: terminated
Aborted (core dumped)

If I run ros2 daemon stop I don't have any errors, but as soon as I try ros2 topic list again I have the same error.

My cyclonedds.xml file is as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://cdds.io/config https://raw.githubusercontent.com/eclipse-cyclonedds/cyclonedds/master/etc/cyclonedds.xsd">
    <Domain id="any">
        <General>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>6550B</MaxMessageSize>
            <FragmentSize>4000B</FragmentSize>
            <Transport>udp6</Transport>
        </General>
        <Discovery>
            <Peers>
                <Peer address="remote-pc"/>
                <Peer address="IPV6-HUSARNET-IP-OF-REMOTE-PC"/>
                <Peer address="IPV6-HUSARNET-IP-OF-LOCAL-PC"/>
                <Peer address="husarnet-local"/>
                <Peer address="master"/>
            </Peers>
            <MaxAutoParticipantIndex>100</MaxAutoParticipantIndex>
            <ParticipantIndex>auto</ParticipantIndex>
        </Discovery>
        <Internal>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
        <Tracing>
            <Verbosity>severe</Verbosity>
            <OutputFile>stdout</OutputFile>
        </Tracing>
    </Domain>
</CycloneDDS>
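
Since the config above lists host names as peers while forcing the udp6 transport, one quick sanity check is whether each peer resolves to an IPv6 address at all. The following is a minimal Python sketch, not something taken from the Cyclone DDS tooling: the function names are illustrative, it assumes CYCLONEDDS_URI holds a plain file path as in this report, and it uses the operating system resolver, which may differ from how Cyclone DDS itself resolves names. It is not claimed to explain the crash.

```python
import os
import socket
import xml.etree.ElementTree as ET

# Default namespace declared on the <CycloneDDS> root element of the config.
NS = {"cdds": "https://cdds.io/config"}


def peer_addresses(root):
    """Return the address attribute of every <Peer> element under root."""
    return [peer.get("address") for peer in root.iterfind(".//cdds:Peer", NS)]


def resolves_to_ipv6(host):
    """Return True if the OS resolver yields at least one IPv6 address for host."""
    try:
        return bool(socket.getaddrinfo(host, None, socket.AF_INET6))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


if __name__ == "__main__":
    # Assumption: the variable holds a file path, not a file:// URI or inline XML.
    path = os.environ.get("CYCLONEDDS_URI", "")
    if os.path.isfile(path):
        for address in peer_addresses(ET.parse(path).getroot()):
            status = "ok" if resolves_to_ipv6(address) else "does NOT resolve to IPv6"
            print(f"{address}: {status}")
    else:
        print("CYCLONEDDS_URI is not a plain file path; nothing checked")
```

Run it in the same shell where CYCLONEDDS_URI is exported. Placeholder entries such as IPV6-HUSARNET-IP-OF-REMOTE-PC will naturally be reported as unresolved until they are replaced with real addresses.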

It is worth mentioning that the IPv6 addresses are obtained through Husarnet.

If I try the same thing using ROS 2 Humble, Galactic, or Iron, I don't have this problem. It only happens with Jazzy. Also, I tested RMW_IMPLEMENTATION=rmw_fastrtps_cpp and with fastrtps I don't have this error. The problem seems to be only with Cyclone DDS.

$ export RMW_IMPLEMENTATION=rmw_fastrtps_cpp 
$ ros2 run demo_nodes_cpp talker
[INFO] [1718658361.109334174] [talker]: Publishing: 'Hello World: 1'
[INFO] [1718658362.109322134] [talker]: Publishing: 'Hello World: 2'

The problem seems to be in a function called dds_create_domain, as shown at the bottom of the highlighted section in the screenshot below:

[Screenshot: dds_create_domain call]

The next screenshot, taken from the "Crash Error Report" tool on Ubuntu, shows the command line that causes the error:

[Screenshot: command line]

Does anybody have any clues on what could be the reason behind this?

By the way, I posted the same question on the link below:

rubenanapu commented 5 months ago

I found the solution.

I just compiled CycloneDDS from source and the problem went away automatically.

The link where I found the instructions on how to compile from source for ROS 2 Jazzy is the following:

The commands I used for compiling it from source can be summarized as follows:

mkdir -pv ~/ros2_ws/src
cd ~/ros2_ws/src
source /opt/ros/jazzy/setup.bash
# Only applies if the workspace was built before; on a fresh workspace this file does not exist yet.
source ~/ros2_ws/install/setup.bash
git clone https://github.com/ros2/rmw_cyclonedds ros2/rmw_cyclonedds -b jazzy
git clone https://github.com/eclipse-cyclonedds/cyclonedds -b 0.10.5 eclipse-cyclonedds/cyclonedds
cd ..
rosdep install -i --from-path src --rosdistro jazzy -y
colcon build

After that, I just sourced the ros2_ws and it worked as expected:

source ~/ros2_ws/install/setup.bash

ros2 run demo_nodes_py talker

It is worth mentioning that I didn't make any changes to the code. I just compiled it from source and it worked.

eboasson commented 5 months ago

That's ... weird. Thank you for investigating and documenting the fact that rebuilding from source solves the problem.

ciandonovan commented 4 months ago

We had the same issue using the upstream Jazzy distro packages. Removing the Tracing section from the XML config prevented this error:

*** buffer overflow detected ***: terminated
Aborted (core dumped)
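
For anyone who wants to try this workaround without editing the file by hand, here is a minimal Python sketch that removes every Tracing element from a Cyclone DDS config. The function name is hypothetical, and the script assumes CYCLONEDDS_URI holds a plain file path. It only reproduces the workaround described above and says nothing about the underlying cause.

```python
import os
import xml.etree.ElementTree as ET

CDDS_NS = "https://cdds.io/config"  # default namespace of the config file


def strip_tracing(xml_bytes):
    """Return the config as text with every <Tracing> element removed."""
    # Keep the Cyclone DDS namespace as the unprefixed default on output.
    ET.register_namespace("", CDDS_NS)
    root = ET.fromstring(xml_bytes)
    tracing_tag = f"{{{CDDS_NS}}}Tracing"
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tracing_tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumption: the variable holds a file path, not a file:// URI or inline XML.
    path = os.environ.get("CYCLONEDDS_URI", "")
    if os.path.isfile(path):
        with open(path, "rb") as config_file:
            print(strip_tracing(config_file.read()))
    else:
        print("CYCLONEDDS_URI is not a plain file path; nothing to do")
```

Redirect the output to a new file and point CYCLONEDDS_URI at it, so the original config stays untouched for comparison.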

adamdbrw commented 1 month ago

This issue is closed even though I believe it has not been fully tracked down or resolved. Rebuilding from source or removing the Tracing section is only a workaround, and we are now also seeing this issue on Humble. Any insights into what could be causing it?

Trace:

    frame #8: 0x00007ffff71364db libc.so.6`__fdelt_chk(d=<unavailable>) at fdelt_chk.c:25:5
    frame #9: 0x00007fffd6b63e8f libddsc.so.0`___lldb_unnamed_symbol3689 + 3455
    frame #10: 0x00007fffd6b65d80 libddsc.so.0`dds_create_domain + 96
    frame #11: 0x00007fffd747b797 librmw_cyclonedds_cpp.so`rmw_create_node + 5975
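
A note for readers decoding this trace: frame #8, __fdelt_chk, is the glibc helper that backs FD_SET and FD_ISSET in code built with _FORTIFY_SOURCE. It aborts with "*** buffer overflow detected ***" when the descriptor is negative or at least FD_SETSIZE (1024 with glibc on Linux). The thread does not establish which descriptor triggers the check here. As a diagnostic aid, the minimal Python sketch below lists the descriptors open in the current process and flags those that FD_SET cannot represent. It assumes Linux with /proc, and the function names are illustrative.

```python
import os

FD_SETSIZE = 1024  # glibc's compile-time limit for select()/FD_SET on Linux


def open_fds(pid="self"):
    """Return the sorted descriptor numbers open in a process (Linux /proc only)."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/fd"))


def fds_beyond_select_limit(fds, limit=FD_SETSIZE):
    """Return the descriptors that FD_SET cannot represent (negative or >= limit)."""
    return [fd for fd in fds if fd < 0 or fd >= limit]


if __name__ == "__main__":
    if os.path.isdir("/proc/self/fd"):
        fds = open_fds()
        too_high = fds_beyond_select_limit(fds)
        print(f"{len(fds)} open descriptors, {len(too_high)} outside 0..{FD_SETSIZE - 1}: {too_high}")
    else:
        print("/proc is not available on this system")
```

To inspect a running ROS 2 node rather than this script itself, call open_fds with that node's process ID, for example open_fds(12345), where 12345 is a placeholder.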