ros2 / rmw_cyclonedds

ROS 2 RMW layer for Eclipse Cyclone DDS
Apache License 2.0

Segmentation fault (core dumped) when running ddsperf -L -TOU -D10 pub sub with shared memory enabled #449

Open arrfou99 opened 1 year ago

arrfou99 commented 1 year ago

There is unexpected behavior when running with iox-roudi and shared memory enabled. The demo cpp loaned messages example works okay at first, but it does not work properly if we run it and close it multiple times. With ddsperf, pressing Ctrl-C gives a segmentation fault. My own program shows the same behavior.

Bug report

Required Info:

Steps to reproduce issue

Open terminal #1 and run: $ ddsperf -L -TOU -D10 pub sub

Expected behavior

pub/sub works with shared memory

Actual behavior

Output:

> ddsperf -L -TOU -D10 pub sub
1680704023.748889 [0]    ddsperf: using network interface eth0 (udp/192.168.0.1) selected arbitrarily from: eth0, docker0
2023-04-05 14:13:43.754 [ Debug ]: Application registered management segment 0xffff887b0000 with size 67106416 to id 1
2023-04-05 14:13:43.755 [ Debug ]: Application registered payload data segment 0xffff6f461000 with size 422898400 to id 2
[486005] participant rs4-xavier-1:486005: new (self)
[486005] 1.003 5.56k/s  59u |                              | 100%   4m
[486005] 1.003  size 4 total 5580 lost 0 delta 5580 lost 0 rate 5.56 kS/s 0.18 Mb/s (0.56 kS/s 0.02 Mb/s)
Segmentation fault (core dumped)

Additional information

Here is the cyclonedds.xml that was used:

<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://cdds.io/config https://raw.githubusercontent.com/eclipse-cyclonedds/cyclonedds/iceoryx/etc/cyclonedds.xsd">
    <Domain id="any">
        <SharedMemory>
            <Enable>true</Enable>
            <LogLevel>debug</LogLevel>
        </SharedMemory>
    </Domain>
</CycloneDDS>
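
For anyone trying to reproduce this, the setup can be sketched as a small script. This is only a sketch under stated assumptions: the config shown above is saved as `cyclonedds.xml` in the current directory, `iox-roudi` and `ddsperf` are both on `PATH`, and the shell reports death by signal as exit status 128+N (as bash and dash do). The function names `describe_status` and `run_repro` and the 2 second wait are illustrative, not part of Cyclone DDS or iceoryx.

```shell
#!/bin/sh
# Hypothetical repro helper; assumes iox-roudi and ddsperf are on PATH.

# Turn a shell exit status into a short description.
# In bash/dash, 128+N means "killed by signal N"; SIGSEGV is signal 11.
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean exit"
    elif [ "$status" -eq 139 ]; then
        echo "killed by SIGSEGV"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exit code $status"
    fi
}

run_repro() {
    # Point Cyclone DDS at the config file (the path is an assumption).
    CYCLONEDDS_URI="file://$PWD/cyclonedds.xml"
    export CYCLONEDDS_URI

    iox-roudi &              # shared-memory daemon, started before ddsperf
    roudi_pid=$!
    sleep 2                  # crude wait for RouDi to come up

    ddsperf -L -TOU -D10 pub sub   # same command as in the report
    status=$?

    kill "$roudi_pid" 2>/dev/null
    echo "ddsperf: $(describe_status "$status")"
    return "$status"
}

# Only attempt the run when both tools are installed.
if command -v ddsperf >/dev/null 2>&1 && command -v iox-roudi >/dev/null 2>&1; then
    run_repro
fi
```

A status of 139 here would correspond to the "Segmentation fault (core dumped)" line in the output above, while "clean exit" would suggest the crash is gone for ddsperf.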
eboasson commented 1 year ago

It reproduces beautifully on macOS even — and it turns out to be solved in 0.10.

@clalancette Could you please let me know if you need me to fix it on 0.9? It might not be worth the bother given that 0.10 is looking good according to https://github.com/ros2/ros2/pull/1404.

clalancette commented 1 year ago

> @clalancette Could you please let me know if you need me to fix it on 0.9? It might not be worth the bother given that 0.10 is looking good according to ros2/ros2#1404.

The main reason to consider fixing it in 0.9 is for Humble, which will stay on the 0.9 series for its lifetime. If it is a relatively easy fix to backport to 0.9, I would say it is worthwhile. If it is more complicated, then we probably need to see if someone from the community has time to debug and fix it there. Does that make sense?

eboasson commented 1 year ago

@clalancette Yep, that sounds sensible. I'll have a look.

eboasson commented 1 year ago

Spoke too soon 😡

It is broken in all versions, including master. The crash was avoided in the more recent ddsperf because of some other detail ... I'll deal with it on master first, then backport that fix to 0.9 & 0.10.

eboasson commented 1 year ago

@arrfou99 The bug that caused the crash in ddsperf is fixed (in 0.9.x, 0.10.x and master). You also mention seeing the same thing in your own program. As there is always a risk that another bug just happens to give the same symptoms, I'd like a confirmation before I close this ticket as "fixed".