ros2 / rmw_cyclonedds

ROS 2 RMW layer for Eclipse Cyclone DDS
Apache License 2.0
112 stars 91 forks source link

Setting ExplicitlyPublishQosSetToDefault to true causes malloc error #353

Open eranroll opened 2 years ago

eranroll commented 2 years ago

Hi,

I am running my own node inside an arm64 ros2 galactic docker. Whenever I set true in my XML, I get an exception in my constructor, when trying to create publishers and subscribers.

I am using this parameter since I want to achieve full communication between a node running RTI's RMW and a node running Cyclone DDS.

Any idea what might be the problem?

My XML: ` <?xml version="1.0" encoding="UTF-8" ?>

true true lax auto false 65500B 1450B udp auto severe stdout

`

My publisher: rclcpp::QoS qos(rclcpp::KeepLast{7}); m_droneStatusUpdatePublisher = this->create_publisher<xtra_common_msgs::msg::MapUInt8ToString>("drone_status_update", qos);

Your prompt response is much appreciated.

Regards, Eran

eranroll commented 2 years ago

I see that the XML wasn't uploaded correctly:

<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://cdds.io/config https://raw.githubusercontent.com/eclipse-cyclonedds/cyclonedds/master/etc/cyclonedds.xsd">
    <Domain id="any">
        <Compatibility>
            <AssumeRtiHasPmdEndpoints>true</AssumeRtiHasPmdEndpoints>
            <ExplicitlyPublishQosSetToDefault>true</ExplicitlyPublishQosSetToDefault>
            <StandardsConformance>lax</StandardsConformance>
        </Compatibility>
        <General>
            <NetworkInterfaceAddress>auto</NetworkInterfaceAddress>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
            <FragmentSize>1450B</FragmentSize>
            <Transport>udp</Transport>
        </General>
        <Discovery>
            <Peers>
                <Peer address="172.17.0.2"/>
            </Peers>
            <ParticipantIndex>auto</ParticipantIndex>
        </Discovery>
        <Tracing>
            <Verbosity>severe</Verbosity>
            <OutputFile>stdout</OutputFile>
        </Tracing>
    </Domain>
</CycloneDDS>
eranroll commented 2 years ago

Another thing I have found out is that this work:

        rclcpp::QoS qos(rclcpp::KeepLast{10});
        m_sendAckSubscriber = this->create_subscription<xtra_common_msgs::msg::MapUInt8ToString>("system_mode_changed", qos,
            std::bind(&DronesManager::onSystemModeChanged, this, std::placeholders::_1));

However, this doesn't:

        m_sendAckSubscriber = this->create_subscription<xtra_common_msgs::msg::MapUInt8ToString>("system_mode_changed", 10, 
            std::bind(&DronesManager::onSystemModeChanged, this, std::placeholders::_1));

See "qos" vs "10"

eboasson commented 2 years ago

Hi @eranroll

I am running my own node inside an arm64 ros2 galactic docker. Whenever I set true in my XML, I get an exception in my constructor, when trying to create publishers and subscribers.

The only valid reason for getting a exception is because the configuration is invalid, that's simple; there are only two trues in the config, those are not invalid, therefore it is a bug in Cyclone DDS (if perhaps a minor one, read on). I think you should raise it at https://github.com/eclipse-cyclonedds/cyclonedds .

I am using this parameter since I want to achieve full communication between a node running RTI's RMW and a node running Cyclone DDS.

Neither of those two weird options are necessary to get interoperation with RTI (nor any of the other DDS implementations, for that matter).

Both have their origins in the early days of getting things to interoperate. Roughly a decade ago it seemed like RTI used the "PMD" endpoints without advertising their presence and some problems disappeared by assuming it had them. There have been a few (perhaps more than a few) changes in those years on both sides, and I am not aware of interoperability problems with RTI related to these endpoints or there being any benefit from enabling the option. (Perhaps I should just remove it.)

The other one (ExplicitlyPublishQosSetToDefault) got introduced even earlier: back in the day when both this RTPS stack and TwinOaks' CoreDX were in their infancy, we could only successfully demonstrate interoperability at the OMG by using this option because CoreDX didn't apply the defaults correctly in the discovery protocol. I know it has been fixed since, I don't know when exactly, but it must also be on the order of a decade ago. (Again, perhaps I should just remove the option.)

This:

Another thing I have found out is that this work [...] See "qos" vs "10"

looks like a difference handled by rclcpp: the various ways of creating subscriptions all end up in the call in the RMW layer with all QoS set. Maybe you can have another look before we treat it as an issue in the RMW layer?