ros2 / rmw_fastrtps

Implementation of the ROS Middleware (rmw) Interface using eProsima's Fast RTPS.
Apache License 2.0

Can't use shared memory transport with initialPeersList or discovery server. #676

Open sergmister opened 1 year ago

sergmister commented 1 year ago

Bug report

I am trying to set up Fast DDS to use shared memory transport for communication within a computer and UDP transport between computers, which should be possible according to the docs here. Because of limitations with multicast traffic in my network, I need to configure either an initialPeersList or a discovery server so that nodes on different machines can discover each other. With either of these enabled, however, shared memory transport does not work.

Required Info:

Steps to reproduce issue

fastdds-profile.xml

<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <transport_descriptors>
    <transport_descriptor>
      <transport_id>shm_transport</transport_id>
      <type>SHM</type>
    </transport_descriptor>
  </transport_descriptors>

  <participant profile_name="participant_profile" is_default_profile="true">
    <rtps>
      <userTransports>
        <transport_id>shm_transport</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>

      <builtin>
        <metatrafficUnicastLocatorList>
          <locator>
            <udpv4>
              <address>127.0.0.1</address>
            </udpv4>
          </locator>
        </metatrafficUnicastLocatorList>
        <initialPeersList>
          <locator>
            <udpv4>
              <address>127.0.0.1</address>
            </udpv4>
          </locator>
        </initialPeersList>
      </builtin>
    </rtps>
  </participant>
</profiles>

In one terminal:

    ros2 topic pub -r 5 /test std_msgs/String "{data: hi}"

In another terminal:

    ros2 topic echo /test

With the above profile, the nodes will not be able to communicate (tested on ROS Humble and Rolling), but commenting out everything under <builtin> will make things work as expected.

EduPonz commented 1 year ago

Hi @sergmister,

The problem is that you're removing the builtin transports (one UDP and one SHM) and only adding an SHM transport while at the same time wanting to use a UDP transport.
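As a minimal sketch of that diagnosis applied to the profile above, the variant below declares a UDPv4 transport next to the SHM one, so that the UDP locators under <builtin> have a transport able to serve them. It only recombines elements that already appear in this thread; the udp_transport id is just a label. It has not been tested here, and in particular it is an assumption, not a confirmed result, that same-host data would then travel over shared memory.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <transport_descriptors>
    <transport_descriptor>
      <transport_id>shm_transport</transport_id>
      <type>SHM</type>
    </transport_descriptor>

    <!-- Added (untested sketch): a UDP transport for the 127.0.0.1 locators below -->
    <transport_descriptor>
      <transport_id>udp_transport</transport_id>
      <type>UDPv4</type>
    </transport_descriptor>
  </transport_descriptors>

  <participant profile_name="participant_profile" is_default_profile="true">
    <rtps>
      <userTransports>
        <!-- Both transports are listed, since the builtin ones are disabled -->
        <transport_id>udp_transport</transport_id>
        <transport_id>shm_transport</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>

      <builtin>
        <metatrafficUnicastLocatorList>
          <locator>
            <udpv4>
              <address>127.0.0.1</address>
            </udpv4>
          </locator>
        </metatrafficUnicastLocatorList>
        <initialPeersList>
          <locator>
            <udpv4>
              <address>127.0.0.1</address>
            </udpv4>
          </locator>
        </initialPeersList>
      </builtin>
    </rtps>
  </participant>
</profiles>
```

The only change from the original profile is the extra transport descriptor and its entry in <userTransports>; everything under <builtin> is left as reported.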

What you want to achieve is the default behaviour actually, since by default participants have both an SHM and a UDP transport, and they will use SHM with remote entities that are on the same machine. I'd say the more flexible option is to use Discovery Server. Let's say you have:

You could deploy a discovery server on, for instance, machine A, and have all the nodes connect to it as clients:

  1. On machine A run:

    fastdds discovery -i 0
  2. Run your nodes in machine A exporting the ROS_DISCOVERY_SERVER environment variable. For instance:

    # Create the nodes as Clients of the Discovery Server
    # running in the same machine
    export ROS_DISCOVERY_SERVER=127.0.0.1
    ros2 run <package_name> <node_name>
  3. Run your nodes in machine B exporting the ROS_DISCOVERY_SERVER environment variable. For instance:

    # Create the nodes as Clients of the Discovery Server
    # running in A
    export ROS_DISCOVERY_SERVER=192.168.1.2
    ros2 run <package_name> <node_name>

If you then want to use the ROS 2 CLI (ros2 topic list and such), you'd need to run the ROS 2 daemon as Super Client of the Discovery Server. You can read more about it in:

sergmister commented 1 year ago

I believe I was not clear enough. I can get communication working between nodes with a discovery server, but only over UDP. I need to use custom transports to reduce the maxMessageSize of the UDP transport to 1400; otherwise large messages like images will be dropped, see here. Thus, I was testing the following profile with the discovery server on the same machine:

<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <data_writer profile_name="data_writer_profile" is_default_profile="true">
    <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
    <qos>
      <publishMode>
        <kind>ASYNCHRONOUS</kind>
      </publishMode>
    </qos>
  </data_writer>

  <data_reader profile_name="data_reader_profile" is_default_profile="true">
    <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
  </data_reader>

  <transport_descriptors>
    <transport_descriptor>
      <transport_id>shm_transport</transport_id>
      <type>SHM</type>
    </transport_descriptor>

    <transport_descriptor>
      <transport_id>udp_transport</transport_id>
      <type>UDPv4</type>
      <maxMessageSize>1400</maxMessageSize>
    </transport_descriptor>
  </transport_descriptors>

  <participant profile_name="participant_profile" is_default_profile="true">
    <rtps>
      <userTransports>
        <transport_id>udp_transport</transport_id>
        <transport_id>shm_transport</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>

      <builtin>
        <discovery_config>
          <discoveryProtocol>SUPER_CLIENT</discoveryProtocol>
          <discoveryServersList>
            <RemoteServer prefix="44.53.00.5f.45.50.52.4f.53.49.4d.41">
              <metatrafficUnicastLocatorList>
                <locator>
                  <udpv4>
                    <address>127.0.0.1</address>
                    <port>11811</port>
                  </udpv4>
                </locator>
              </metatrafficUnicastLocatorList>
            </RemoteServer>
          </discoveryServersList>
        </discovery_config>
      </builtin>
    </rtps>
  </participant>
</profiles>

Communication works (both nodes in the same Docker container), but in my tests (just a publisher and a subscriber sending large arrays) this has 20x lower maximum bandwidth than shared memory alone without the discovery server, which strongly suggests that communication is going over UDP instead of shared memory, contrary to what the docs say should happen. With the UDP transport removed, neither communication nor discovery works (if shared memory transport were usable here, this should work). My question remains: how do I use shared memory transport with the discovery server or initialPeersList?

EduPonz commented 1 year ago

Hi @sergmister ,

Would it be possible for you to provide a reproducer? Just a docker file or compose and the instructions you use to run it would be enough.

sergmister commented 1 year ago

Steps to reproduce:

Put a file in current directory named fastrtps-profile.xml with the following contents:

<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <data_writer profile_name="data_writer_profile" is_default_profile="true">
    <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
    <qos>
      <publishMode>
        <kind>ASYNCHRONOUS</kind>
      </publishMode>
    </qos>
  </data_writer>

  <data_reader profile_name="data_reader_profile" is_default_profile="true">
    <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
  </data_reader>

  <transport_descriptors>
    <transport_descriptor>
      <transport_id>shm_transport</transport_id>
      <type>SHM</type>
    </transport_descriptor>

    <transport_descriptor>
      <transport_id>udp_transport</transport_id>
      <type>UDPv4</type>
    </transport_descriptor>
  </transport_descriptors>

  <participant profile_name="participant_profile" is_default_profile="true">
    <rtps>
      <userTransports>
        <transport_id>udp_transport</transport_id>
        <transport_id>shm_transport</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>

      <builtin>
        <discovery_config>
          <discoveryProtocol>SUPER_CLIENT</discoveryProtocol>
          <discoveryServersList>
            <RemoteServer prefix="44.53.00.5f.45.50.52.4f.53.49.4d.41">
              <metatrafficUnicastLocatorList>
                <locator>
                  <udpv4>
                    <address>127.0.0.1</address>
                    <port>11811</port>
                  </udpv4>
                </locator>
              </metatrafficUnicastLocatorList>
            </RemoteServer>
          </discoveryServersList>
        </discovery_config>
      </builtin>
    </rtps>
  </participant>
</profiles>

In each terminal you open, make sure to run:

    source /opt/ros/rolling/setup.bash
    export FASTRTPS_DEFAULT_PROFILES_FILE=/fastrtps-profile.xml

Start the container by running:

    docker run -it --rm -v $(pwd)/fastrtps-profile.xml:/fastrtps-profile.xml osrf/ros2:nightly-rmw-nonfree

Start the discovery server with:

    fastdds discovery --server-id 0 --ip-address 0.0.0.0 --port 11811

Run docker exec to open another terminal and first reload the ROS daemon:

    ros2 daemon stop; ros2 daemon start

Then run:

    ros2 topic pub -r 5 /test std_msgs/String "{data: hi}"

In another terminal run:

    ros2 topic echo /test

You should see the messages coming in.

Now, comment out the <transport_id>udp_transport</transport_id> line under <userTransports>, reload the daemon, and restart the publisher and subscriber, and discovery will not work.

JesusPoderoso commented 1 year ago

Hi @sergmister,

Sorry for the late response. By design, the Discovery Server can be configured to use UDP or TCP transports. Shared memory transport is not an option in this case.