ros2 / rmw_fastrtps

Implementation of the ROS Middleware (rmw) Interface using eProsima's Fast RTPS.
Apache License 2.0

ros2cli tools for topics, services and actions not functional when using Discovery Server #499

Open mbuijs opened 3 years ago

mbuijs commented 3 years ago

Bug report

Steps to reproduce issue

Following the tutorial about the Fast DDS discovery server on ROS Index: https://index.ros.org/doc/ros2/Tutorials/Discovery-Server/Discovery-Server/.

Terminal 1:

fastdds discovery --server-id 0

Terminal 2:

export ROS_DISCOVERY_SERVER=127.0.0.1:11811
ros2 run demo_nodes_cpp listener

Terminal 3:

export ROS_DISCOVERY_SERVER=127.0.0.1:11811
ros2 run demo_nodes_cpp talker

Communication between talker and listener is working at this point. Now open one more terminal and use some ros2cli tools to inspect what is going on:

export ROS_DISCOVERY_SERVER=127.0.0.1:11811
ros2 topic list
ros2 node info /talker
ros2 topic info /chatter
ros2 topic echo /chatter

Expected behavior

This is the output of ros2 topic list, ros2 node info /talker, ros2 topic info /chatter and ros2 topic echo /chatter when running without any additional configuration (no ROS_DISCOVERY_SERVER set):

$ ros2 topic list
/chatter
/parameter_events
/rosout
$ ros2 node info /talker
/talker
  Subscribers:
    /parameter_events: rcl_interfaces/msg/ParameterEvent
  Publishers:
    /chatter: std_msgs/msg/String
    /parameter_events: rcl_interfaces/msg/ParameterEvent
    /rosout: rcl_interfaces/msg/Log
  Service Servers:
    /talker/describe_parameters: rcl_interfaces/srv/DescribeParameters
    /talker/get_parameter_types: rcl_interfaces/srv/GetParameterTypes
    /talker/get_parameters: rcl_interfaces/srv/GetParameters
    /talker/list_parameters: rcl_interfaces/srv/ListParameters
    /talker/set_parameters: rcl_interfaces/srv/SetParameters
    /talker/set_parameters_atomically: rcl_interfaces/srv/SetParametersAtomically
  Service Clients:

  Action Servers:

  Action Clients:

$ ros2 topic info /chatter
Type: std_msgs/msg/String
Publisher count: 1
Subscription count: 1
$ ros2 topic echo /chatter
data: 'Hello World: 5'
---
data: 'Hello World: 6'
---

Actual behavior

Note specifically the missing /chatter topic, but also the parameter services.

$ ros2 topic list
/parameter_events
/rosout
$ ros2 node info /talker
/talker
  Subscribers:
    /parameter_events: rcl_interfaces/msg/ParameterEvent
  Publishers:

  Service Servers:

  Service Clients:

  Action Servers:

  Action Clients:

$ ros2 topic info /chatter
Unknown topic '/chatter'
$ ros2 topic echo /chatter
WARNING: topic [/chatter] does not appear to be published yet
Could not determine the type for the passed topic

fujitatomoya commented 3 years ago

i confirmed the same problem with ros2:rolling. (restarting daemon does not work either)

This version uses the topic of the different nodes to decide if two nodes wish to communicate, or if they can be left unmatched (i.e. not discovering each other)

it seems to be because of this feature. CC: @MiguelCompany @richiware

jparisu commented 3 years ago

Hi @mbuijs,

Thank you for your comment and the time you spent on it, it was really helpful to find the problem.

This is something we have been concerned about. The failure arises from the ROS 2 CLI working through a daemon. In short, the ROS 2 Daemon must be pre-configured in order to use the Discovery Server discovery protocol. As @fujitatomoya has correctly pointed out, the Discovery Server, in order to reduce network traffic, avoids sending topic information to entities that "do not need it". In practice, what we are seeing here is that the Daemon has been configured as a CLIENT without user endpoints. Thus, this participant (the daemon node) will receive information about shared topics (such as /parameter_events and /rosout), but it will not discover /chatter, as it has no endpoints that publish or subscribe to it.

There is a work-around that easily avoids this problem (in Foxy and Rolling), based on executing the ROS 2 Daemon with a specific configuration from a config file. We create the Daemon as a SERVER, and so it will receive information from every topic in the network.

This is the Fast DDS configuration file for the ROS 2 Daemon:

<?xml version="1.0" encoding="UTF-8" ?>
<dds>
    <profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
        <participant profile_name="server_profile" is_default_profile="true">
            <rtps>
                <prefix>44.49.53.43.53.45.52.56.45.52.5f.30</prefix>
                <builtin>
                    <discovery_config>
                        <discoveryProtocol>SERVER</discoveryProtocol>
                    </discovery_config>
                    <metatrafficUnicastLocatorList>
                        <locator>
                            <udpv4>
                                <address>127.0.0.1</address>
                                <port>11811</port>
                            </udpv4>
                        </locator>
                    </metatrafficUnicastLocatorList>
                </builtin>
            </rtps>
        </participant>
    </profiles>
</dds>

This file can be set as the default configuration file with the environment variable FASTRTPS_DEFAULT_PROFILES_FILE, or it can be placed in the working directory under the name DEFAULT_FASTRTPS_PROFILES.xml so that it is loaded automatically. Notice that only the Daemon must load this file (if another node tries to load it a second time, participant creation will fail). Hence, we recommend the first option.
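
The <prefix> element holds the participant's 12-byte GUID prefix as dot-separated hex bytes. As an illustration only (decode_guid_prefix is a made-up helper name, not part of ROS 2 or Fast DDS), a small POSIX shell sketch can decode those bytes as ASCII to show that the value used here is a readable label rather than a random number:

```shell
# Hypothetical helper: print the ASCII text encoded by a dot-separated hex GUID prefix.
# The body runs in a subshell "( ... )" so the IFS change does not leak to the caller.
decode_guid_prefix() (
    IFS=.
    for byte in $1; do
        # hex byte -> decimal -> 3-digit octal escape, which POSIX printf understands
        printf "\\$(printf '%03o' "$((0x$byte))")"
    done
    printf '\n'
)

# Usage: decode_guid_prefix 44.49.53.43.53.45.52.56.45.52.5f.30
```

For the prefix in the profile above, the usage line prints DISCSERVER_0.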

Executing these commands in 4 different terminals will produce the expected behavior.

T1 - DAEMON + SERVER

source <ros2_installation>/install/setup.bash
export FASTRTPS_DEFAULT_PROFILES_FILE=daemon_fastdds_config.xml
ros2 daemon stop
ros2 daemon start

T2 - LISTENER

source <ros2_installation>/install/setup.bash
export ROS_DISCOVERY_SERVER=127.0.0.1:11811
ros2 run demo_nodes_cpp listener

T3 - TALKER

source <ros2_installation>/install/setup.bash
export ROS_DISCOVERY_SERVER=127.0.0.1:11811
ros2 run demo_nodes_cpp talker

T4 - ROS2 INTROSPECTION (could be T1 unsetting FASTRTPS_DEFAULT_PROFILES_FILE)

source <ros2_installation>/install/setup.bash
export ROS_DISCOVERY_SERVER=127.0.0.1:11811
ros2 topic list
ros2 node info /talker
ros2 topic info /chatter
ros2 topic echo /chatter

This solution will be added to the ROS2 Index Discovery Server tutorial as soon as possible. We are currently working on a more sophisticated solution for this problem. However this is tricky, as the ROS2 Daemon could be running before the Discovery Server is configured, and so their behaviour will not match.
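
The T1 step can also be wrapped so that the profile is set only for the daemon start command, which is one way to follow the advice that no other node should load the file. This is a sketch, not an official tool: restart_daemon_with_profile is a hypothetical function name, and it assumes ros2 is already on the PATH after sourcing the installation.

```shell
# Sketch: restart the ROS 2 daemon with a Fast DDS profile that only the daemon sees.
# restart_daemon_with_profile is a hypothetical helper name.
restart_daemon_with_profile() {
    profile=$1
    if [ ! -r "$profile" ]; then
        echo "profile file not readable: $profile" >&2
        return 1
    fi
    ros2 daemon stop
    # A leading VAR=value applies to this one command only, so later
    # `ros2 run ...` calls in the same terminal do not load the server profile.
    FASTRTPS_DEFAULT_PROFILES_FILE=$profile ros2 daemon start
}

# Usage: restart_daemon_with_profile "$PWD/daemon_fastdds_config.xml"
```

Because the variable is never exported, the T4 introspection commands can then be run from the same terminal without unsetting anything first.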

mbuijs commented 3 years ago

Thanks for your quick and thorough response, I can confirm that your suggestion solves the problem.

One more thing though (this possibly needs its own issue?): I noticed when running the listener/talker examples that listener always misses the first 2 messages from talker. When publishing a single message to listener, this means that the message is not delivered at all. According to the output of ros2 topic pub in terminal 2 shown below, there were 4 messages published containing test, but only 2 of them were received by the listener.

Terminal 1:

$ ros2 run demo_nodes_cpp listener
[INFO] [1610106452.563921231] [listener]: I heard: [Hello World: 3]
[INFO] [1610106453.563798334] [listener]: I heard: [Hello World: 4]
[..]
[INFO] [1610106491.563268384] [listener]: I heard: [Hello World: 42]
[INFO] [1610106492.148479127] [listener]: I heard: [test]
[INFO] [1610106492.563372880] [listener]: I heard: [Hello World: 43]
[INFO] [1610106493.148563285] [listener]: I heard: [test]
[INFO] [1610106493.563716349] [listener]: I heard: [Hello World: 44]
[INFO] [1610106494.563517463] [listener]: I heard: [Hello World: 45]

Terminal 2:

$ ros2 topic pub /chatter std_msgs/msg/String "data: 'test'"
publisher: beginning loop
publishing #1: std_msgs.msg.String(data='test')

publishing #2: std_msgs.msg.String(data='test')

publishing #3: std_msgs.msg.String(data='test')

^C
$ ros2 topic pub --once /chatter std_msgs/msg/String "data: 'test'"
publisher: beginning loop
publishing #1: std_msgs.msg.String(data='test')

Is this due to some misconfiguration on my side perhaps?

fujitatomoya commented 3 years ago

@jparisu

work-around works okay, thanks for the quick info.

there are a couple of things i would like to mention about this work-around, since it changes the user experience.

root@a6cfc30c581b:~/docker_ws/ros2_colcon# ros2 topic list --no-daemon
/parameter_events
/rosout
root@a6cfc30c581b:~/docker_ws/ros2_colcon# ros2 topic list
/chatter  ### THIS IS MISSING
/parameter_events
/rosout

I understand that there is always a trade-off, but at least i think that these should be clearly described in the documentation.

@mbuijs

One more thing though (this possibly needs it own issue?): I noticed when running the listener/talker examples that listener always misses the first 2 messages from talker. When publishing a single message to listener, this means that the message is not delivered at all.

this does not happen to me.

jparisu commented 3 years ago

Hi @mbuijs, thanks for your feedback, I am glad to hear that it is working properly now.

About the second issue, I am sorry to say that I could not replicate it on my computer. Nevertheless, it is common behavior caused by how DDS connectivity works. The talker node in ROS 2 sends data before matching is complete (you can see messages being sent from a talker without a listener), so you cannot be sure that the discovery information will reach the destination before the first message does (the listener node will discard any data that arrives from a non-matched participant). This can be seen in busy networks or on slow computers. And, as the Discovery Server discovery protocol adds an extra discovery step in the middle (talker_node -> server -> listener_node), it is more likely to lose the first messages (once matching is done, the server does not add any additional steps compared with the standard DDS mode; it only affects the discovery phase).
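
To put a number on the loss in a given run, the listener output can be inspected for the first message that arrived. The sketch below is illustrative: missed_before_first_heard is a hypothetical helper, and it assumes that the talker numbers its messages starting from 1 and that the listener log has the "I heard: [Hello World: N]" form shown earlier in this thread.

```shell
# Hypothetical helper: read listener output on stdin and print how many
# "Hello World" messages were missed before the first one that was heard.
# Assumes the talker numbers its messages starting at 1.
missed_before_first_heard() {
    first=$(sed -n 's/.*I heard: \[Hello World: \([0-9][0-9]*\)].*/\1/p' | head -n 1)
    if [ -z "$first" ]; then
        echo "no Hello World message found" >&2
        return 1
    fi
    echo $((first - 1))
}

# Usage (listener.log is a hypothetical capture of the listener output):
#   missed_before_first_heard < listener.log
```

Fed with the listener log from the report above, whose first line is Hello World: 3, it prints 2, matching the two missed messages described there.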

jparisu commented 3 years ago

@fujitatomoya

Regarding these two issues.

  1. There is a way to use the --no-daemon option with the Discovery Server. It is a workaround similar to the one used before. Using a new configuration file, we can run the ROS 2 introspection from a Server different from the one we are creating for the daemon. Thus, our server will receive all the topic data, and so we will see every topic. This is an example of the new configuration file:
    <?xml version="1.0" encoding="UTF-8" ?>
    <dds>
    <profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
        <participant profile_name="server_profile" is_default_profile="true">
            <rtps>
                <prefix>72.61.73.70.66.61.72.6d.74.65.73.74</prefix>
                <builtin>
                    <discovery_config>
                        <discoveryProtocol>SERVER</discoveryProtocol>
                        <discoveryServersList>
                            <RemoteServer prefix="44.49.53.43.53.45.52.56.45.52.5f.30">
                                <metatrafficUnicastLocatorList>
                                    <locator>
                                        <udpv4>
                                            <address>127.0.0.1</address>
                                            <port>11811</port>
                                        </udpv4>
                                    </locator>
                                </metatrafficUnicastLocatorList>
                            </RemoteServer>
                        </discoveryServersList>
                    </discovery_config>
                    <metatrafficUnicastLocatorList>
                        <locator>
                            <udpv4>
                                <address>127.0.0.1</address>
                                <port>12345</port>
                            </udpv4>
                        </locator>
                    </metatrafficUnicastLocatorList>
                </builtin>
            </rtps>
        </participant>
    </profiles>
    </dds>

    Summarizing: it creates a participant that is also a Server and it connects to the already running Server.

The commands to run it would be:

source <ros2_installation>/install/setup.bash
export FASTRTPS_DEFAULT_PROFILES_FILE=<wd>/fastdds_config_auxiliar_server.xml
ros2 topic list --no-daemon

Be aware that this approach is valid even if the daemon is not running with the Discovery Server configuration. But it needs at least one Server running to have connectivity.

  2. We are concerned about this issue, as it is very unlikely that a proper solution can be found that allows using the daemon with and without the Discovery Server at the same time (without changing the daemon implementation). By design, there is no use case where Simple discovery and the Discovery Server must coexist, so there is no way to merge them. Nevertheless, introspection can be run without the daemon, so simply setting the environment variable correctly to use (or not use) the Discovery Server discovery protocol should be enough to introspect any running system.
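
Each server needs its own GUID prefix, which is why the auxiliary profile in the first point uses a different <prefix> from the daemon's. As a hedged sketch (encode_guid_prefix is a hypothetical helper, and it assumes a 12-character ASCII label is an acceptable way to choose the 12 bytes, as both profiles in this thread appear to do), a prefix can be derived from a label like this:

```shell
# Hypothetical helper: turn a 12-character ASCII label into a dot-separated hex GUID prefix.
encode_guid_prefix() {
    name=$1
    if [ "${#name}" -ne 12 ]; then
        echo "GUID prefix needs exactly 12 characters, got ${#name}" >&2
        return 1
    fi
    # od prints the bytes as space-separated hex; squeeze the spaces into single
    # dots, then drop the leading (and any trailing) dot.
    printf '%s' "$name" | od -An -tx1 | tr -s ' ' '.' | sed 's/^\.//; s/\.$//'
}

# Usage: encode_guid_prefix raspfarmtest
```

The usage line prints 72.61.73.70.66.61.72.6d.74.65.73.74, the prefix used in the auxiliary profile.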

hidmic commented 3 years ago

@jparisu please let me know when these valuable pieces of documentation are in a public and visible place so I can close this ticket.

jparisu commented 3 years ago

Hi @hidmic .

This new version is pending review in the Fast DDS documentation (https://github.com/eProsima/Fast-RTPS-docs/pull/220), where we maintain a copy of the Discovery Server Tutorial for ROS 2. Once it is merged, we will open the pull request to ros2_documentation and I will notify you.

Feel free to add or comment on anything in the current or future PR.

jparisu commented 3 years ago

I am glad to announce that this explanation has been merged into our Fast DDS documentation (https://fast-dds.docs.eprosima.com/en/latest/fastdds/ros2/discovery_server/ros2_discovery_server.html#ros-2-introspection) and is awaiting review in the ROS 2 Documentation (https://github.com/ros2/ros2_documentation/pull/1028).

Further comments or corrections are welcome in this new PR. And feel free to close this issue whenever you consider it appropriate.

jparisu commented 3 years ago

Regarding the new features for Fast-DDS Discovery Server that I mentioned to solve this problem, a new PR has been sent to Fast-DDS: https://github.com/eProsima/Fast-DDS/pull/1763 .

With this new feature, a new participant type called SUPER_CLIENT has been added to the Discovery Server implementation. This new participant is able to receive all the data from the server or servers it is connected to, but does not work as a server itself (a kind of promiscuous participant). This will make it easier for the user to create a ROS 2 Daemon that knows the whole network, avoiding the extra configuration.

Together with this feature, we are working on the Discovery Server API and user experience (modifying the environment variable behaviour, adding use case features, fixing minor bugs). Because of this, I will set the PR open in ros2_documentation as a Draft until the whole implementation is done, reviewed and merged, and the Discovery Server documentation has been updated with the new features.