eclipse-zenoh / zenoh-plugin-ros2dds

A Zenoh plug-in for ROS2 with a DDS RMW. See https://discourse.ros.org/t/ros-2-alternative-middleware-report/ for the advantages of using this plugin over other DDS RMW implementations.
https://zenoh.io

[Bug] zenoh_bridge_ros2dds doesn't announce a subscriber #337

Open yuma-m opened 1 week ago

yuma-m commented 1 week ago

Describe the bug

We use the zenoh_bridge_ros2dds standalone executable to bridge ROS messages between two hosts (host A and host B) with the commands below. Hosts A and B are connected over a wired network.

Our config file looks like this:

{
  plugins: {
    ros2dds: {
      allow: {
        publishers: ["/tf", "/tf_static", "topic_a", "topic_b", ... ],  # we have about 10 topics
        subscribers: ["/diagnostics", "topic_c", "topic_d", ...],  # about 20 topics
        service_servers: ["/some_service", ...],  # about 10 services
        service_clients: ["/other_service_.+/.*"],  #  1 service with wildcard
        action_servers: [],
        action_clients: [],
      },
    },
  },
}
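As a rough illustration of how an allow list of this kind can be reasoned about, the sketch below treats each entry as a regular expression that must match the whole interface name. That full-match behaviour is an assumption made for illustration, not something confirmed in this thread, and the lists are shortened, hypothetical versions of the ones above (the elided entries are left out). The helper names are likewise hypothetical.

```python
import re

def compile_allow_list(patterns):
    """Join the patterns into one regex, or return None for an empty list.

    Assumption: each entry is a regular expression that must match the
    whole interface name. The real bridge may match differently.
    """
    if not patterns:
        return None
    return re.compile("|".join(f"(?:{p})" for p in patterns))

def is_allowed(name, patterns):
    """Return True if `name` fully matches at least one pattern."""
    regex = compile_allow_list(patterns)
    return regex is not None and regex.fullmatch(name) is not None

# Hypothetical, shortened lists modelled on the config above.
ALLOW = {
    "publishers": ["/tf", "/tf_static", "topic_a", "topic_b"],
    "subscribers": ["/diagnostics", "topic_c", "topic_d"],
    "service_clients": ["/other_service_.+/.*"],
    "action_servers": [],
}

if __name__ == "__main__":
    for kind, name in [("subscribers", "/diagnostics"),
                       ("publishers", "/diagnostics"),
                       ("service_clients", "/other_service_1/get_state")]:
        verdict = "Allowed" if is_allowed(name, ALLOW[kind]) else "Denied"
        print(kind, name, verdict)
```

Under these assumptions, /diagnostics is allowed as a subscriber but not as a publisher, which is the asymmetry discussed later in the thread. In this sketch an empty list matches nothing.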

The diagnostic_aggregator node on host A subscribes to the /diagnostics topic, and multiple nodes on both host A and host B publish messages to the /diagnostics topic.

With this setup, I found that the bridge sometimes doesn't transfer the /diagnostics topic from host B to host A even though there is a subscriber on host A. When the issue happens, zenoh_bridge_ros2dds on host B doesn't subscribe to the /diagnostics topic, and the remote bridge (on host A) doesn't announce a subscriber for the /diagnostics topic. If I manually run ros2 topic echo /diagnostics on host A, zenoh_bridge_ros2dds starts to transfer the messages from host B to A, but if I stop the ros2 topic echo command, it stops transferring.

The issue occurs randomly when the ROS nodes on host A are launched. I have also seen the same issue, again randomly, on topics other than /diagnostics (topic_c and topic_d in the configuration above), so I suspect that the bridge can fail to find a subscriber to a specific topic depending on launch timing.

Could you kindly guide me on how to investigate this issue further or mitigate it?

To reproduce

I haven't found a simple way to reproduce the issue, but it sometimes happens in my environment with the configuration above. I can provide more information about the setup, logs, and so on, if you need it.

System info

evshary commented 1 week ago

Hi @yuma-m

Since the issue might not be reproduced easily, it would be great if you could provide the log in zenoh-bridge-ros2dds while the issue happens (Please add RUST_LOG=z=debug while running the bridge).

Some questions:

publishers: ["/tf", "/tf_static", "topic_a", "topic_b", ... ], # we have about 10 topics

I suppose you also have /diagnostics here, but just want to confirm with you.

Host B: Ubuntu 20.04 arm64

Do you build the ROS 2 Humble by yourself? I guess ROS 2 Humble doesn't support Ubuntu 20.04.

yuma-m commented 6 days ago

Hi @evshary, Thank you for your quick response. Let me clarify our configuration.

I suppose you also have /diagnostics here, but just want to confirm with you.

Actually, I have the /diagnostics topic only in subscribers, because I use this configuration only for the zenoh_bridge_ros2dds on host A. The zenoh_bridge_ros2dds on host B allows any topic.

Do you build the ROS 2 Humble by yourself? I guess ROS 2 Humble doesn't support Ubuntu 20.04.

Sorry, my description was not accurate. I'm running the ROS nodes and zenoh_bridge_ros2dds in Docker containers. The OS of host B is Ubuntu 20.04, but the Docker containers on both hosts are based on Ubuntu 22.04.

Since the issue might not be reproduced easily, it would be great if you could provide the log in zenoh-bridge-ros2dds while the issue happens (Please add RUST_LOG=z=debug while running the bridge)

It is difficult to share the complete logs because they may contain confidential information, so let me extract the general log lines and the lines that include /diagnostics. I saved the raw log files, so I can search them if you want to know whether a specific message exists.

Logs of zenoh_bridge_ros2dds on host A

Logs right after zenoh_bridge_ros2dds is launched. At this moment, zenoh_bridge_ros2dds doesn't forward messages on the /diagnostics topic, even though there's a subscriber (the diagnostic_aggregator node) to the topic on host A.

2024-11-20T11:18:29+09:00 - [INFO] [zenoh_bridge_ros2dds-59]: process started with pid [916]
2024-11-20T02:18:29.525232Z DEBUG tokio-runtime-worker ThreadId(07) zenoh_plugin_ros2dds: Node /navigation_planner declares Publisher /rosout: rcl_interfaces/msg/Log - Denied per config
2024-11-20T02:18:30.176945Z DEBUG                 rx-0 ThreadId(14) zenoh::net::routing::dispatcher::token: Face{2, 72be8e633d5b028b68afb317e32d29e0} Declare token 0 (@/72be8e633d5b028b68afb317e32d29e0/@ros2_lv/MP/diagnostics/diagnostic_msgs§msg§DiagnosticArray/:1::)
2024-11-20T02:18:30.176734Z DEBUG                 rx-0 ThreadId(14) zenoh::net::routing::dispatcher::resource: Register resource @/72be8e633d5b028b68afb317e32d29e0/@ros2_lv/MP/diagnostics/diagnostic_msgs§msg§DiagnosticArray/:1::
2024-11-20T02:18:32.273227Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS Participant 0110268af382e12b94d2b33c000001c1)
2024-11-20T02:18:32.448578Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110268af382e12b94d2b33c00000504 from Participant 0110268af382e12b94d2b33c000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:18:32.461705Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110268af382e12b94d2b33c00000603 from Participant 0110268af382e12b94d2b33c000001c1 on rt/rosout with type rcl_interfaces::msg::dds_::Log_ (keyless: true)
2024-11-20T02:18:32.461659Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110268af382e12b94d2b33c00000403 from Participant 0110268af382e12b94d2b33c000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:18:32.448821Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110268af382e12b94d2b33c00001404 from Participant 0110268af382e12b94d2b33c000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:18:32.468893Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110268af382e12b94d2b33c00001303 from Participant 0110268af382e12b94d2b33c000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:18:34.232917Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS Participant 0110f0e323f4cd2a4cdbed11000001c1)
2024-11-20T02:18:34.300041Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110f0e323f4cd2a4cdbed1100000603 from Participant 0110f0e323f4cd2a4cdbed11000001c1 on rt/rosout with type rcl_interfaces::msg::dds_::Log_ (keyless: true)
2024-11-20T02:18:34.299992Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110f0e323f4cd2a4cdbed1100000403 from Participant 0110f0e323f4cd2a4cdbed11000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:18:34.301096Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110f0e323f4cd2a4cdbed1100000504 from Participant 0110f0e323f4cd2a4cdbed11000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:18:34.301078Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110f0e323f4cd2a4cdbed1100001303 from Participant 0110f0e323f4cd2a4cdbed11000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:18:34.307710Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110f0e323f4cd2a4cdbed1100001404 from Participant 0110f0e323f4cd2a4cdbed11000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:18:34.778643Z DEBUG tokio-runtime-worker ThreadId(06) zenoh_plugin_ros2dds::discovered_entities: ROS Node /tf_publisher declares a new Writer on rt/tf_static
2024-11-20T02:18:35.818354Z DEBUG tokio-runtime-worker ThreadId(08) zenoh_plugin_ros2dds::ros_discovery: Publish update on 'ros_discovery_info' with 108 writers and 93 readers
2024-11-20T02:18:36.192893Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS Participant 0110610b2bf96af47b991657000001c1)
2024-11-20T02:18:36.241102Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110610b2bf96af47b99165700000603 from Participant 0110610b2bf96af47b991657000001c1 on rt/rosout with type rcl_interfaces::msg::dds_::Log_ (keyless: true)
2024-11-20T02:18:36.241045Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110610b2bf96af47b99165700000403 from Participant 0110610b2bf96af47b991657000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:18:36.241190Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110610b2bf96af47b99165700000504 from Participant 0110610b2bf96af47b991657000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:18:36.251943Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110610b2bf96af47b99165700001303 from Participant 0110610b2bf96af47b991657000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:18:36.258154Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110610b2bf96af47b99165700001404 from Participant 0110610b2bf96af47b991657000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:18:38.573469Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110268af382e12b94d2b33c00001503 from Participant 0110268af382e12b94d2b33c000001c1 on rt/diagnostics with type diagnostic_msgs::msg::dds_::DiagnosticArray_ (keyless: true)
2024-11-20T02:18:42.655750Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 01100310c028ac2ae80b9fdc00001303 from Participant 01100310c028ac2ae80b9fdc000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:18:42.655779Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 01100310c028ac2ae80b9fdc00000504 from Participant 01100310c028ac2ae80b9fdc000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:18:42.656026Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 01100310c028ac2ae80b9fdc00001404 from Participant 01100310c028ac2ae80b9fdc000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)

Logs after running ros2 topic echo /diagnostics on host A. zenoh_bridge_ros2dds forwards messages on the /diagnostics topic while the ros2 topic echo command is running.

2024-11-20T02:28:39.766112Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110f67ef9b53b4bb922bd2600000703 from Participant 0110f67ef9b53b4bb922bd26000001c1 on rt/parameter_events with type rcl_interfaces::msg::dds_::ParameterEvent_ (keyless: true)
2024-11-20T02:28:39.745040Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS Participant 0110f67ef9b53b4bb922bd26000001c1)
2024-11-20T02:28:39.899334Z DEBUG tokio-runtime-worker ThreadId(06) zenoh_plugin_ros2dds::discovered_entities: ROS Node /_ros2cli_10219 declares a new Writer on rt/rosout
2024-11-20T02:28:39.899309Z  INFO tokio-runtime-worker ThreadId(06) zenoh_plugin_ros2dds::discovered_entities: Discovered ROS Node /_ros2cli_10219
2024-11-20T02:28:39.899286Z DEBUG tokio-runtime-worker ThreadId(06) zenoh_plugin_ros2dds::discovery_mgr: Received ros_discovery_info from participant 0110f67ef9b53b4bb922bd26000001c1 with nodes: [/_ros2cli_10219]
2024-11-20T02:28:39.774788Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110f67ef9b53b4bb922bd2600000504 from Participant 0110f67ef9b53b4bb922bd26000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:28:39.774768Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110f67ef9b53b4bb922bd2600000603 from Participant 0110f67ef9b53b4bb922bd26000001c1 on rt/rosout with type rcl_interfaces::msg::dds_::Log_ (keyless: true)
2024-11-20T02:28:39.774725Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110f67ef9b53b4bb922bd2600000403 from Participant 0110f67ef9b53b4bb922bd26000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-20T02:28:39.899356Z DEBUG tokio-runtime-worker ThreadId(06) zenoh_plugin_ros2dds: Node /_ros2cli_10219 declares Publisher /rosout: rcl_interfaces/msg/Log - Denied per config
2024-11-20T02:28:39.899340Z DEBUG tokio-runtime-worker ThreadId(06) zenoh_plugin_ros2dds::discovered_entities: ROS Node /_ros2cli_10219 declares a new Writer on rt/parameter_events
2024-11-20T02:28:39.899361Z DEBUG tokio-runtime-worker ThreadId(06) zenoh_plugin_ros2dds: Node /_ros2cli_10219 declares Publisher /parameter_events: rcl_interfaces/msg/ParameterEvent - Denied per config
2024-11-20T02:28:40.304763Z DEBUG tokio-runtime-worker ThreadId(08) zenoh_plugin_ros2dds::discovery_mgr: Received ros_discovery_info from participant 0110f67ef9b53b4bb922bd26000001c1 with nodes: [/_ros2cli_10219]
2024-11-20T02:28:40.290576Z DEBUG ThreadId(12) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110f67ef9b53b4bb922bd2600000804 from Participant 0110f67ef9b53b4bb922bd26000001c1 on rt/diagnostics with type diagnostic_msgs::msg::dds_::DiagnosticArray_ (keyless: true)
2024-11-20T02:28:40.304844Z DEBUG tokio-runtime-worker ThreadId(08) zenoh_plugin_ros2dds::route_subscriber: Route Subscriber (Zenoh:diagnostics -> ROS:/diagnostics) activate
2024-11-20T02:28:40.304840Z DEBUG tokio-runtime-worker ThreadId(08) zenoh_plugin_ros2dds::route_subscriber: Route Subscriber (Zenoh:diagnostics -> ROS:/diagnostics) now serving local nodes {"/_ros2cli_10219"}
2024-11-20T02:28:40.304831Z  INFO tokio-runtime-worker ThreadId(08) zenoh_plugin_ros2dds: Node /_ros2cli_10219 declares Subscriber /diagnostics: diagnostic_msgs/msg/DiagnosticArray - Allowed
2024-11-20T02:28:40.304789Z DEBUG tokio-runtime-worker ThreadId(08) zenoh_plugin_ros2dds::discovered_entities: ROS Node /_ros2cli_10219 declares a new Reader on rt/diagnostics
2024-11-20T02:28:40.305016Z DEBUG tokio-runtime-worker ThreadId(08) zenoh::net::routing::dispatcher::resource: Register resource @/a87c89febb0205556e67db38a53489b0/@ros2_lv/MS/diagnostics/diagnostic_msgs§msg§DiagnosticArray/:1::0,5
2024-11-20T02:28:40.304989Z DEBUG tokio-runtime-worker ThreadId(08) zenoh::net::routing::dispatcher::token: Face{1, a87c89febb0205556e67db38a53489b0} Declare token 130 (@/a87c89febb0205556e67db38a53489b0/@ros2_lv/MS/diagnostics/diagnostic_msgs§msg§DiagnosticArray/:1::0,5)
2024-11-20T02:28:40.304924Z DEBUG tokio-runtime-worker ThreadId(08) zenoh::net::routing::dispatcher::resource: Register resource diagnostics
2024-11-20T02:28:40.304903Z DEBUG tokio-runtime-worker ThreadId(08) zenoh::net::routing::dispatcher::pubsub: Face{1, a87c89febb0205556e67db38a53489b0} Declare subscriber 129 (diagnostics)

Logs of zenoh_bridge_ros2dds on host B

Logs right after zenoh_bridge_ros2dds is launched.

2024-11-20T11:18:29+09:00 - [INFO] [zenoh_bridge_ros2dds-6]: process started with pid [85]
2024-11-20T02:18:29.004303Z  INFO tokio-runtime-worker ThreadId(09) zenoh_plugin_ros2dds: ROS2 plugin Config { namespace: "/", nodename: "zenoh_bridge_ros2dds", domain: 10, ros_localhost_only: false, allowance: None, pub_max_frequencies: [], transient_local_cache_multiplier: 10, queries_timeout: None, reliable_routes_blocking: true, pub_priorities: [], work_thread_num: 2, max_block_thread_num: 50, __required__: None, __path__: None }
2024-11-20T02:18:29.004211Z  INFO main ThreadId(01) zenoh::net::runtime::orchestrator: zenohd listening scout messages on 224.0.0.224:7446
2024-11-20T02:18:29.004086Z  INFO main ThreadId(01) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/192.168.0.2:7447
2024-11-20T02:18:30.185649Z  INFO tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds: Remote bridge a87c89febb0205556e67db38a53489b0 announces Publisher tf_static

Logs after running ros2 topic echo /diagnostics on host A.

2024-11-20T02:28:08.436883Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds: Node /_ros2cli_256 declares Publisher /parameter_events: rcl_interfaces/msg/ParameterEvent - Allowed
2024-11-20T02:28:08.436857Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds: Node /_ros2cli_256 declares Publisher /rosout: rcl_interfaces/msg/Log - Allowed
2024-11-20T02:28:08.436757Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds::discovered_entities: Discovered ROS Node /_ros2cli_256
2024-11-20T02:28:08.941961Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds::discovered_entities: Undiscovered ROS Node /_ros2cli_256
2024-11-20T02:28:08.842606Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds: Node /_ros2cli_256 undeclares Publisher /rosout: rcl_interfaces/msg/Log - Allowed
2024-11-20T02:28:08.842553Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds: Node /_ros2cli_256 undeclares Publisher /parameter_events: rcl_interfaces/msg/ParameterEvent - Allowed
2024-11-20T02:28:08.639414Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds: Node /_ros2cli_daemon_10_85b832c30aca44568a066080a5f4acf3 declares Publisher /parameter_events: rcl_interfaces/msg/ParameterEvent - Allowed
2024-11-20T02:28:08.638812Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds: Node /_ros2cli_daemon_10_85b832c30aca44568a066080a5f4acf3 declares Publisher /rosout: rcl_interfaces/msg/Log - Allowed
2024-11-20T02:28:40.309089Z  INFO tokio-runtime-worker ThreadId(02) zenoh_plugin_ros2dds: Remote bridge a87c89febb0205556e67db38a53489b0 announces Subscriber diagnostics
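Several of the excerpts above are not in timestamp order (for example, the 02:18:30.176945Z line appears before the 02:18:30.176734Z line), which makes the sequence of events harder to follow. The following is a minimal sketch, assuming every relevant line starts with an ISO 8601 timestamp that carries a timezone, as the lines in this thread do. It keeps only lines containing a keyword and orders them chronologically. The function names are hypothetical.

```python
from datetime import datetime

def parse_timestamp(line):
    """Parse the leading ISO 8601 timestamp of a log line.

    Returns None when the first token is not a timezone-aware timestamp.
    """
    parts = line.split(maxsplit=1)
    if not parts:
        return None
    token = parts[0]
    if token.endswith("Z"):
        # Older Python versions do not accept a trailing "Z".
        token = token[:-1] + "+00:00"
    try:
        stamp = datetime.fromisoformat(token)
    except ValueError:
        return None
    # Naive and aware datetimes cannot be compared, so skip naive ones.
    return stamp if stamp.tzinfo is not None else None

def sorted_matching(lines, keyword=""):
    """Return the lines containing `keyword`, ordered by timestamp."""
    stamped = []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None and keyword in line:
            stamped.append((stamp, line))
    stamped.sort(key=lambda pair: pair[0])
    return [line for _, line in stamped]
```

For example, calling sorted_matching(open("bridge_a.log").read().splitlines(), "diagnostics") on a saved log (the file name is hypothetical) would list the /diagnostics-related lines in the order they were emitted, without requiring the full log to be shared.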

JEnoch commented 1 day ago

Looking at your logs of zenoh_bridge_ros2dds on host A:

So it seems the issue lies in the DDS discovery between the bridge on host A and your diagnostic_aggregator Node. To investigate further, you should activate CycloneDDS logging for the discovery category on both the bridge and the diagnostic_aggregator Node. You can do this by defining this environment variable for each: CYCLONEDDS_URI='<Tracing><Category>discovery</><Verbosity>info</><Out>stdout</></>'
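One way to apply this setting to a single process without changing the surrounding shell is to launch that process from a small wrapper. The sketch below builds an environment that contains the CYCLONEDDS_URI value quoted above together with the RUST_LOG=z=debug setting requested earlier in this thread. The helper names and the command line in the comment are hypothetical; substitute the command you actually use.

```python
import os
import subprocess

# Values quoted from this thread.
RUST_LOG = "z=debug"
CYCLONEDDS_URI = (
    "<Tracing><Category>discovery</><Verbosity>info</><Out>stdout</></>"
)

def debug_env(base=None):
    """Return a copy of `base` (default: os.environ) with both variables set."""
    env = dict(os.environ if base is None else base)
    env["RUST_LOG"] = RUST_LOG              # bridge-side debug logging
    env["CYCLONEDDS_URI"] = CYCLONEDDS_URI  # CycloneDDS discovery tracing
    return env

def run_with_debug_logging(command):
    """Run `command` (a list of arguments) with the debug environment."""
    return subprocess.run(command, env=debug_env(), check=False)

# Hypothetical usage:
# run_with_debug_logging(["zenoh_bridge_ros2dds", "-c", "bridge_config.json5"])
```

RUST_LOG only affects the bridge; setting it for the ROS node as well should be harmless, but only CYCLONEDDS_URI is needed there.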

If the bridge is correctly receiving the SEDP message (DDS discovery message) for the /diagnostics topic, you should see a log line like this:

1732558656.302033 [0] dq.builtin: SEDP ST0 110f30f:d653004:71b94e83:1504 reliable volatile reader unnamed: (default).rt/diagnostics/diagnostic_msgs::msg::dds_::DiagnosticArray_ NEW (as udp/239.255.0.1:7401@1 udp/127.0.0.1:57630@1 ssm=0) QOS={user_data=0<>,topic_name="rt/diagnostics"   ...

If the Node is correctly announcing its Subscriber, you should see log lines like these:

1732558964.004424 [0]   42929409: new_reader(guid 11053f2:86813d58:5a9a2766:1504, (default).rt/diagnostics/diagnostic_msgs::msg::dds_::DiagnosticArray_)
1732558964.004434 [0]   42929409: READER 11053f2:86813d58:5a9a2766:1504 QOS={user_data=0<>,topic_name="rt/diagnostics" ...
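To check a captured trace for these two kinds of lines, a short script along the following lines may help. It is only a sketch: the substring checks are derived from the sample lines above, the exact trace format may differ between CycloneDDS versions, and the function name is hypothetical. The DDS topic name rt/diagnostics corresponds to the ROS topic /diagnostics, as seen in the bridge logs earlier in the thread.

```python
def reader_events(trace_lines, dds_topic):
    """Classify trace lines that mention a reader on `dds_topic`.

    Returns a dict with two lists:
      - "sedp_received": SEDP announcements of a new remote reader
      - "local_reader_created": new_reader(...) lines from the local process
    Note: this is a plain substring match, so "rt/diagnostics" would also
    match a longer topic name such as "rt/diagnostics_agg".
    """
    events = {"sedp_received": [], "local_reader_created": []}
    for line in trace_lines:
        if dds_topic not in line:
            continue
        if "SEDP" in line and " reader " in line and " NEW " in line:
            events["sedp_received"].append(line)
        elif "new_reader(" in line:
            events["local_reader_created"].append(line)
    return events

def summarize(trace_lines, dds_topic="rt/diagnostics"):
    """Return a one-line summary for quick inspection."""
    found = reader_events(trace_lines, dds_topic)
    return "{}: {} SEDP reader announcement(s), {} local new_reader line(s)".format(
        dds_topic, len(found["sedp_received"]), len(found["local_reader_created"])
    )
```

Run against the bridge's trace, a zero count of SEDP announcements would suggest that the bridge never received the discovery message; run against the Node's trace, a zero count of new_reader lines would suggest that the Node never created its reader.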