ros2 / rclpy

rclpy (ROS Client Library for Python)
Apache License 2.0

failed to create domain error when spawning many python nodes at once from launch file with cyclonedds #1212

Closed firesurfer closed 5 months ago

firesurfer commented 5 months ago

Bug report

I have a launch file that starts a fairly large number of Python nodes (~15-20). For some of those nodes I get an error like this:

[spawner-33] 1706000264.234912 [5]    spawner: Failed to find a free participant index for domain 5
[spawner-33] [ERROR] [1706000264.234976868] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error
[spawner-33] 
[spawner-33] >>> [rcutils|error_handling.c:108] rcutils_set_error_state()
[spawner-33] This error state is being overwritten:
[spawner-33] 
[spawner-33]   'error not set, at ./src/rcl/node.c:262'
[spawner-33] 
[spawner-33] with this new error message:
[spawner-33] 
[spawner-33]   'rcl node's rmw handle is invalid, at ./src/rcl/node.c:433'
[spawner-33] 
[spawner-33] rcutils_reset_error() should be called after error handling to avoid this.
[spawner-33] <<<
[spawner-33] [ERROR] [1706000264.235045107] [rcl]: Failed to fini publisher for node: 1
[spawner-33] Traceback (most recent call last):
[spawner-33]   File "/opt/ros/iron/lib/controller_manager/spawner", line 33, in <module>
[spawner-33]     sys.exit(load_entry_point('controller-manager==3.21.2', 'console_scripts', 'spawner')())
[spawner-33]   File "/opt/ros/iron/lib/python3.10/site-packages/controller_manager/spawner.py", line 207, in main
[spawner-33]     node = Node("spawner_" + controller_names[0])
[spawner-33]   File "/opt/ros/iron/lib/python3.10/site-packages/rclpy/node.py", line 185, in __init__
[spawner-33]     self.__node = _rclpy.Node(
[spawner-33] rclpy._rclpy_pybind11.RCLError: error creating node: rcl node's rmw handle is invalid, at ./src/rcl/node.c:433

The reason I submitted this in the rclpy repository is that it only seems to happen for Python nodes (perhaps because there are so many of them?). The exact nodes that fail during startup change between runs.

Required Info:

Steps to reproduce issue

Have a launch file where you start many Python nodes at once. In my case I have a lot of controller spawners from ros2_control:

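from launch_ros.actions import Node

# One of many controller spawners started from the same launch file.
servo_status_spawner = Node(
    package="controller_manager",
    executable="spawner",
    arguments=["status_controller_servo",
               "--controller-manager", "/controller_manager"],
)

# ...and many more spawners like this one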

Expected behavior

All nodes should start.

Actual behavior

For at least 4-5 nodes I get an error:

[spawner-33] 1706000264.234912 [5]    spawner: Failed to find a free participant index for domain 5
[spawner-33] [ERROR] [1706000264.234976868] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error

(See above for full error log)

Additional information

The setup runs in a podman container. I will test it this week in a native installation. Environment settings:

export ROS_DOMAIN_ID=5
source /opt/ros/iron/setup.bash
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
export ROS_AUTOMATIC_DISCOVERY_RANGE=LOCALHOST

clalancette commented 5 months ago

This is likely another case of https://github.com/ros2/rmw_cyclonedds/issues/458 .

firesurfer commented 5 months ago

@clalancette I can confirm this.

The solution presented in: https://github.com/ros2/rmw_cyclonedds/issues/458#issuecomment-1628823859 worked for me.

To be precise, I had to use the first line:

export CYCLONEDDS_URI='<CycloneDDS><Domain><Discovery><ParticipantIndex>none</ParticipantIndex></Discovery></Domain></CycloneDDS>'
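
For anyone who prefers to set this from Python rather than from the shell, here is a minimal sketch that builds the same inline XML and puts it into an environment dictionary. The helper names `build_cyclonedds_uri` and `env_with_cyclonedds_uri` are hypothetical and not part of rclpy, launch, or Cyclone DDS. Only the XML element names and the value `none` come from the export line above.

```python
import os
import xml.etree.ElementTree as ET


def build_cyclonedds_uri(participant_index: str = "none") -> str:
    """Return inline Cyclone DDS XML that sets Discovery/ParticipantIndex.

    Hypothetical helper: it only reproduces the XML string from the
    export line above.
    """
    root = ET.Element("CycloneDDS")
    domain = ET.SubElement(root, "Domain")
    discovery = ET.SubElement(domain, "Discovery")
    ET.SubElement(discovery, "ParticipantIndex").text = participant_index
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


def env_with_cyclonedds_uri(base_env=None) -> dict:
    """Return a copy of the environment with CYCLONEDDS_URI added.

    Hypothetical helper: copies os.environ (or the given mapping) so the
    caller's environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CYCLONEDDS_URI"] = build_cyclonedds_uri()
    return env


if __name__ == "__main__":
    print(env_with_cyclonedds_uri({"ROS_DOMAIN_ID": "5"})["CYCLONEDDS_URI"])
```

The resulting dictionary could be passed as `env=` to `subprocess.Popen`. In a launch file, `launch.actions.SetEnvironmentVariable` might be used with the same string, but I have not tested that in this setup.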

The second suggestion, which also enables multicast, didn't work for me; with it I got the following error message:

[spawner-32] 1706083268.665794 [5]    spawner: selected interface "lo" is not multicast-capable: disabling multicast
[spawner-32] 1706083268.667774 [5]    spawner: Failed to find a free participant index for domain 5

clalancette commented 5 months ago

> @clalancette I can confirm this.

Thanks. I'm going to close this one in favor of that one.