osrf / docker_images

A repository to hold definitions of docker images maintained by OSRF
Apache License 2.0

`xmlrpc` ResponseError when echoing topics with `ros:humble` containers #761

Open civerachb-cpr opened 4 months ago

civerachb-cpr commented 4 months ago

I've got a Humble Docker container that raises `xmlrpc.client.ResponseError` when I try to echo topics. It makes no difference whether the topics are published inside or outside the container.

This started happening on July 3; before that, topics echoed correctly. There was a new release on or around July 2, so depending on exactly when that release became available and when I built the first image to exhibit the error, the two could be related. I was definitely able to echo topics with images built early in the morning (EDT) on July 2, but things have been broken since I pulled the latest `ros:humble` image.

Exception details:

$ docker exec -it ${CONTAINER_HASH} bash
root@HOST:/# source ros_entrypoint.sh 
root@HOST:/# ros2 topic echo /test
Traceback (most recent call last):
  File "/opt/ros/humble/bin/ros2", line 33, in <module>
    sys.exit(load_entry_point('ros2cli==0.18.10', 'console_scripts', 'ros2')())
  File "/opt/ros/humble/lib/python3.10/site-packages/ros2cli/cli.py", line 91, in main
    rc = extension.main(parser=parser, args=args)
  File "/opt/ros/humble/lib/python3.10/site-packages/ros2topic/command/topic.py", line 41, in main
    return extension.main(args=args)
  File "/opt/ros/humble/lib/python3.10/site-packages/ros2topic/verb/echo.py", line 220, in main
    qos_profile = self.choose_qos(node, args)
  File "/opt/ros/humble/lib/python3.10/site-packages/ros2topic/verb/echo.py", line 146, in choose_qos
    pubs_info = node.get_publishers_info_by_topic(args.topic_name)
  File "/usr/lib/python3.10/xmlrpc/client.py", line 1122, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python3.10/xmlrpc/client.py", line 1464, in __request
    response = self.__transport.request(
  File "/usr/lib/python3.10/xmlrpc/client.py", line 1166, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python3.10/xmlrpc/client.py", line 1182, in single_request
    return self.parse_response(resp)
  File "/usr/lib/python3.10/xmlrpc/client.py", line 1348, in parse_response
    p.feed(data)
  File "/usr/lib/python3.10/xmlrpc/client.py", line 451, in feed
    self._parser.Parse(data, False)
  File "../Modules/pyexpat.c", line 416, in StartElement
  File "/usr/lib/python3.10/xmlrpc/client.py", line 689, in start
    raise ResponseError("unknown tag %r" % tag)
xmlrpc.client.ResponseError: ResponseError("unknown tag 'rclpy.type_hash.TypeHash'")
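
For anyone who wants to reproduce the failure mode in isolation: the last frames of the traceback are plain standard-library `xmlrpc.client` code, so the same exception can be triggered without ROS at all. The sketch below is an assumption about the mechanism, not a confirmed diagnosis. It hand-writes an XML-RPC response whose `<value>` holds an element named after the tag in the error message, and shows that the stock parser rejects it. The response bodies and the helper name `parse_error` are made up for illustration; they are not captured from the container.

```python
import xmlrpc.client

# A well-formed XML-RPC response using only standard value types.
GOOD_RESPONSE = (
    "<?xml version='1.0'?>"
    "<methodResponse><params><param>"
    "<value><string>ok</string></value>"
    "</param></params></methodResponse>"
)

# Hypothetical response: the <value> holds an element that the stock
# xmlrpc.client unmarshaller has no handler for. The tag name is copied
# from the error message in the traceback; the body is invented.
BAD_RESPONSE = (
    "<?xml version='1.0'?>"
    "<methodResponse><params><param>"
    "<value><rclpy.type_hash.TypeHash>x</rclpy.type_hash.TypeHash></value>"
    "</param></params></methodResponse>"
)


def parse_error(xml_text):
    """Return the ResponseError raised while parsing, or None on success."""
    try:
        xmlrpc.client.loads(xml_text)
    except xmlrpc.client.ResponseError as exc:
        return exc
    return None


print(parse_error(GOOD_RESPONSE))
print(repr(parse_error(BAD_RESPONSE)))
```

If this is what is happening here, the side producing the response and the side parsing it disagree about which types can be sent, which would be consistent with a version mismatch, though nothing in the traceback alone confirms that.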

The `/test` topic above was generated with `ros2 topic pub` inside the container:

$ docker exec -it ${CONTAINER_HASH} bash
root@HOST:/# source /ros_entrypoint.sh 
root@HOST:/# ros2 topic pub /test std_msgs/String 'data: "This is a test"' -r 1
publisher: beginning loop
publishing #1: std_msgs.msg.String(data='This is a test')
...

The Docker build files and the relevant files mounted as volumes are below:

Dockerfile:

FROM ros:humble

# Add ROS 2 testing server (create3 republisher is not on main yet)
RUN echo "deb [ signed-by=/usr/share/keyrings/ros2-latest-archive-keyring.gpg ] http://packages.ros.org/ros2-testing/ubuntu jammy main" > /etc/apt/sources.list.d/ros2-testing.list

# install ros packages
RUN apt-get update && apt-get install -y \
      ros-${ROS_DISTRO}-irobot-create-msgs \
      ros-${ROS_DISTRO}-create3-republisher \
      git \
      nano && \
    rm -rf /var/lib/apt/lists/*

I've tried another image that does not include the testing server, and it too is exhibiting the same bug when I try to echo topics inside the container. So it's not the presence of the testing server that's causing this.

build.sh

#!/bin/bash

docker build -t create3-republisher .

Pretty standard build for a container.

docker-compose.yml

services:
  create3-republisher:
    image: create3-republisher:latest
    restart: no
    env_file:
      - config/create3.env
    command: ros2 launch create3_republisher create3_republisher_launch.py robot_ns:=/ republisher_ns:=/create3_repub
    profiles:
      - republisher
    volumes:
      - ./config:/opt/config
    network_mode: host

I'm using `docker compose --profile republisher up` to start my container.

config/create3.env:

FASTRTPS_DEFAULT_PROFILES_FILE=/opt/config/fastrtps-repub.xml
export ROBOT_NAMESPACE=
export ROS_DOMAIN_ID=0
export ROS_DISCOVERY_SERVER=
export RMW_IMPLEMENTATION=rmw_fastrtps_cpp

config/fastrtps-repub.xml

<?xml version="1.0" encoding="UTF-8" ?>
<dds>
    <profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
        <participant profile_name="turtlebot4_republisher_profile" is_default_profile="true">
            <rtps>
                <builtin>
                    <initialPeersList>
                        <locator>
                            <udpv4>
                                <address>127.0.0.1</address>
                            </udpv4>
                            <udpv4>
                                <address>192.168.186.2</address>
                            </udpv4>
                        </locator>
                    </initialPeersList>
                </builtin>
            </rtps>
        </participant>
    </profiles>
</dds>

The DDS profile is meant to keep traffic from the Turtlebot4 from going anywhere other than the Raspberry Pi the container is running on; the goal is for the Pi to republish all of the Create3 topics, services, and actions using this container.
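
As a quick sanity check on which peers the profile actually lists, the addresses can be pulled out with Python's standard `xml.etree.ElementTree`. This is a minimal sketch: the helper name `initial_peer_addresses` is hypothetical, and the XML is a condensed copy of the profile above. It only reads the file and says nothing about how Fast DDS itself interprets it.

```python
import xml.etree.ElementTree as ET

# Namespace declared on <profiles> in the profile above.
FASTRTPS_NS = "http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles"

# Condensed copy of config/fastrtps-repub.xml from this report.
PROFILE_XML = """<?xml version="1.0" encoding="UTF-8" ?>
<dds>
    <profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
        <participant profile_name="turtlebot4_republisher_profile" is_default_profile="true">
            <rtps>
                <builtin>
                    <initialPeersList>
                        <locator>
                            <udpv4><address>127.0.0.1</address></udpv4>
                            <udpv4><address>192.168.186.2</address></udpv4>
                        </locator>
                    </initialPeersList>
                </builtin>
            </rtps>
        </participant>
    </profiles>
</dds>
"""


def initial_peer_addresses(xml_text):
    """Collect every <address> found under any <initialPeersList>."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    addresses = []
    # Elements below <profiles> inherit its default namespace, so the
    # tag names must be qualified when searching.
    for peers in root.iter(f"{{{FASTRTPS_NS}}}initialPeersList"):
        for address in peers.iter(f"{{{FASTRTPS_NS}}}address"):
            addresses.append((address.text or "").strip())
    return addresses


print(initial_peer_addresses(PROFILE_XML))
```

To check the real file instead, read it from `/opt/config/fastrtps-repub.xml` inside the container (the path set in `config/create3.env`) and pass its contents to the same function.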

civerachb-cpr commented 4 months ago

Aha! I solved my own problem. For anyone in the future who finds this, the Python crash appears to be related to the shared-memory (SHM) transport. Modifying my Docker container's DDS profile to

<?xml version="1.0" encoding="UTF-8" ?>
<dds>
    <profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
        <transport_descriptors>
            <transport_descriptor>
               <transport_id>CustomUdpTransport</transport_id>
               <type>UDPv4</type>
            </transport_descriptor>
        </transport_descriptors>

        <participant profile_name="turtlebot4_republisher_profile" is_default_profile="true">
            <rtps>
                <userTransports>
                    <transport_id>CustomUdpTransport</transport_id>
                    <initialPeersList>
                        <locator>
                            <udpv4>
                                <address>127.0.0.1</address>
                            </udpv4>
                            <udpv4>
                                <address>192.168.186.2</address>
                            </udpv4>
                        </locator>
                    </initialPeersList>
                </userTransports>
                <useBuiltinTransports>false</useBuiltinTransports>
            </rtps>
        </participant>
    </profiles>
</dds>

appears to have fixed the issue; I'm now able to echo topics without crashing.


EDIT

After rebuilding the containers, it turns out the above did not actually fix the issue; I'm still seeing the exact same Python error when trying to echo a topic inside the container, even with the DDS profile above.