eProsima / Integration-Service

Apache License 2.0

ROS1 - ROS2 High latency for large message #169

Open MaxandreOgeret opened 3 years ago

MaxandreOgeret commented 3 years ago

Hello, I have set up the Integration Service to bridge ROS1 and ROS2 on my local machine.

Here is my configuration file:

systems:
  ros1: {type: ros1}
  ros2: {type: ros2}

routes:
  ros1_to_ros2: { from: ros1, to: ros2 }
  ros2_to_ros1: { from: ros2, to: ros1 }

topics:
  tmp_ros1_ros2: { type: "sensor_msgs/Temperature", route: ros1_to_ros2 }
  tmp_ros2_ros1: { type: "sensor_msgs/Temperature", route: ros2_to_ros1, ros1: { queue_size: 10, latch: false } }
  img_ros1_ros2: { type: "sensor_msgs/Image", route: ros1_to_ros2 }
  img_ros2_ros1: { type: "sensor_msgs/Image", route: ros2_to_ros1, ros1: { queue_size: 10, latch: false } }

And I run the Integration Service like so:

integration-service conf.yaml

It works well for small messages in both directions, but the latency gets very high for larger messages (more than 2 s for images).

I noticed that the integration-service executable always prints the content of the forwarded messages. For example:

[Integration Service][INFO] [is::sh::ROS2] Sending message from Integration Service to ROS 2 for topic 'tmp_ros1_ros2': [[ Structure: <sensor_msgs/Temperature>
    [header] Structure: <std_msgs/Header>
        [stamp] Structure: <stamp>
            [sec] <int32_t>  1635431021
            [nanosec] <uint32_t>  155081033
        [frame_id] <std::string>  
    [temperature] <double>  0
    [variance] <double>  0
 ]]

I believe that this output could pose a problem for larger messages. Do you know how to disable it, or am I missing something?

Thanks for the help !

gedeon1976 commented 2 years ago

Hi,

I was experimenting with SOSS/Integration Service using the WebSocket handle. In this case the default behavior is to print the data of the published and subscribed topics. It produces a lot of output, especially since I was testing compressed image reception from the image_transport ROS 2 package.

From the code I saw that there is a template in charge of generating the message conversion code for all used messages, in is-workspace/src/ROS2-SH/utils/ros2-mix-generator/resources/convert__msg.cpp.em

There you can change the default behavior, but I'm not sure whether there is a better alternative.

In convert__msg.cpp.em you can change the subscriber and publisher parts of the code.

On the subscriber side, change

    logger << utils::Logger::Level::INFO
           << "Received message: [[ " << data << " ]]" << std::endl;

to

    logger << utils::Logger::Level::INFO
           << "Received message: [[ ]]" << std::endl;

On the publisher side, change

    logger << utils::Logger::Level::INFO
           << "Sending message from Integration Service to ROS 2 for topic '" << _topic_name << "': "
           << "[[ " << message << " ]]" << std::endl;

to

    logger << utils::Logger::Level::INFO
           << "Sending message from Integration Service to ROS 2 for topic '" << _topic_name << "': "
           << "[[ ]]" << std::endl;

You then need to rebuild the is-workspace packages with colcon. After performing this change, the integration-service will no longer print the data of the received and transmitted topics.
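The edit above simply deletes the payload from an INFO-level log line. A cleaner variant of the same idea is to gate payload dumps behind a more verbose level that is off by default, so normal runs never pay for formatting large messages into text. Here is a minimal stand-alone sketch of that pattern (GatedLogger is a made-up class for illustration, not the actual is::utils::Logger API):

```cpp
#include <sstream>
#include <string>

// Sketch: route one-line notices to Info and full payload dumps to Debug,
// so a default (Info) threshold never serializes large message contents.
enum class Level { Error = 0, Warn = 1, Info = 2, Debug = 3 };

class GatedLogger
{
public:
    explicit GatedLogger(Level threshold)
        : threshold_(threshold)
    {
    }

    void log(Level level, const std::string& text)
    {
        // Drop anything more verbose than the configured threshold.
        if (static_cast<int>(level) <= static_cast<int>(threshold_))
        {
            out_ << text << '\n';
        }
    }

    std::string contents() const
    {
        return out_.str();
    }

private:
    Level threshold_;
    std::ostringstream out_;
};
```

With the threshold at Info, a "Sending message..." notice is kept while a "[[ ... ]]" payload dump logged at Debug is skipped entirely, and no template edit or rebuild is needed to toggle it.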

gedeon1976 commented 2 years ago

Another issue I'm experiencing is a WebSocket handle bottleneck due to the JSON message conversion to eProsima xtypes by the Integration Service. This is happening with compressed images.

This seems to happen only when the WebSocket handle is used, at least in what I have tested. I added some time measurements to the convert__msg.cpp.em template to show the duration between each received frame:

    void subscription_callback(
            const Ros2_Msg& msg)
    {
        auto start = std::chrono::steady_clock::now();
        logger << utils::Logger::Level::INFO
               << "Receiving message from ROS 2 for topic '" << _topic_name << "'" 
               << "time from last frame: " << std::chrono::duration_cast<std::chrono::milliseconds>(start - t_last).count() << "ms "
               << std::endl;

        xtypes::DynamicData data(_message_type);
        convert_to_xtype(msg, data);

        //logger << utils::Logger::Level::INFO
                //<< "Received message: [[ " << data << " ]]" << std::endl;
                //<< "Received message: [[ " " ]]" << std::endl;  // change to this to avoid data printing

        (*_callback)(data, nullptr);
        t_last = start;
    }

and added the following member variable to the class:

    std::chrono::time_point<std::chrono::steady_clock> t_last;
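One caveat: t_last is default-initialized, so the very first callback reports a meaningless interval. The measurement pattern can be sketched in isolation like this (a stand-alone toy with a made-up FrameTimer name, not Integration Service code), guarding the first tick explicitly:

```cpp
#include <chrono>

// Measure the interval between successive "frames" with a monotonic clock,
// mirroring the t_last bookkeeping in the subscription callback above.
class FrameTimer
{
public:
    // Returns milliseconds since the previous call (0 on the first call).
    long long tick()
    {
        const auto now = std::chrono::steady_clock::now();
        long long elapsed_ms = 0;
        if (started_)
        {
            elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                now - t_last_).count();
        }
        t_last_ = now;
        started_ = true;
        return elapsed_ms;
    }

private:
    std::chrono::time_point<std::chrono::steady_clock> t_last_{};
    bool started_ = false;
};
```

steady_clock is the right choice here (rather than system_clock) because it is monotonic, so the measured interval can never go negative if the wall clock is adjusted.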

For example, the Integration Service is able to connect to the ROS 2 topics with fairly good processing time over a Wi-Fi connection:

[screenshot: integration-service terminal output showing inter-frame times]

But when I connect from a webpage using roslibjs over the same Wi-Fi connection, a roughly 10x increase in processing time is introduced and video reception on the web becomes very slow, around 1-2 fps. You can see the difference after the client connects in the image below:

[screenshot: integration-service-websocket-sh terminal output after the web client connects]

I performed some profiling, and the issue seems to be related to the message conversion in convert_to_xtype(msg, data);

[screenshot: profiling results]

The profiling also shows that the following functions spend a lot of time:

    websocket::EndPoint::publish
    Websocket::JsonEncoding::encode_publication_msg()
    Eprosima::is::json_xtypes_to_json
    Eprosima::is::json_xtypes::add_json_node
    nlohmann::is::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, long, unsigned long, double, std::allocator, nlohmann::is::adl_serializer>::operator[]
    std::vector<nlohmann::is::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, long, unsigned long, double, std::allocator, nlohmann::is::adl_serializer>, std::allocator<nlohmann::is::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, long, unsigned long, double, std::allocator, nlohmann::is::adl_serializer>>>::insert

So, could somebody from eProsima check this behavior when using the WebSocket-SH handle?
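The hot functions in that profile are all per-node JSON construction (add_json_node, basic_json::operator[], vector<basic_json>::insert), which suggests the byte array of a compressed image is being turned into one JSON node per byte. A common mitigation in WebSocket bridges (rosbridge, for instance, ships uint8[] fields as base64 strings in JSON) is to encode the whole byte buffer as a single string, so each byte is touched once instead of becoming its own JSON object. A stand-alone sketch of such an encoder (base64_encode is an illustrative helper, not Integration Service code):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encode a byte buffer as a single base64 string in one pass over the data,
// instead of inserting one JSON node per byte.
std::string base64_encode(const std::vector<std::uint8_t>& data)
{
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4); // 4 output chars per 3 input bytes
    std::size_t i = 0;
    while (i + 3 <= data.size())
    {
        // Pack three bytes into 24 bits, then emit four 6-bit symbols.
        const std::uint32_t n =
            (data[i] << 16) | (data[i + 1] << 8) | data[i + 2];
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
        i += 3;
    }
    if (i + 1 == data.size()) // one trailing byte: two symbols plus "=="
    {
        const std::uint32_t n = data[i] << 16;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += "==";
    }
    else if (i + 2 == data.size()) // two trailing bytes: three symbols plus "="
    {
        const std::uint32_t n = (data[i] << 16) | (data[i + 1] << 8);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += '=';
    }
    return out;
}
```

Whether this is applicable inside json_xtypes_to_json would be for the eProsima maintainers to judge; the point is only that a bulk string encoding avoids the per-element basic_json allocations the profile points at.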