ros / dynamic_reconfigure

BSD 3-Clause "New" or "Revised" License

Parameter description messages serialising to incorrect length on Arm #176

Open ashnap123 opened 3 years ago

ashnap123 commented 3 years ago

Issue

When the parameter description message for a custom config is serialised, the result is far larger than expected, which saturates the network every time a client connects. The issue appears only when a config has multiple parameters of the same type and the server runs on ARM in a Release build.

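| Language | Arch | Build   | Serialised Bytes | Deserialised Message |
|----------|------|---------|------------------|----------------------|
| C++      | X86  | Release | 335B             | Valid                |
| C++      | ARM  | Release | 660MB            | Valid                |
| C++      | ARM  | Debug   | 355B             | Valid                |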

Steps To Recreate

ROS Distro - noetic

A test package that can be used to recreate this issue is provided here: https://github.com/ashnap123/dynamic_reconfigure_test

The following is the simplest way to recreate the issue. It must be run on ARM, and it is included as a test in the above package.

catkin_make run_tests -DCMAKE_BUILD_TYPE=Release dynamic_test

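#include <gtest/gtest.h>
#include <ros/serialization.h>
#include <dynamic_test/ExampleBrokenConfig.h>  // header generated by dynamic_reconfigure

TEST(ParameterDescriptionSerialsation, test_serialisation_multiple_parameters) {
    // The description message should serialise to a few hundred bytes, not megabytes.
    auto description = dynamic_test::ExampleBrokenConfig::__getDescriptionMessage__();
    auto serialisationLength = ros::serialization::serializationLength(description);
    EXPECT_LT(serialisationLength, 1024u);
}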

Also, when running the tests for the dynamic_reconfigure package on ARM in Release, the following error is logged, which is assumed to be the same issue:

[ERROR] [1621350696.145575854]: a message of over a gigabyte was predicted in tcpros. that seems highly unlikely, so I'll assume protocol synchronization is lost.
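
For context on the error above: TCPROS frames each message with a 4-byte little-endian length prefix, so a wildly wrong serialised length shows up to the receiver as an implausible prefix. The following is only a minimal sketch of that kind of sanity check, not the roscpp implementation. The names `decodeLengthPrefix`, `isPlausibleLength`, and `kMaxPlausibleLength` are hypothetical, and the one-gigabyte threshold is an assumption inferred from the wording of the log message.

```cpp
#include <cstdint>

// Assumption: a limit of about one gigabyte, inferred from the log text
// ("a message of over a gigabyte"). The exact value used by roscpp is not
// quoted in this thread.
constexpr std::uint32_t kMaxPlausibleLength = 1000000000u;

// Decode a 4-byte little-endian length prefix into a host-order integer.
// Hypothetical helper for illustration only.
inline std::uint32_t decodeLengthPrefix(const std::uint8_t* bytes) {
    return static_cast<std::uint32_t>(bytes[0]) |
           (static_cast<std::uint32_t>(bytes[1]) << 8) |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
}

// Report whether a decoded length is small enough to be believable.
inline bool isPlausibleLength(std::uint32_t length) {
    return length <= kMaxPlausibleLength;
}
```

With this sketch, a prefix of `4F 01 00 00` decodes to 335 bytes and passes the check, while a prefix such as `FF FF FF 7F` decodes to roughly 2 GB and would be rejected.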

venabled commented 3 years ago

Just for reference, we've seen what I assume is the result of this in production on arm64.

Ecophagy commented 2 years ago

This looks like it could be a GCC compiler issue, the same as https://github.com/ros/ros_comm/issues/2197 & https://github.com/ros/roscpp_core/issues/130

The workaround would therefore be to upgrade GCC or to avoid building in Release mode (that is, to use -O2 instead of the implicit -O3).

peci1 commented 1 year ago

PR https://github.com/ros/roscpp_core/pull/136 should fix this. I'm looking for someone who could verify it. Please note that Focal now ships GCC 9.4 by default, with which I could not reproduce the issue. So the test would need to be done with GCC 9.3 installed explicitly and dynamic_reconfigure built from source.
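
Anyone verifying the fix may want to confirm which compiler actually built the binaries. The following is a rough sketch that uses GCC's predefined version macros. The helper names `isSuspectGcc` and `builtWithSuspectGcc` are hypothetical, and treating every 9.x release before 9.4 as suspect is an assumption: this thread only reports that 9.3 reproduces the issue and 9.4 does not.

```cpp
// Hypothetical helper: true for GCC 9.x releases older than 9.4.
// Assumption: the thread only confirms 9.3 (affected) and 9.4 (not reproducible).
constexpr bool isSuspectGcc(int major, int minor) {
    return major == 9 && minor < 4;
}

// Check the compiler that is building this translation unit.
// Clang also defines __GNUC__, so it is excluded explicitly.
constexpr bool builtWithSuspectGcc() {
#if defined(__GNUC__) && !defined(__clang__)
    return isSuspectGcc(__GNUC__, __GNUC_MINOR__);
#else
    return false;
#endif
}
```

A test run could print or assert on `builtWithSuspectGcc()` before interpreting the serialisation length, so that a pass on GCC 9.4 is not mistaken for confirmation of the fix.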