Closed orensbruli closed 5 months ago
I see that this test failure is occurring in the CI:
FAILED: CHECK( err == Approx(0.0) ) with expansion: 5.0 == Approx( 0.0 ) at /__w/rmf_traffic/rmf_traffic/ros_ws/src/7qjy5ts7y8m/rmf_traffic/rmf_traffic/test/unit/agv/test_Planner.cpp:1024
and that's extremely alarming since I cannot reproduce that error locally, and it's not something that should be able to change no matter the distro.
One issue I'm noticing with the new CI pipeline is that all the test_results uploads are being written to the same file, and I think they're overwriting each other. So instead of getting a test_results report for each failed distro we only get a report for one, and we don't know which distro it was. I think we need to tweak the pattern of this parameter to make the output path unique per distro test.
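As a rough sketch of the idea (not the actual workflow change), each job could embed its distro name in the path it uploads to. The helper name `results_dir_for`, the `ci_artifacts/` prefix, and the distro list below are assumptions made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: build an upload path that embeds the distro name,
# so results from different distro jobs cannot overwrite each other.
results_dir_for() {
  printf 'ci_artifacts/test_results-%s\n' "$1"
}

# Illustrative distro list (an assumption, not taken from the workflow).
for distro in foxy galactic rolling; do
  echo "$distro -> $(results_dir_for "$distro")"
done
```

With a pattern like this, a failed run leaves one report per distro, and the path itself says which distro produced it.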
Investigating.
Thank you for pointing this out! Fixed.
I can't reproduce this in a local Docker container either.
I'm doing a step-by-step migration from the old CI to the reusable one to see where the problem comes from. I'm doing it in this branch/PR: https://github.com/open-rmf/rmf_traffic/pull/93
I have found that the problem comes from setting the CXX: clang++ environment variable in ros-tooling/action-ros-ci@v0.2, which the reusable build workflow sets by default for all the other rmf workflows.
docker run -it osrf/ros:galactic-desktop-focal bash
+
cat <<EOT >> install.txt
sudo apt update && sudo apt install -y clang clang-tools lld wget python3-pip python3-colcon-coveragepy-result python3-colcon-lcov-result lcov
pip3 install cryptography==2.8
rosdep update --include-eol-distros
mkdir src
vcs import --force --recursive src/ --input https://raw.githubusercontent.com/open-rmf/rmf/main/rmf.repos
sudo apt-get update
rosdep install -r --from-paths src/rmf/ament_cmake_catch2/ament_cmake_catch2 src/rmf/rmf_utils/rmf_utils src/rmf/rmf_traffic/rmf_traffic --ignore-src --skip-keys 'rti-connext-dds-5.3.1 ' --rosdistro galactic -y
colcon mixin add default https://raw.githubusercontent.com/colcon/colcon-mixin-repository/master/index.yaml
colcon mixin update default
cd src/
# THIS SHOULD NOT BE HERE. BUT WHY DOES IT MAKE THE TEST FAIL?!
source /opt/ros/galactic/setup.sh && colcon build --symlink-install --packages-up-to rmf_traffic --event-handlers=console_cohesion+
export CC=clang
export CXX=clang++
export QT_QPA_PLATFORM=offscreen
source /opt/ros/galactic/setup.sh && colcon build --symlink-install --packages-up-to rmf_traffic --event-handlers=console_cohesion+
source /opt/ros/galactic/setup.sh && colcon lcov-result --initial
source /opt/ros/galactic/setup.sh && colcon test --event-handlers=console_cohesion+ --return-code-on-test-failure --packages-select rmf_traffic
EOT
bash install.txt
fails with:
cat /src/build/rmf_traffic/test_results/rmf_traffic/test_rmf_traffic.catch2.xml
<testcase classname="test_rmf_traffic.global" name="Verify that FCL can handle continuous collisions" time="0.000" status="run">
<failure message="result.is_collide" type="CHECK">
FAILED:
CHECK( result.is_collide )
with expansion:
false
at /src/rmf/rmf_traffic/rmf_traffic/test/unit/fcl_test.cpp:98
</failure>
</testcase>
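One way to check which compiler each build in the script above was actually configured with is to read it back from the CMake cache in the build directory. This is only a sketch: it assumes the build tree is at /src/build/rmf_traffic (as in the path above), and the helper name `cached_cxx_compiler` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of CMAKE_CXX_COMPILER stored in a
# CMakeCache.txt file. The trailing ':' in the pattern keeps related entries
# such as CMAKE_CXX_COMPILER_AR from matching.
cached_cxx_compiler() {
  sed -n 's/^CMAKE_CXX_COMPILER:[A-Z]*=//p' "$1"
}

# Assumed location of the cache for the rmf_traffic build.
cache=/src/build/rmf_traffic/CMakeCache.txt
if [ -f "$cache" ]; then
  echo "configured C++ compiler: $(cached_cxx_compiler "$cache")"
else
  echo "no CMake cache at $cache (build has not been configured yet)"
fi
```

As far as I understand the CMake documentation, the CXX environment variable is only consulted on the first configure and the result is then cached, so running this after each of the two colcon build calls should show whether the later export CXX=clang++ changed anything.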
colcon test for rmf_traffic is failing due to a timeout after 300s, even locally. @mxgrey are you okay with merging this in for now and relying on the logs to determine whether a CI failure is due to a build error or a test case timeout? I will open a ticket to look into the timeout issue.
Specifically, I think it's this test:
-------------------------------------------------------------------------------
Scenario: fan-in-fan-out bottleneck
Given: 2 Participants
When: Schedule:[], Negotiation:[p0(A->Z), p1(V->E)]
Then: Valid Proposal is found
-------------------------------------------------------------------------------
/home/yadunund/ws_rmf/src/rmf/rmf_traffic/rmf_traffic/test/unit/agv/test_Negotiator.cpp:2314
...............................................................................
/home/yadunund/ws_rmf/src/rmf/rmf_traffic/rmf_traffic/test/unit/agv/test_Negotiator.cpp:2314: FAILED:
due to a fatal error condition:
SIGINT - Terminal interrupt signal
Update: If I comment out that specific test case, all tests pass:
yadunund@ubuntu-22-04:~/ws_rmf$ ./build/rmf_traffic/test_rmf_traffic
===============================================================================
All tests passed (57653 assertions in 87 test cases)
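To confirm that a single scenario is what eats the 300s budget, the test binary can be run on just that test case under a time limit. A minimal sketch, assuming GNU coreutils `timeout` is available and that Catch2 accepts the scenario name as a test filter; the helper name `run_with_limit` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit (in seconds) and
# report when the limit was hit. `timeout` exits with status 124 on timeout.
run_with_limit() {
  limit="$1"
  shift
  timeout "$limit" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit}s: $*"
  fi
  return "$status"
}

# Binary path as in the run above, relative to the workspace root.
bin=./build/rmf_traffic/test_rmf_traffic
if [ -x "$bin" ]; then
  # Run only the suspected scenario, with the same 300s budget.
  run_with_limit 300 "$bin" "Scenario: fan-in-fan-out bottleneck" \
    || echo "exit status: $?"
fi
```

An exit status of 124 here would point at this scenario rather than at the suite as a whole.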
Modified build.yaml workflow to use reusable workflow.