open-rmf / rmf_traffic

Traffic management libraries for RMF
Apache License 2.0

[CI] Reusable CI #89

Closed orensbruli closed 5 months ago

orensbruli commented 2 years ago

Modified build.yaml workflow to use reusable workflow.

mxgrey commented 2 years ago

I see that this test failure is occurring in the CI:

FAILED: CHECK( err == Approx(0.0) ) with expansion: 5.0 == Approx( 0.0 ) at /__w/rmf_traffic/rmf_traffic/ros_ws/src/7qjy5ts7y8m/rmf_traffic/rmf_traffic/test/unit/agv/test_Planner.cpp:1024 

and that's extremely alarming since I cannot reproduce that error locally, and it's not something that should be able to change no matter the distro.

One issue I'm noticing with the new CI pipeline is that all the test_result uploads are being written to the same file, and I think they're overwriting each other. So instead of getting a test_results report for each failed distro we only get a report for one, and we don't know which distro it was. I think we need to tweak the pattern of this parameter to make the output path be unique per distro test.
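
The actual fix belongs in the workflow's upload parameter, which is not shown in this thread. As a rough sketch of the idea only, the shell snippet below derives a results path that embeds the distro name, so that reports from different distro jobs cannot collide. The helper name `results_path_for`, the `ci_artifacts` base directory, and the distro list are all assumptions made for illustration.

```shell
# Sketch only: helper name, base directory, and distro list are hypothetical.

# Print a results path that is unique per distro.
# $1 = base directory, $2 = distro name
results_path_for() {
  printf '%s/test_results-%s\n' "$1" "$2"
}

# Each distro job gets its own path, so no upload overwrites another.
for distro in foxy galactic; do
  results_path_for "ci_artifacts" "$distro"
done
```

With a distinct path per job, a failing report can be traced back to the distro that produced it.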

orensbruli commented 2 years ago

> I see that this test failure is occurring in the CI:
>
> FAILED: CHECK( err == Approx(0.0) ) with expansion: 5.0 == Approx( 0.0 ) at /__w/rmf_traffic/rmf_traffic/ros_ws/src/7qjy5ts7y8m/rmf_traffic/rmf_traffic/test/unit/agv/test_Planner.cpp:1024
>
> and that's extremely alarming since I cannot reproduce that error locally, and it's not something that should be able to change no matter the distro.

Investigating.

> One issue I'm noticing with the new CI pipeline is that all the test_result uploads are being written to the same file, and I think they're overwriting each other. So instead of getting a test_results report for each failed distro we only get a report for one, and we don't know which distro it was. I think we need to tweak the pattern of this parameter to make the output path be unique per distro test.

Thank you for pointing this out! Fixed.

orensbruli commented 2 years ago

> I see that this test failure is occurring in the CI:
>
> FAILED: CHECK( err == Approx(0.0) ) with expansion: 5.0 == Approx( 0.0 ) at /__w/rmf_traffic/rmf_traffic/ros_ws/src/7qjy5ts7y8m/rmf_traffic/rmf_traffic/test/unit/agv/test_Planner.cpp:1024
>
> and that's extremely alarming since I cannot reproduce that error locally, and it's not something that should be able to change no matter the distro.

I can't reproduce this in a local Docker container either.

I'm doing a step-by-step migration from the old CI to the reusable one to see where the problem comes from. I'm doing it in this branch/PR: https://github.com/open-rmf/rmf_traffic/pull/93

I have found that the problem comes from setting the CXX: clang++ environment variable for the ros-tooling/action-ros-ci@v0.2 step, which the reusable build workflow sets by default for all the other RMF workflows.
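
When comparing builds with and without CXX=clang++, it can help to confirm which compiler a build tree actually used. CMake reads the CXX environment variable only the first time a build directory is configured and then stores the result in CMakeCache.txt, so re-running a build in an existing build directory after exporting CXX does not switch compilers. The helper below is a minimal sketch for reading that cached value. The name `cached_cxx_compiler` is hypothetical, and the build directory in the usage comment is an assumption.

```shell
# Sketch only: the helper name and the example build path are assumptions.

# Print the C++ compiler recorded in a CMake build directory's cache.
# $1 = build directory containing CMakeCache.txt
cached_cxx_compiler() {
  # Cache entries have the form NAME:TYPE=VALUE; print only the VALUE.
  sed -n 's/^CMAKE_CXX_COMPILER:[^=]*=//p' "$1/CMakeCache.txt"
}

# Example usage after a colcon build (path assumed):
#   cached_cxx_compiler build/rmf_traffic
```

If the printed path is not clang++, the build under test was not compiled with the compiler the environment asked for, which is worth ruling out before digging into the test itself.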

orensbruli commented 2 years ago

Start a container with docker run -it osrf/ros:galactic-desktop-focal bash, then run the following inside it:

cat <<EOT >> install.txt
sudo apt update && sudo apt install -y clang clang-tools lld wget python3-pip python3-colcon-coveragepy-result python3-colcon-lcov-result lcov
pip3 install cryptography==2.8
rosdep update --include-eol-distros
mkdir src
vcs import --force --recursive src/ --input https://raw.githubusercontent.com/open-rmf/rmf/main/rmf.repos
sudo apt-get update
rosdep install -r --from-paths src/rmf/ament_cmake_catch2/ament_cmake_catch2 src/rmf/rmf_utils/rmf_utils src/rmf/rmf_traffic/rmf_traffic --ignore-src --skip-keys 'rti-connext-dds-5.3.1 ' --rosdistro galactic -y
colcon mixin add default https://raw.githubusercontent.com/colcon/colcon-mixin-repository/master/index.yaml
colcon mixin update default
cd src/
# THIS SHOULD NOT BE HERE. BUT WHY DOES IT MAKE THE TEST FAIL?!
source /opt/ros/galactic/setup.sh && colcon build --symlink-install --packages-up-to rmf_traffic --event-handlers=console_cohesion+
export CC=clang
export CXX=clang++
export QT_QPA_PLATFORM=offscreen
source /opt/ros/galactic/setup.sh && colcon build --symlink-install --packages-up-to rmf_traffic --event-handlers=console_cohesion+
source /opt/ros/galactic/setup.sh && colcon lcov-result --initial
source /opt/ros/galactic/setup.sh && colcon test --event-handlers=console_cohesion+ --return-code-on-test-failure --packages-select rmf_traffic
EOT
bash install.txt

This fails with the following result, as shown by cat /src/build/rmf_traffic/test_results/rmf_traffic/test_rmf_traffic.catch2.xml:

   <testcase classname="test_rmf_traffic.global" name="Verify that FCL can handle continuous collisions" time="0.000" status="run">
      <failure message="result.is_collide" type="CHECK">
FAILED:
  CHECK( result.is_collide )
with expansion:
  false
at /src/rmf/rmf_traffic/rmf_traffic/test/unit/fcl_test.cpp:98
      </failure>
    </testcase>

Yadunund commented 1 year ago

colcon test for rmf_traffic is failing due to a timeout after 300s, even locally. @mxgrey are you okay with merging this in for now and relying on the logs to determine whether a CI failure is due to a build error or a test case timeout? I will open a ticket to look into the timeout issue.

Specifically, I think it's this test:

-------------------------------------------------------------------------------
Scenario: fan-in-fan-out bottleneck
      Given: 2 Participants
       When: Schedule:[], Negotiation:[p0(A->Z), p1(V->E)]
       Then: Valid Proposal is found
-------------------------------------------------------------------------------
/home/yadunund/ws_rmf/src/rmf/rmf_traffic/rmf_traffic/test/unit/agv/test_Negotiator.cpp:2314
...............................................................................

/home/yadunund/ws_rmf/src/rmf/rmf_traffic/rmf_traffic/test/unit/agv/test_Negotiator.cpp:2314: FAILED:
due to a fatal error condition:
  SIGINT - Terminal interrupt signal

Update: If I comment out that specific test case, all tests pass:

 yadunund@ubuntu-22-04:~/ws_rmf$ ./build/rmf_traffic/test_rmf_traffic 
===============================================================================
All tests passed (57653 assertions in 87 test cases)
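
To check a suspected slow test case without editing the source, one option is to run the test binary directly under a time limit. The sketch below wraps coreutils `timeout`, which exits with status 124 when the limit is reached. The helper name `run_with_limit` is hypothetical, and the usage comment assumes that the Catch2 test case name matches the "Scenario:" line printed above.

```shell
# Sketch only: the helper name is hypothetical.

# Run a command under a time limit and report how it ended.
# $1 = limit in seconds, remaining arguments = command to run
run_with_limit() {
  limit_seconds="$1"
  shift
  timeout "$limit_seconds" "$@"
  status=$?
  # coreutils timeout returns 124 when the command hit the limit.
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit_seconds}s"
  else
    echo "exited with status $status"
  fi
}

# Example usage (test case name assumed to match the Scenario line above):
#   run_with_limit 300 ./build/rmf_traffic/test_rmf_traffic "Scenario: fan-in-fan-out bottleneck"
```

Running only the suspected scenario this way separates a genuine hang in that test from a suite that is simply slow in aggregate.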