ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0

:farmer: Flaky test `test_executors_timer_cancel_behavior` #2452

Closed: Crola1702 closed this 3 months ago

Crola1702 commented 3 months ago

Bug report

Required Info:

Steps to reproduce issue

  1. Run a build in a Windows job (preferably windows_repeated)
  2. See the rclcpp.test_executors_timer_cancel_behavior test fail

Expected behavior

Test should pass

Actual behavior

rclcpp.test_executors_timer_cancel_behavior test is failing

Additional information

Reference build: https://ci.ros2.org/view/nightly/job/nightly_win_rep/3297/

Test regression:

Log output:

[ RUN      ] TestTimerCancelBehavior/MultiThreadedExecutor.testBothTimerCancelThenResetT2Behavior
C:\ci\ws\src\ros2\rclcpp\rclcpp\test\rclcpp\executors\test_executors_timer_cancel_behavior.cpp(398): error: Expected: (std::abs(t1_runs_initial - t2_runs_initial)) <= (1), actual: 2 vs 1

[  FAILED  ] TestTimerCancelBehavior/MultiThreadedExecutor.testBothTimerCancelThenResetT2Behavior, where TypeParam = rclcpp::executors::MultiThreadedExecutor (2526 ms)
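
For readers unfamiliar with the failing check, the log above says the two timers' run counts were expected to differ by at most 1 but differed by 2. A minimal sketch of that comparison follows. The function name `run_counts_within_tolerance` and its parameters are hypothetical and are not taken from the rclcpp test; the real test presumably makes this comparison through a test-framework assertion at line 398 of the test file.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the comparison reported in the log:
// std::abs(t1_runs_initial - t2_runs_initial) <= 1.
// Returns true when both timers fired roughly the same number of times.
bool run_counts_within_tolerance(int t1_runs, int t2_runs, int tolerance = 1)
{
  assert(tolerance >= 0);  // a negative tolerance would make the check meaningless
  return std::abs(t1_runs - t2_runs) <= tolerance;
}
```

With this sketch, a pair of run counts that differ by 2 (the "actual: 2 vs 1" case in the log) returns `false`, while counts that differ by 0 or 1 return `true`.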

Note: it fails with this output only some of the time (26% of failures). Other builds (example) fail silently.


Flakiness report:

| job_name | last_fail | first_fail | build_count | failure_count | failure_percentage |
| --- | --- | --- | --- | --- | --- |
| nightly_win_rep | 2024-03-18 | 2024-03-04 | 16 | 15 | 93.75 |
| nightly_win_rel | 2024-03-17 | 2024-03-03 | 15 | 5 | 33.33 |
| nightly_win_deb | 2024-03-16 | 2024-03-09 | 15 | 3 | 20.0 |

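
The `failure_percentage` column in the report above is consistent with `failure_count / build_count * 100`, rounded to two decimal places. The sketch below reproduces that arithmetic. The helper `failure_percentage` is hypothetical and is not part of the CI tooling that generated the report; the rounding rule is an assumption inferred from the three rows shown.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: share of builds that failed, as a percentage
// rounded to two decimal places (assumed to match the report's format).
double failure_percentage(int failure_count, int build_count)
{
  assert(build_count > 0);        // avoid division by zero
  assert(failure_count >= 0 && failure_count <= build_count);
  const double pct = 100.0 * failure_count / build_count;
  return std::round(pct * 100.0) / 100.0;
}
```

For example, `failure_percentage(15, 16)` gives 93.75 for nightly_win_rep, `failure_percentage(5, 15)` gives 33.33 for nightly_win_rel, and `failure_percentage(3, 15)` gives 20.0 for nightly_win_deb.
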
fujitatomoya commented 3 months ago

@Crola1702 I think the test simulation rate is still too sensitive on the Windows platform; https://github.com/ros2/rclcpp/pull/2453 should ease the rate. Could you check it out?

CC: @clalancette

fujitatomoya commented 3 months ago

@Crola1702 I will leave this to you; please close this once you have confirmed the stability.

Crola1702 commented 3 months ago

As I mentioned in this comment, the failures reproduce with the parallel executor and are still happening on the Windows nightlies. I'll leave this open for now.

fujitatomoya commented 3 months ago

@Crola1702 https://github.com/ros2/rclcpp/pull/2458 has been merged.

Crola1702 commented 3 months ago

Closing, as this no longer seems to be happening on the CI repeated jobs (run with the parallel executor and `--retest-until-fail 2|4`):