Closed Blast545 closed 1 year ago
I'm unable to reproduce this issue on Windows.
It appeared again today: https://ci.ros2.org/view/nightly/job/nightly_win_deb/2525/
Flakiness ratio:
Log output:
@Crola1702 Given that I believe that this is fixed, I'm going to close this out. If it happens again, please feel free to reopen and I'll take another look.
@Crola1702 Is this the one related to a hanging job in the affected nodes? would you please paste here the internal issue tracking that?
Well... It happened today again in Nightly Windows Release 2489
Error message
launch_testing_examples.check_multiple_nodes_launch_test.CheckMultipleNodesLaunched.test_nodes_successful failed
Stacktrace
FAIL: launch_testing_examples.check_multiple_nodes_launch_test.CheckMultipleNodesLaunched.test_nodes_successful
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "C:\ci\ws\src\ros2\examples\launch_testing\launch_testing_examples\launch_testing_examples\check_multiple_nodes_launch_test.py", line 57, in test_nodes_successful
assert wait_for_nodes_1.wait()
AssertionError: assert False
+ where False = <bound method WaitForNodes.wait of <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x000001F3A9939CD0>>()
+ where <bound method WaitForNodes.wait of <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x000001F3A9939CD0>> = <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x000001F3A9939CD0>.wait
Stderr call:
---------------------------- Captured stderr call -----------------------------
test_node_does_not_exist (launch_testing_examples.check_multiple_nodes_launch_test.CheckMultipleNodesLaunched)
Insert a invalid node name that should not exist. ... ok
test_nodes_successful (launch_testing_examples.check_multiple_nodes_launch_test.CheckMultipleNodesLaunched)
Check if all the nodes were launched correctly. ... FAIL
Last 10 times it happened:
Link | Datetime |
---|---|
https://ci.ros2.org/view/nightly/job/nightly_win_rel/2489 | 2022-11-10 23:00:01 |
https://ci.ros2.org/view/nightly/job/nightly_win_rep/2796 | 2022-11-05 23:00:01 |
https://ci.ros2.org/view/nightly/job/nightly_win_rep/2795 | 2022-11-04 23:00:01 |
https://ci.ros2.org/view/nightly/job/nightly_win_rep/2793 | 2022-11-02 23:00:01 |
https://ci.ros2.org/view/nightly/job/nightly_win_rep/2791 | 2022-11-01 04:00:01 |
https://build.ros2.org/view/Hci/job/Hci__nightly-connext_ubuntu_jammy_amd64/59/ | 2022-10-30 06:15:02 |
https://ci.ros2.org/view/nightly/job/nightly_win_rep/2788 | 2022-10-29 04:00:01 |
https://ci.ros2.org/view/nightly/job/nightly_win_rep/2786 | 2022-10-27 04:00:01 |
https://ci.ros2.org/view/nightly/job/nightly_win_deb/2525 | 2022-10-25 04:00:01 |
https://ci.ros2.org/view/nightly/job/nightly_win_rep/2784 | 2022-10-25 04:00:01 |
FYI: @clalancette
In the case of https://ci.ros2.org/view/nightly/job/nightly_win_deb/2525, this error appeared because there was a hanging node in the agent causing this problem to appear.
So, it seems this errors shows up once/twice every two weeks in the nightly_win_rel / nightly_win_deb jobs. However, it's showing up frequently in the repeated jobs.
This happened today in Windows Debug
Line of code failing: check_multiple_nodes_launch_test.py#L57
Reference build: https://ci.ros2.org/view/nightly/job/nightly_win_deb/2603/
Log output:
FAIL: launch_testing_examples.check_multiple_nodes_launch_test.CheckMultipleNodesLaunched.test_nodes_successful
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "C:\ci\ws\src\ros2\examples\launch_testing\launch_testing_examples\launch_testing_examples\check_multiple_nodes_launch_test.py", line 57, in test_nodes_successful
assert wait_for_nodes_1.wait()
AssertionError: assert False
+ where False = <bound method WaitForNodes.wait of <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x0000010A9E433780>>()
+ where <bound method WaitForNodes.wait of <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x0000010A9E433780>> = <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x0000010A9E433780>.wait
Today we had a different output for Windows debug:
Reference build: https://ci.ros2.org/view/nightly/job/nightly_win_deb/2617/
================================================================================================================================================================================================================================================================
FAIL: launch_testing_examples.check_multiple_nodes_launch_test.CheckMultipleNodesLaunched.test_node_does_not_exist
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "C:\ci\ws\src\ros2\examples\launch_testing\launch_testing_examples\launch_testing_examples\check_multiple_nodes_launch_test.py", line 73, in test_node_does_not_exist
assert wait_for_nodes_1.get_nodes_not_found() == {'invalid_node'}
AssertionError: assert {'demo_node_2...invalid_node'} == {'invalid_node'}
Extra items in the left set:
'demo_node_2'
Use -v to get more diff
================================================================================================================================================================================================================================================================
FAIL: launch_testing_examples.check_multiple_nodes_launch_test.CheckMultipleNodesLaunched.test_nodes_successful
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "C:\ci\ws\src\ros2\examples\launch_testing\launch_testing_examples\launch_testing_examples\check_multiple_nodes_launch_test.py", line 57, in test_nodes_successful
assert wait_for_nodes_1.wait()
AssertionError: assert False
+ where False = <bound method WaitForNodes.wait of <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x0000021A479A51E0>>()
+ where <bound method WaitForNodes.wait of <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x0000021A479A51E0>> = <launch_testing_examples.check_multiple_nodes_launch_test.WaitForNodes object at 0x0000021A479A51E0>.wait
Isn't the extra demo_node_2
related to a problem with the host, the problem where a node remains open after an error in a job?
This was theoretically solved by #355 . I'm going to close this out, but if this reoccurs please feel free to reopen.
Bug report
Required Info:
Steps to reproduce issue
This is a flaky test. It would fail sometimes when running the buildfarm jobs. https://ci.ros2.org/view/nightly/job/nightly_win_deb/2331/testReport/junit/launch_testing_examples.launch_testing_examples/check_multiple_nodes_launch_test/launch_testing_examples_check_multiple_nodes_launch_test/
https://github.com/ros2/examples/blob/master/launch_testing/launch_testing_examples/launch_testing_examples/check_multiple_nodes_launch_test.py
There's a "TO-DO" that may require being addressed: https://github.com/ros2/examples/blob/7d6f14a55dd3d6498fd7a8d5a90cd482ec71970c/launch_testing/launch_testing_examples/launch_testing_examples/check_multiple_nodes_launch_test.py#L83
Expected behavior
Test would pass.
Actual behavior
Test fails.
Additional information
This error failed for the first time in the nightly_win_deb farm: https://ci.ros2.org/view/nightly/job/nightly_win_deb/2272/ and has failed 6times counting that.
This is the first time the error showed in the repeated jobs: https://ci.ros2.org/view/nightly/job/nightly_win_rep/2505 and has failed 46 times since then. Pretty reliable in the repeated jobs.