Closed e10harvey closed 3 years ago
The fact that this test fails is a sign that this MPI implementation is not behaving the way that MPI is supposed to behave. We should just disable this test for the 'ats2' builds.
If MPI is, indeed, a problem on this platform, this test has done a good job in detecting it. Thus, I am reluctant to disable the test. We need someone to work with the platform team on getting MPI working correctly.
@e10harvey, @kddevin,
This test is not failing because it is printing that the unit test itself failed. It is failing because the unit test executable is not printing the expected unit test failure output. You can see that here:
================================================================================
TEST_0
Show default output on proc 0
Running: "/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/Trilinos/cmake/std/atdm/ats2/trilinos_jsrun" "-p" "4" "--rs_per_socket" "4" "/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe" "--output-show-proc-rank" "--teuchos-suppress-startup-banner"
--------------------------------------------------------------------------------
WARNING, you have not set TPETRA_ASSUME_CUDA_AWARE_MPI=0 or 1, defaulting to TPETRA_ASSUME_CUDA_AWARE_MPI=0
BEFORE: jsrun '-p' '4' '--rs_per_socket' '4' '/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe' '--output-show-proc-rank' '--teuchos-suppress-startup-banner'
AFTER: export TPETRA_ASSUME_CUDA_AWARE_MPI=0; jsrun '-p' '4' '--rs_per_socket' '4' '/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe' '--output-show-proc-rank' '--teuchos-suppress-startup-banner'
jsrun return value: 1
--------------------------------------------------------------------------------
TEST_0: Return code = 1
TEST_0: Pass criteria = Match REGEX {NOTE: Global reduction shows failures on other processes} [FAILED]
TEST_0: Pass criteria = Match REGEX {NOTE: Unit test failed on processes = {1, 3}} [FAILED]
TEST_0: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootFails_UnitTest} [FAILED]
TEST_0: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest} [FAILED]
TEST_0: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsIntExcept_UnitTest} [FAILED]
TEST_0: Result = FAILED
================================================================================
Note the TEST_0 case pass criteria:
TEST_0: Return code = 1
TEST_0: Pass criteria = Match REGEX {NOTE: Global reduction shows failures on other processes} [FAILED]
TEST_0: Pass criteria = Match REGEX {NOTE: Unit test failed on processes = {1, 3}} [FAILED]
TEST_0: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootFails_UnitTest} [FAILED]
TEST_0: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest} [FAILED]
TEST_0: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsIntExcept_UnitTest} [FAILED]
TEST_0: Result = FAILED
That means that the TEST_0 case failed because the unit test executable did not print FAILED lines matching the expected regexes.
You can see that the next test case, TEST_1, is actually passing because it prints the inner unit test FAILED errors you showed above:
================================================================================
TEST_1
Show output on proc 1
Running: "/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/Trilinos/cmake/std/atdm/ats2/trilinos_jsrun" "-p" "4" "--rs_per_socket" "4" "/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe" "--output-show-proc-rank" "--output-to-root-rank-only=1" "--teuchos-suppress-startup-banner"
--------------------------------------------------------------------------------
WARNING, you have not set TPETRA_ASSUME_CUDA_AWARE_MPI=0 or 1, defaulting to TPETRA_ASSUME_CUDA_AWARE_MPI=0
BEFORE: jsrun '-p' '4' '--rs_per_socket' '4' '/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe' '--output-show-proc-rank' '--output-to-root-rank-only=1' '--teuchos-suppress-startup-banner'
AFTER: export TPETRA_ASSUME_CUDA_AWARE_MPI=0; jsrun '-p' '4' '--rs_per_socket' '4' '/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe' '--output-show-proc-rank' '--output-to-root-rank-only=1' '--teuchos-suppress-startup-banner'
p=1 |
p=1 | ***
p=1 | *** Unit test suite ...
p=1 | ***
p=1 |
p=1 |
p=1 | Sorting tests by group name then by the order they were added ... (time = 8.84e-07)
p=1 |
p=1 | Running unit tests ...
p=1 |
p=1 | 0. UnitTestHarness_nonRootFails_UnitTest ...
p=1 | Pass on even procs but fail on other procs!
p=1 | procRank%2 = 1 == 0 : FAILED ==> /vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/teuchos/comm/test/UnitTesting/UnitTestHarness_Parallel_UnitTests.cpp:57
p=1 | NOTE: Unit test failed on processes = {1, 3}
p=1 | (rerun with --output-to-root-rank-only=<procID> to see output
p=1 | from individual processes where the unit test is failing!)
p=1 | [FAILED] (0.000132 sec) UnitTestHarness_nonRootFails_UnitTest
p=1 | Location: /vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/teuchos/comm/test/UnitTesting/UnitTestHarness_Parallel_UnitTests.cpp:53
p=1 |
p=1 | 1. UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest ...
p=1 | Pass on even procs but throws Teuchos exception on other processes!
p=1 |
p=1 | p=1: *** Caught standard std::exception of type 'std::out_of_range' :
p=1 |
p=1 | /vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/teuchos/comm/test/UnitTesting/UnitTestHarness_Parallel_UnitTests.cpp:65:
p=1 |
p=1 | Throw number = 1
p=1 |
p=1 | Throw test that evaluated to true: (procRank%2) != (0)
p=1 |
p=1 | Error, (procRank%2 = 1) != (0 = 0)!
p=1 | NOTE: Unit test failed on processes = {1, 3}
p=1 | (rerun with --output-to-root-rank-only=<procID> to see output
p=1 | from individual processes where the unit test is failing!)
p=1 | [FAILED] (0.000175 sec) UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest
p=1 | Location: /vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/teuchos/comm/test/UnitTesting/UnitTestHarness_Parallel_UnitTests.cpp:61
p=1 |
p=1 | 2. UnitTestHarness_nonRootThrowsIntExcept_UnitTest ...
p=1 | Pass on even procs but throws int exception on other processes!
p=1 |
p=1 | p=1: *** Caught an integer exception with value = 1
p=1 | NOTE: Unit test failed on processes = {1, 3}
p=1 | (rerun with --output-to-root-rank-only=<procID> to see output
p=1 | from individual processes where the unit test is failing!)
p=1 | [FAILED] (2.64e-05 sec) UnitTestHarness_nonRootThrowsIntExcept_UnitTest
p=1 | Location: /vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/teuchos/comm/test/UnitTesting/UnitTestHarness_Parallel_UnitTests.cpp:71
p=1 |
p=1 |
p=1 | The following tests FAILED:
p=1 | 0. UnitTestHarness_nonRootFails_UnitTest ...
p=1 | 1. UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest ...
p=1 | 2. UnitTestHarness_nonRootThrowsIntExcept_UnitTest ...
p=1 |
p=1 | Total Time: 0.000448 sec
p=1 |
p=1 | Summary: total = 3, run = 3, passed = 0, failed = 3
p=1 |
p=1 | End Result: TEST FAILED
jsrun return value: 1
--------------------------------------------------------------------------------
TEST_1: Return code = 1
TEST_1: Pass criteria = Match REGEX {NOTE: Unit test failed on processes = {1, 3}} [PASSED]
TEST_1: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootFails_UnitTest} [PASSED]
TEST_1: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest} [PASSED]
TEST_1: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsIntExcept_UnitTest} [PASSED]
TEST_1: Result = PASSED
================================================================================
Note the TEST_1 pass/fail criteria:
TEST_1: Return code = 1
TEST_1: Pass criteria = Match REGEX {NOTE: Unit test failed on processes = {1, 3}} [PASSED]
TEST_1: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootFails_UnitTest} [PASSED]
TEST_1: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest} [PASSED]
TEST_1: Pass criteria = Match REGEX {.*FAILED.* UnitTestHarness_nonRootThrowsIntExcept_UnitTest} [PASSED]
TEST_1: Result = PASSED
See, the TEST_1 case passed because the inner unit tests printed FAILED!
There are inner and outer tests running here. We run unit tests that we expect to fail in the inner mpirun commands, and our outer test driver checks to make sure they fail as expected. See the definition of these tests and TRIBITS_ADD_ADVANCED_TEST().
Does that make sense? (I will update the "Description" field for this)
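To make the inner/outer relationship concrete, here is a minimal Python sketch of what the outer driver does with the inner output. It is not the TriBITS implementation: `check_pass_criteria` is a hypothetical helper, the sample output is trimmed from the TEST_1 log above, and the regexes are the TEST_1 pass criteria adapted to Python syntax (braces escaped).

```python
import re

# Trimmed copy of the inner unit test output from the TEST_1 log above.
inner_output = """\
p=1 | NOTE: Unit test failed on processes = {1, 3}
p=1 | [FAILED] (0.000132 sec) UnitTestHarness_nonRootFails_UnitTest
p=1 | [FAILED] (0.000175 sec) UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest
p=1 | [FAILED] (2.64e-05 sec) UnitTestHarness_nonRootThrowsIntExcept_UnitTest
"""

# TEST_1 pass criteria, with braces escaped for Python's re module.
pass_regexes = [
    r"NOTE: Unit test failed on processes = \{1, 3\}",
    r".*FAILED.* UnitTestHarness_nonRootFails_UnitTest",
    r".*FAILED.* UnitTestHarness_nonRootThrowsTeuchosExcept_UnitTest",
    r".*FAILED.* UnitTestHarness_nonRootThrowsIntExcept_UnitTest",
]


def check_pass_criteria(output, regexes):
    """Hypothetical outer check: pass only if every regex matches the output."""
    results = [(rx, re.search(rx, output) is not None) for rx in regexes]
    return all(ok for _, ok in results), results


# The inner tests printed FAILED, so the outer test case passes.
outer_passed, _ = check_pass_criteria(inner_output, pass_regexes)

# If the inner output never reaches the driver (as in TEST_0), it fails.
outer_passed_no_output, _ = check_pass_criteria("jsrun return value: 1\n", pass_regexes)
```

The second call mirrors the TEST_0 case: the inner run returned 1, but none of the expected FAILED text showed up, so every criterion is reported as not matched.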
My guess is that this Spectrum MPI implementation on ATS-2 sometimes does not print STDOUT from other MPI ranks on the root proc (0) when those ranks print and then fail. Either this is a bug in Spectrum MPI, or the MPI standard does not guarantee that STDOUT from ranks that fail will get printed on the root proc (0). I suspect it is the former, since we have seen cases where MPI processes fail to run and don't print any output. Therefore, this might just be related to the many other ATS-2 specific errors we have been seeing, shown here.
So I am going to mark this as an ATDM Env issue, as this is not really a bug in Teuchos or in this test. Given all of the other problems with ATS-2 (some of which are shown here), I doubt anyone will put much effort into fixing an issue like this. (Just look how long it took them to address #6861, which really was impacting a lot of real users, and they never really 100% solved that problem.)
I added the label Stalled and moved this to the phase Stuck since there is really nothing reasonable that can be done to make sure this test never fails like this (because it is an env issue that we can't control).
The only suggestion I would have is to filter out these failures in the big --cdash-nonpassed-tests-filters=<filter-fields> argument here:
But I am not sure it is worth making that query that much longer just for this one test (and that suggests doing the same for a bunch of other similar failures which is not very scalable).
What would be best would be to extend the cdash_analyze_and_report.py tool to add an expected_fail_regex field and an allow_to_fail field to the tests-with-issue-trackers CSV file and then allow this test to fail when the output matches this specific regex:
(Pass criteria = Match REGEX {NOTE: Unit test failed on processes = [{]1[}]} .*FAILED|Pass criteria = Match REGEX {NOTE: Unit test failed on processes = [{]1, 3[}]} .*FAILED|)
which is used in the query above but that will take some work.
See allow_to_fail and expected_fail_regex in:
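As a rough illustration of that proposal, here is a sketch in Python. Everything in it is hypothetical: cdash_analyze_and_report.py does not have these fields today, the CSV column layout is made up for the example, and `is_tolerated_failure` is not an existing function. The regex is a shortened variant of the one above.

```python
import csv
import io
import re

# Hypothetical CSV layout: the last two columns are the proposed fields.
csv_text = (
    "site,buildname,testname,issue_tracker,allow_to_fail,expected_fail_regex\n"
    "vortex,Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt,"
    "TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4,#8759,true,"
    '"Unit test failed on processes = [{]1, 3[}][}] .*FAILED"\n'
)


def is_tolerated_failure(row, test_output):
    """Return True if this failure is allowed and matches the expected regex."""
    if row["allow_to_fail"].strip().lower() != "true":
        return False
    pattern = row["expected_fail_regex"]
    return bool(pattern) and re.search(pattern, test_output) is not None


rows = list(csv.DictReader(io.StringIO(csv_text)))

# Output line from the TEST_0 case in this issue: the known env failure.
known_failure = (
    "TEST_0: Pass criteria = Match REGEX "
    "{NOTE: Unit test failed on processes = {1, 3}} [FAILED]"
)
# A different failure for the same test must still be reported.
other_failure = "srun: error: chama170: task 0: Exited with exit code 1"

tolerated = is_tolerated_failure(rows[0], known_failure)
not_tolerated = is_tolerated_failure(rows[0], other_failure)
```

The point of pairing the two fields is that the test is only excused for the one known failure mode; any other failure of the same test would still show up in the report.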
@e10harvey, FYI, I updated the atdmTrilinosTestsWithIssueTrackers.csv file for this, adding the extra matching tests.
Tests with issue trackers Passed: twip=11
This is an automated comment generated by Grover. Each week, Grover collates and reports data from CDash in an automated way to make it easier for developers to stay on top of their issues. Grover saw that there are tests being tracked on CDash that are associated with this open issue. If you have a question, please reach out to Ross. I'm just a cat.
Tests with issue trackers Passed: twip=13
Tests with issue trackers Passed: twip=13
Tests with issue trackers Passed: twip=13
Looks like this test has been passing a lot lately. Should we go ahead and close this?
@jwillenbring, no, it is still randomly failing. It has failed 3 times since Feb 14, as shown in this query, and you can see these 3 failures above in the last 30 days (pay attention to the "Non-pass Last 30 Days" column).
I think we may need to get together and discuss how to address randomly failing tests with these Grover updates, because the fact that you thought that this is not failing recently is a problem. The solution likely requires adding a fail_regularity field with associated logic (see TrilinosATDMStatus/TODO.txt), and upgrading Grover to give specific recommendations on whether an issue tracker can be closed, as mentioned in the "Tasks" list of #3887. And we likely need to add something like https://github.com/TriBITSPub/TriBITS/issues/349 but for tests with issue trackers to make it more explicit that there have been test failures in the last X days. In fact, we should discuss #3887 in detail because the Grover currently implemented was the minimum viable product and does not really solve the problem well enough (as evidenced by the above comment).
As for this GitHub Issue, we should likely just disable this test for these builds because the chance they will fix MPI on these platforms to address this is about zero.
@e10harvey, I think one problem here is that the title for this issue and the "Description" field did not emphasize that these tests are randomly failing. That is the key thing that @jwillenbring did not realize when he posted above. I fixed that text just now. Also, you might consider adding a GitHub Issue label called something like test: randomly failing to make that extra clear.
I think the tool create_trilinos_github_test_failure_issue.py can be extended (by extending tribits/ci_support/CreateIssueTrackerFromCDashQuery.py) to automatically determine if tests are randomly failing or not and make that clear in the description field (and in a template for the summary field).
Related to my epics SEPW-213 and SEPW-215.
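For what it's worth, the "randomly failing" determination could start out very simple. The sketch below is only an assumption about how such logic might look; `classify_test_history` and its category names are hypothetical and not part of CreateIssueTrackerFromCDashQuery.py. The inputs correspond to the "Non-pass Last 30 Days" and "Pass Last 30 Days" columns in the Grover tables.

```python
def classify_test_history(nonpass_last_30_days, pass_last_30_days):
    """Hypothetical classifier based on the two 30-day history counts."""
    if nonpass_last_30_days + pass_last_30_days == 0:
        return "no data"
    if nonpass_last_30_days == 0:
        return "passing"
    if pass_last_30_days == 0:
        return "consistently failing"
    # A mix of passes and non-passes in the window.
    return "randomly failing"


# Counts like those Grover reports for this issue (2 non-pass, 26 pass).
status = classify_test_history(2, 26)
```

A label such as this in the summary would make it obvious that a run of recent passes does not mean the issue can be closed.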
The recent failures on skybridge and chama all have the following form (missing shared libraries), as if the configuration or run environment is not complete. True?
/gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe)
/gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe)
/gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe)
/gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /gpfs1/jenkins/chama-slave/workspace/Trilinos-atdm-tlcc2-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/teuchos/comm/test/UnitTesting/TeuchosComm_UnitTestHarness_Parallel_UnitTests.exe)
srun: error: chama170: task 0: Exited with exit code 1
The recent failures are on skybridge and chama
@kddevin, what errors? Can you be specific and provide a CDash query URL?
@bartlettroscoe I followed the "this query" link that you posted two hours ago.
I am not able to get to the CDash sites to look at this again. Something appears to be wrong.
The recent failures on skybridge and chama all have the following form (missing shared libraries), as if the configuration or run environment is not complete. True?
@kddevin, you are correct. That was due to build problems for that 'tlcc2' configuration that have nothing to do with this issue. That CDash query should not have picked up those 'tlcc2' failures. I will have to get with Kitware and see why those were listed in that query because only tests that match the regex:
(Pass criteria = Match REGEX {NOTE: Unit test failed on processes = [{]1[}]} .*FAILED|Pass criteria = Match REGEX {NOTE: Unit test failed on processes = [{]1, 3[}]} .*FAILED|)
should be displayed.
But we still see 3 of the failures reported in this issue as shown in this query.
I figured out the problem with the regexes in the query. I fixed the queries at the top and above. This query now shows the correct 3 random failures since Feb 14 and naturally does not show any of the 'tlcc2' failures (which are unrelated). Very tricky is CDash.
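This thread does not record what the actual fix to the query regexes was. As an illustration only, here is one pitfall that the regex quoted above would hit in Python's `re` module: a trailing `|` adds an empty alternative, which matches any text. Whether CDash's regex engine behaves the same way is an assumption, and dropping the trailing `|` is not claimed to be the change that was made.

```python
import re

# Regex as quoted in this thread, including the trailing "|" (empty alternative).
with_empty_alt = (
    r"(Pass criteria = Match REGEX {NOTE: Unit test failed on processes = [{]1[}]} .*FAILED"
    r"|Pass criteria = Match REGEX {NOTE: Unit test failed on processes = [{]1, 3[}]} .*FAILED|)"
)
# Same regex with the empty alternative dropped (an assumption, see above).
without_empty_alt = with_empty_alt[:-2] + ")"

# Output line from the failures tracked in this issue.
expected_failure_line = (
    "TEST_0: Pass criteria = Match REGEX "
    "{NOTE: Unit test failed on processes = {1, 3}} [FAILED]"
)
# Output line from the unrelated 'tlcc2' failures.
unrelated_line = "srun: error: chama170: task 0: Exited with exit code 1"


def matches(pattern, text):
    """True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


empty_alt_matches_unrelated = matches(with_empty_alt, unrelated_line)
fixed_matches_unrelated = matches(without_empty_alt, unrelated_line)
fixed_matches_expected = matches(without_empty_alt, expected_failure_line)
```

In Python, the first check succeeds because the empty alternative matches at position 0 of any string, while the version without it matches only the failure output tracked here.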
Tests with issue trackers Passed: twip=13
Tests with issue trackers Failed: twif=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Failed | Completed (Failed) | 2 | 2 | 26 | #8759 |
Tests with issue trackers Passed: twip=13
Tests with issue trackers Failed: twif=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Failed | Completed (Failed) | 1 | 2 | 27 | #8759 |
Tests with issue trackers Passed: twip=16
Tests with issue trackers Passed: twip=16
Tests with issue trackers Passed: twip=14
Tests with issue trackers Passed: twip=16
Tests with issue trackers Passed: twip=14
Tests with issue trackers Passed: twip=16
Tests with issue trackers Passed: twip=14
Tests with issue trackers Missing: twim=2
Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Missing / Failed | Completed (Failed) | 0 | 3 | 15 | #8759 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Missing / Failed | Completed (Failed) | 0 | 1 | 17 | #8759 |
@trilinos/teuchos @jwillenbring It appears that many of these are consistently passing. Can you take a look at it? Thanks!
@trilinos/framework, can someone please just disable this test on these 'ats2' builds? Spectrum MPI does not consistently behave as MPI should when a failure occurs. There is no long-term future for this platform so no reason to expect they are going to fix this type of behavior. (We are lucky if positive use cases run correctly on this platform.)
@ZUUL42 Can you take a look at this please?
Tests with issue trackers Passed: twip=16
@ZUUL42 Any progress? Thanks, Curt.
@ccober6 I made the changes and ran a test on Jun 02, but something was off and I got 2 build errors out of Trilinos-atdm-ats2-gnu-7.3.1-spmpi-rolling_serial_static_opt-exp. So, I'm going to check things and run it again to see what I get.
Tests with issue trackers Passed: twip=12
Tests with issue trackers Missing: twim=4
Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_dbg | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Missing / Failed | Completed (Failed) | 0 | 2 | 15 | #8759 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Missing / Failed | Completed (Failed) | 0 | 3 | 15 | #8759 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_dbg | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Missing / Failed | Completed (Failed) | 0 | 4 | 13 | #8759 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Missing / Failed | Completed (Failed) | 0 | 3 | 15 | #8759 |
Tests with issue trackers Passed: twip=11
Tests with issue trackers Failed: twif=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-gnu-7.3.1-spmpi-rolling_serial_static_dbg | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Failed | Completed (Failed) | 1 | 1 | 23 | #8759 |
Tests with issue trackers Passed: twip=12
Tests with issue trackers Passed: twip=12
Tests with issue trackers Passed: twip=14
Tests with issue trackers Missing: twim=2
Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Missing / Failed | Completed (Failed) | 0 | 5 | 24 | #8759 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Missing / Failed | Completed (Failed) | 0 | 4 | 25 | #8759 |
@ZUUL42 @jwillenbring Have you had a chance to look at these?
@ccober6 I got off onto some other things and am now OOO until July 27th. I'll take a look at it again then.
Tests with issue trackers Passed: twip=16
Tests with issue trackers Passed: twip=14
Tests with issue trackers Missing: twim=2
Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Missing / Failed | Completed (Failed) | 0 | 6 | 24 | #8759 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Missing / Failed | Completed (Failed) | 0 | 4 | 26 | #8759 |
Tests with issue trackers Passed: twip=15
Tests with issue trackers Failed: twif=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Failed | Completed (Failed) | 1 | 2 | 28 | #8759 |
Tests with issue trackers Passed: twip=15
Tests with issue trackers Failed: twif=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Failed | Completed (Failed) | 1 | 2 | 28 | #8759 |
Tests with issue trackers Passed: twip=15
Tests with issue trackers Failed: twif=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Failed | Completed (Failed) | 1 | 2 | 28 | #8759 |
Tests with issue trackers Passed: twip=14
Tests with issue trackers Missing: twim=2
Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_2 | Missing / Failed | Completed (Failed) | 0 | 7 | 22 | #8759 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | TeuchosComm_UnitTestHarness_Parallel_UnitTests_MPI_4 | Missing / Failed | Completed (Failed) | 0 | 5 | 24 | #8759 |
This was migrated to the Trilinos Help Desk, so it is being closed now.
CC: @trilinos/teuchos, @jwillenbring (Trilinos Frameworks Product Lead), @bartlettroscoe
Next Action Status