Open bartlettroscoe opened 4 years ago
The problem is that the two ctest -S drivers don't know about each other: the second does not know what happened in the first.
To fix this, we could do as follows:

- If `CTEST_DO_CONFIGURE=ON`, then when `ctest_start()` is run, write a file `<build-dir>/ConfigureAttempted.txt` even before an update is attempted (and remove the file `<build-dir>/ConfiguredPassed.txt` if it exists).
- If `ctest_configure()` completes and is successful, write the file `<build-dir>/ConfiguredPassed.txt`.
- If `<build-dir>/ConfigureAttempted.txt` exists but `<build-dir>/ConfiguredPassed.txt` does not exist, skip all future actions and exit.

I think that logic will ensure that if a configure was attempted and failed in a prior ctest -S invocation, then the follow-up ctest -S script will skip everything and just exit.
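The marker-file logic above can be modeled outside of CMake to check that it behaves as intended. What follows is only a minimal sketch in Python, not the TriBITS implementation: the function `run_driver()` and its parameters `do_configure` and `configure_ok` are hypothetical stand-ins for one ctest -S invocation, and only the two marker file names come from the proposal.

```python
import tempfile
from pathlib import Path

# Marker file names taken from the proposal above.
ATTEMPTED = "ConfigureAttempted.txt"
PASSED = "ConfiguredPassed.txt"


def run_driver(build_dir: Path, do_configure: bool, configure_ok: bool = True) -> str:
    """Model one ctest -S invocation (hypothetical helper, not a TriBITS API).

    Returns "skipped" if the driver should exit without doing further work,
    or "ran" if it may go on to build and/or test.
    """
    attempted = build_dir / ATTEMPTED
    passed = build_dir / PASSED

    if do_configure:
        # Record the attempt first and clear any stale pass marker
        # left behind by an earlier successful configure.
        attempted.write_text("attempted\n")
        passed.unlink(missing_ok=True)
        # Only a successful configure writes the pass marker.
        if configure_ok:
            passed.write_text("passed\n")

    # An attempt without a pass marker means the configure failed.
    if attempted.exists() and not passed.exists():
        return "skipped"
    return "ran"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = Path(tmp)
        # First invocation: configure + build, but the configure fails.
        first = run_driver(build_dir, do_configure=True, configure_ok=False)
        # Second invocation: tests only; it sees the markers and skips.
        second = run_driver(build_dir, do_configure=False)
        print(first, second)  # prints "skipped skipped"
```

The ordering matters in this model: removing `ConfiguredPassed.txt` at the start of the configuring invocation is what keeps a pass marker from a previous day from masking today's failure.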
Implement logic in TRIBITS_CTEST_DRIVER() that writes
This should have been in place for a while now, as shown in commit https://github.com/bartlettroscoe/TriBITS/commit/4aa837b9d6feed947d533b5314ca77da6bcab7e2, which was merged to TriBITS 'master' in commit https://github.com/TriBITSPub/TriBITS/commit/43f2cc1db722fae65a5d044807d7cdd6f736356a and then brought into Trilinos 'develop' through the TriBITS snapshot commit trilinos/Trilinos@3143ca8 in Trilinos PR trilinos/Trilinos#7325.
But I am not sure this is working as it should since it looks like we still have some cases where builds and tests were attempted with configure failures as shown in:
which shows:
Specialized
Site | Build Name | Update | Update Time | Conf Err | Conf Warn | Conf Time | Build Err | Build Warn | Build Time | Test Not Run | Test Fail | Test Pass | Test Time | Test Proc Time | Start Test Time | Labels |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
mutrino | Trilinos-atdm-ats1-hsw_intel-19.0.4_mpich-7.7.6_openmp_static_dbg | 927403 | 2m 18s | 2 | 0 | 11m 5s | 0 | 0 | 49m 46s | 0 | 0 | 0 | 0s | 0s | Jul 20, 2020 - 00:52 MDT | (31 labels) |
mutrino | Trilinos-atdm-ats1-knl_intel-18.0.5_mpich-7.7.6_openmp_static_dbg | 927403 | 2m 6s | 2 | 8 | 1h 55m 14s | 0 | 0 | 40s | 0 | 0 | 0 | 0s | 0s | Jul 20, 2020 - 00:44 MDT | (31 labels) |
mutrino | Trilinos-atdm-ats1-knl_intel-19.0.4_mpich-7.7.6_openmp_static_dbg | 927403 | 2m 6s | 2 | 8 | 2h 6m 13s | 0 | 0 | 53m 35s | 0 | 0 | 0 | 0s | 0s | Jul 20, 2020 - 00:32 MDT | (31 labels) |
mutrino | Trilinos-atdm-ats1-hsw_intel-18.0.5_mpich-7.7.6_openmp_static_opt | 927403 | 2m 18s | 2 | 0 | 53m 30s | 0 | 50 | 13m 42s | 0 | 11 | 2275 | 4h 10m 9s | 9h 12m 9s | Jul 20, 2020 - 00:22 MDT | (31 labels) |
mutrino | Trilinos-atdm-ats1-hsw_intel-19.0.4_mpich-7.7.6_openmp_static_opt | 927403 | 2m 48s | 2 | 8 | 1h 14m 22s | 0 | 50 | 22m 53s | 0 | 11 | 2096 | 3h 34m 6s | 7h 51m 25s | Jul 20, 2020 - 00:12 MDT | (31 labels) |
mutrino | Trilinos-atdm-ats1-knl_intel-18.0.5_mpich-7.7.6_openmp_static_opt | 927403 | 2m 6s | 2 | 8 | 3h 4m 11s | 0 | 50 | 28m 22s | 0 | 17 | 2264 | 5h 58m 26s | 12h 14m 51s | Jul 20, 2020 - 00:02 MDT | (31 labels) |
mutrino | Trilinos-atdm-ats1-knl_intel-19.0.4_mpich-7.7.6_openmp_static_opt | 927403 | 2m 36s | 2 | 8 | 3h 43m 13s | 0 | 50 | 36m 56s | 0 | 16 | 2265 | 5h 58m 34s | 12h 14m 12s | Jul 19, 2020 - 23:52 MDT | (31 labels) |
I am going to have to keep an eye on this or even run a case manually that triggers this.
A problem that we are having with the ATDM Trilinos builds is that when the configure fails in the first ctest -S invocation (which does the configure and build), the second ctest -S invocation (which runs the tests) does not know that the configure failed (and that the build never happened) and still runs the tests. The problem is that the tests may still be sitting there from a prior day's build that actually built them.
We can see the problem in the ATDM Trilinos builds as shown here, showing:
and here showing:
This is bad and very confusing. If the configure does not pass, then no build or test results should be attempted or posted.