Closed e10harvey closed 3 years ago
Tests with issue trackers Passed: twip=6
Tests with issue trackers Failed: twif=4
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 9 | 14 | 10 | #8544 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt_cuda-aware-mpi | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 1 | 12 | 12 | #8544 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_dbg | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 2 | 11 | 13 | #8544 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt_cuda-aware-mpi | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 1 | 8 | 15 | #8544 |
This is an automated comment generated by Grover. Each week, Grover collates and reports data from CDash in an automated way to make it easier for developers to stay on top of their issues. Grover saw that there are tests being tracked on CDash that are associated with this open issue. If you have a question, please reach out to Ross. I'm just a cat.
Tests with issue trackers Passed: twip=6
Tests with issue trackers Failed: twif=4
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_complex_static_opt_cuda-aware-mpi | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 4 | 12 | 12 | #8544 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_dbg | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 1 | 10 | 13 | #8544 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 2 | 12 | 10 | #8544 |
vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-rolling_static_opt_cuda-aware-mpi | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Failed | Completed (Failed) | 1 | 10 | 12 | #8544 |
This test, as well as the one mentioned in #8543, tests interoperability with raw CUDA. In particular, they test situations where CUDA is already in use before Kokkos initialize and/or after Kokkos finalize. As such, switching the GPU ID during Kokkos initialize will lead to the observed errors. One should NOT use any mechanism to tell Kokkos to choose a specific GPU. CUDA_VISIBLE_DEVICES probably works. In practice, telling Kokkos to use device id 0 will also work (I am just not sure whether CUDA guarantees that device 0 is the default GPU).
> One should NOT use any mechanism to tell Kokkos to choose a specific GPU. CUDA_VISIBLE_DEVICES probably works. In practice telling Kokkos to use device id 0 will also work (just not sure that CUDA guarantees that that is the default GPU).
@crtrott, that was not the approach/agreement we came to as part of:
Perhaps Kokkos needs to be updated to read in these CTest env vars earlier?
Changing to use CUDA_VISIBLE_DEVICES would require writing an intermediate wrapper in TriBITS for every test that reads in the CTest-set env vars and sets CUDA_VISIBLE_DEVICES accordingly. The design we came up with was for CTest to not have to know about GPUs in particular, and to not have to modify TriBITS to coordinate the communication between CTest and Kokkos. But, again, we can extend TriBITS to do the needed translations (and perhaps we should), but that just adds more control and complexity to TriBITS and makes it a thicker wrapper around CMake/CTest.
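For illustration only, here is a rough sketch of what such a translation wrapper might do. It assumes CTest's resource-allocation environment variables (`CTEST_RESOURCE_GROUP_COUNT` and `CTEST_RESOURCE_GROUP_<n>_<TYPE>` with values like `id:0,slots:1`), and it assumes the resource type is named `gpus` in the resource spec file. The function names are hypothetical and this is not code from TriBITS or Kokkos.

```python
def gpu_ids_from_ctest_env(env):
    """Collect the GPU ids CTest allocated to this test, in group order.

    Assumes the resource type is called "gpus", so CTest exports
    CTEST_RESOURCE_GROUP_<n>_GPUS (hypothetical naming for this sketch).
    """
    count = int(env.get("CTEST_RESOURCE_GROUP_COUNT", "0"))
    ids = []
    for group in range(count):
        # Example value: "id:0,slots:1;id:3,slots:1"
        spec = env.get(f"CTEST_RESOURCE_GROUP_{group}_GPUS", "")
        for entry in filter(None, spec.split(";")):
            fields = dict(item.split(":", 1) for item in entry.split(","))
            if fields["id"] not in ids:  # skip ids already seen
                ids.append(fields["id"])
    return ids


def env_with_visible_devices(env):
    """Return a copy of env with CUDA_VISIBLE_DEVICES set from the CTest allocation.

    If CTest allocated no GPUs, the environment is copied unchanged.
    """
    new_env = dict(env)
    ids = gpu_ids_from_ctest_env(env)
    if ids:
        new_env["CUDA_VISIBLE_DEVICES"] = ",".join(ids)
    return new_env
```

A wrapper along these lines would then launch the real test executable with the adjusted environment, for example via `subprocess.run(cmd, env=env_with_visible_devices(os.environ))`. With only the allocated GPU visible, the process sees it as device 0, so raw CUDA calls made before Kokkos initialize and Kokkos itself would end up on the same device.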
By "one should NOT use that mechanism" I mean specifically for those two tests. As I said, I would recommend either disabling these two tests or marking them as not runnable in parallel with other tests (is that a thing you can do?).
> As I said I would recommend either disabling these two tests, or mark them as not runnable in parallel with other tests (is that a thing you can do?).
Yes and yes. For the former:
and for the latter:
As shown here, this test finished in less than 3s so I think we just need to add:
ATDM_SET_ENABLE(<fullTestName>_SET_RUN_SERIAL ON)
for each of these tests to:
right about here:
Need feedback from CDash before closing
Tests with issue trackers Passed: twip=4
Site | Build Name | Test Name | Status | Details | Consecutive Pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
ride | Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-debug | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Passed | Completed | 2 | 8 | 16 | #8544 |
ride | Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-rdc-release-debug | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Passed | Completed | 3 | 7 | 16 | #8544 |
ride | Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-release-debug | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Passed | Completed | 4 | 8 | 16 | #8544 |
ride | Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-release | KokkosCore_UnitTest_CudaInterOpStreams_MPI_1 | Passed | Completed | 3 | 10 | 13 | #8544 |
Tests with issue trackers Passed: twip=10
CC: @trilinos/kokkos, @crtrott (Trilinos Data Services Product Lead), @bartlettroscoe
Next Action Status