Open ndellingwood opened 9 months ago
@ndellingwood Blake compilers won't build squat. Both 2023.1 and 2023.2 are missing ocloc and LevelZero. @fryeguy52 is fixing the compilers. Will try to reproduce once he's done.
Thanks for the info @csiefer2 , sorry for any added noise with this issue
@ndellingwood Feel free to add me to SYCL issues in Trilinos.
@masterleinad sure thing. I'm putting in a printf fix shortly (just in case you're standing up a build and run into it). Regarding this issue, I should also point out a mistaken assumption in my configuration: I assumed Tpetra would enable SYCL based on Kokkos_ENABLE_SYCL=ON, but looking at the configure output, I needed to enable SYCL for Tpetra explicitly. Rebuilding for a retest.
With local changes in PR #12471 and setting Tpetra_INST_SYCL=ON, this is the set of test failures:
The following tests FAILED:
19 - TpetraCore_BlockCrsMatrix (Failed)
82 - TpetraCore_ImportExport2_UnitTests_Send (Failed)
83 - TpetraCore_ImportExport2_UnitTests_ISend (Failed)
84 - TpetraCore_ImportExport2_UnitTests_Alltoall (Failed)
140 - TpetraCore_getEntryOnHost (Failed)
Errors while running CTest
All of these tests are passing for me on the Intel testbeds.
@masterleinad which version of intel/oneapi and which architecture did you test?
oneapi/eng-compiler/2023.10.15.002 with Kokkos_ENABLE_SERIAL=ON, Kokkos_ENABLE_SYCL=ON and Kokkos_ARCH_INTEL_PVC=ON. That compiler is tagged as 2024.0.0.
@masterleinad did you add Tpetra_INST_SYCL=ON explicitly? If not, can you look over the configure output to confirm that SYCL was enabled for Tpetra?
For reference, I initially had not set that and had this warning in the configure output:
-- NOTE: Kokkos::SYCL is ON (the CMake option Kokkos_ENABLE_SYCL is ON), but the corresponding Tpetra Node type is disabled. If you want to enable instantiation and use of Kokkos::SYCL in Tpetra, please also set the CMake option Tpetra_INST_SYCL:BOOL=ON. If you use the Kokkos::SYCL version of Tpetra without doing this, you will get link errors!
-- Determine whether Tpetra will assume that MPI is GPU aware:
-- - Tpetra_INST_CUDA, Tpetra_INST_HIP and Tpetra_INST_SYCL atre OFF, so Tpetra will assume that MPI is not GPU aware.
-- Tpetra execution space availability (ON means available):
-- - Serial: ON
-- - Threads: OFF
-- - OpenMP: OFF
-- - Cuda: OFF
-- - HIP: OFF
-- - SYCL: OFF
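For anyone checking their own configure log, here is a minimal Python sketch that reads the "Tpetra execution space availability" block quoted above and reports which spaces are ON. The helper name `tpetra_exec_spaces` and the sample log are illustrative only and not part of Trilinos; the parsing assumes the log keeps the `-- - Name: ON/OFF` line shape shown in this thread.

```python
import re

# Hypothetical helper (not part of Trilinos): parse the execution-space block
# from CMake configure output in the format quoted in this thread.
_HEADER = "Tpetra execution space availability"
_ENTRY = re.compile(r"^--\s+-\s+(\w+):\s+(ON|OFF)\s*$")


def tpetra_exec_spaces(configure_output):
    """Return {space name: enabled} for the lines following the header."""
    spaces = {}
    in_block = False
    for line in configure_output.splitlines():
        if _HEADER in line:
            in_block = True
            continue
        if in_block:
            match = _ENTRY.match(line.strip())
            if match is None:
                break  # the first non-matching line ends the block
            spaces[match.group(1)] = match.group(2) == "ON"
    return spaces


# Sample input copied from the configure output above (assumed line format).
sample_log = """\
-- Tpetra execution space availability (ON means available):
-- - Serial: ON
-- - Threads: OFF
-- - OpenMP: OFF
-- - Cuda: OFF
-- - HIP: OFF
-- - SYCL: OFF
-- Tpetra: Tpetra_INST_INT_LONG_LONG is enabled by default.
"""

spaces = tpetra_exec_spaces(sample_log)
if not spaces.get("SYCL", False):
    print("SYCL is OFF for Tpetra; consider setting Tpetra_INST_SYCL=ON")
```

With the sample above, SYCL is reported as OFF, which matches the situation where Kokkos_ENABLE_SYCL=ON was set but Tpetra_INST_SYCL was not.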
Yes, it was set and I am seeing
[...]
-- Tpetra: Using internal Kokkos
-- Tpetra: Enabling deprecated code
-- Determine whether Tpetra will assume that MPI is GPU aware:
-- - TPL_ENABLE_MPI is OFF, so we assume that (nonexistent) MPI is not GPU aware.
-- Tpetra execution space availability (ON means available):
-- - Serial: ON
-- - Threads: OFF
-- - OpenMP: OFF
-- - Cuda: OFF
-- - HIP: OFF
-- - SYCL: ON
-- Tpetra: Tpetra_INST_INT_LONG_LONG is enabled by default.
-- Tpetra: Tpetra_INST_INT_UNSIGNED is disabled by default.
-- Tpetra: Tpetra_INST_INT_UNSIGNED_LONG is disabled by default.
-- Tpetra: Tpetra_INST_INT_INT is disabled by default.
-- Tpetra: Tpetra_INST_INT_LONG is disabled by default.
--
-- Tpetra: Validate global ordinal setting ...
-- Tpetra: global ordinal setting is OK
[...]
@masterleinad thanks! Can you post your configuration as well? I'd like to compare, to see whether I misconfigured something but still happened to get a complete build.
I tried again with the configuration posted in the issue description (https://github.com/trilinos/Trilinos/issues/12295#issue-1905712874) and see
TpetraCore_TpetraUtils_WrappedDualView (Failed)
TpetraCore_getEntryOnHost (Failed)
with MKL and see
TpetraCore_CrsMatrix_2DRandomDist
timing out. Previously (https://github.com/trilinos/Trilinos/issues/12295#issuecomment-1791383088) when I saw all tests passing, I was also pulling in Kokkos develop.
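To compare the failure lists reported in this thread, a small Python sketch along these lines could extract the `(Failed)` test names and intersect them. The helper name `failed_tests` is hypothetical, and the regular expression assumes the CTest summary format quoted here, with an optional leading test number.

```python
import re

# Matches CTest summary lines such as "140 - TpetraCore_getEntryOnHost (Failed)";
# the leading "<number> - " is optional because one list in this thread omits it.
_FAILED_LINE = re.compile(r"^\s*(?:\d+\s+-\s+)?(\S+)\s+\(Failed\)\s*$")


def failed_tests(ctest_summary):
    """Return the set of test names marked (Failed) in a CTest summary."""
    names = set()
    for line in ctest_summary.splitlines():
        match = _FAILED_LINE.match(line)
        if match:
            names.add(match.group(1))
    return names


# Failure lists copied from the two reports in this thread.
first_report = """\
The following tests FAILED:
19 - TpetraCore_BlockCrsMatrix (Failed)
82 - TpetraCore_ImportExport2_UnitTests_Send (Failed)
83 - TpetraCore_ImportExport2_UnitTests_ISend (Failed)
84 - TpetraCore_ImportExport2_UnitTests_Alltoall (Failed)
140 - TpetraCore_getEntryOnHost (Failed)
Errors while running CTest
"""
second_report = """\
TpetraCore_TpetraUtils_WrappedDualView (Failed)
TpetraCore_getEntryOnHost (Failed)
"""

common = failed_tests(first_report) & failed_tests(second_report)
print(sorted(common))  # prints ['TpetraCore_getEntryOnHost']
```

The two reports come from different machines and configurations, so the overlap is only a hint about where to look first, not a diagnosis.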
Bug Report
@trilinos/tpetra
Description
I tested a SYCL configuration on Blake's new Ponte Vecchio GPUs with Daniel's PR #12294 updates, and the following tests failed with seg faults:
Steps to Reproduce
Use the changes from #12294
Configuration (New) Blake PV queue: