trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/

Post-push CI build showing build errors where other builds are not #7195

Closed bartlettroscoe closed 4 years ago

bartlettroscoe commented 4 years ago

Looking at the CI build shown in this query:

one can see that the Kokkos 3.1 promotion from PR #7172 yesterday broke the CI build shown here. The first build errors were in the KokkosKernels package showing:

In file included from /scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_spec.hpp:55:0,
                 from packages/kokkos-kernelsimpl/generated_specializations_cpp/spgemm_jacobi/Sparse_spgemm_jacobi_eti_DOUBLE_ORDINAL_INT_OFFSET_INT_LAYOUTLEFT_EXECSPACE_OPENMP_MEMSPACE_HOSTSPACE_MEMSPACE_HOSTSPACE.cpp:47:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMDenseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, mpool_type>::get_thread_id(size_t) const’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp:120:11: error: ‘impl_hardware_thread_id’ is not a member of ‘Kokkos::OpenMP’
    return Kokkos::OpenMP::impl_hardware_thread_id();
           ^
In file included from /scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_spec.hpp:56:0,
                 from packages/kokkos-kernelsimpl/generated_specializations_cpp/spgemm_jacobi/Sparse_spgemm_jacobi_eti_DOUBLE_ORDINAL_INT_OFFSET_INT_LAYOUTLEFT_EXECSPACE_OPENMP_MEMSPACE_HOSTSPACE_MEMSPACE_HOSTSPACE.cpp:47:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMSparseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, pool_memory_type>::get_thread_id(size_t) const’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp:209:11: error: ‘impl_hardware_thread_id’ is not a member of ‘Kokkos::OpenMP’
    return Kokkos::OpenMP::impl_hardware_thread_id();
           ^
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMSparseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, pool_memory_type>::get_thread_id(size_t) const [with a_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_row_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_nnz_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_scalar_view_t = Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; dinv_view_t = Kokkos::View<const double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; pool_memory_type = 
KokkosKernels::Impl::UniformMemoryPool<Kokkos::HostSpace, int>; HandleType = KokkosKernels::Experimental::KokkosKernelsHandle<const int, const int, const double, Kokkos::OpenMP, Kokkos::HostSpace, Kokkos::HostSpace>; a_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; size_t = long unsigned int]’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp:224:7: warning: control reaches end of non-void function [-Wreturn-type]
       }
       ^
In file included from /scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_spec.hpp:55:0,
                 from packages/kokkos-kernelsimpl/generated_specializations_cpp/spgemm_jacobi/Sparse_spgemm_jacobi_eti_DOUBLE_ORDINAL_INT_OFFSET_INT_LAYOUTLEFT_EXECSPACE_OPENMP_MEMSPACE_HOSTSPACE_MEMSPACE_HOSTSPACE.cpp:47:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMDenseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, mpool_type>::get_thread_id(size_t) const [with a_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_row_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_nnz_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_scalar_view_t = Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; dinv_view_t = Kokkos::View<const double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; mpool_type = 
KokkosKernels::Impl::UniformMemoryPool<Kokkos::HostSpace, double>; HandleType = KokkosKernels::Experimental::KokkosKernelsHandle<const int, const int, const double, Kokkos::OpenMP, Kokkos::HostSpace, Kokkos::HostSpace>; a_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; size_t = long unsigned int]’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp:131:7: warning: control reaches end of non-void function [-Wreturn-type]
       }
       ^

We are not seeing that build error in any of the Trilinos PR builds yet as shown here:

and it is not being seen in any of the ATDM Trilinos builds today, as shown here:

bartlettroscoe commented 4 years ago

I deleted the build directory on the machine ceerws1113 and restarted the CI server from scratch with:

nohup \
env \
  TRILINOS_CI_DO_INITIAL_REBUILD=1 \
  CTEST_BUILD_FLAGS="-j8 -k 999999" \
  CTEST_PARALLEL_LEVEL=8 \
./Trilinos/cmake/ctest/drivers/sems_ci/trilinos_ci_sever.sh \
  &> trilinos_ci_server.out &

Let's see what happens with this.

bartlettroscoe commented 4 years ago

What is concerning is that I have not been getting emails for all of those broken iterations. I only got an email for the first failing iteration. Not good.

csiefer2 commented 4 years ago

FYI - I have seen some "need to blitz the build directory" issues with the Kokkos upgrade.

bartlettroscoe commented 4 years ago

> FYI - I have seen some "need to blitz the build directory" issues with the Kokkos upgrade.

@csiefer2, we have seen this before, as reported in #6855. I still need to add a solid reproducer for that case.

And indeed, a build from scratch for the CI build seems to have fixed the problem as shown in:

We really need to get someone to figure out why Kokkos is not rebuilding correctly after their big CMake refactor.
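
Until the root cause is found, one possible stopgap is to wipe the build directory automatically whenever the upstream snapshot changes, instead of waiting for a stale incremental build to fail. The following is only a minimal sketch of that idea, not something the Trilinos CI driver does today: the function name `wipe_if_changed` and the stamp file `.upstream_stamp` are hypothetical, and the caller decides what "stamp" value to pass in.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Trilinos CI scripts):
# wipe a build directory when a recorded stamp differs from the new one.
#
# Usage: wipe_if_changed <build_dir> <stamp_value>
# Prints "wiped" if the directory was removed and recreated, "kept" otherwise.
wipe_if_changed() {
  local build_dir="$1"
  local new_stamp="$2"

  # Refuse to run with an empty path so that "rm -rf" never sees "".
  if [ -z "${build_dir}" ]; then
    echo "error: build directory not given" >&2
    return 1
  fi

  local stamp_file="${build_dir}/.upstream_stamp"   # hypothetical file name
  local old_stamp=""
  if [ -f "${stamp_file}" ]; then
    old_stamp="$(cat "${stamp_file}")"
  fi

  if [ "${old_stamp}" != "${new_stamp}" ]; then
    # Stamp changed (or was never recorded): start from scratch.
    rm -rf -- "${build_dir}"
    echo "wiped"
  else
    echo "kept"
  fi

  # Recreate the directory if needed and record the current stamp.
  mkdir -p -- "${build_dir}"
  printf '%s\n' "${new_stamp}" > "${stamp_file}"
}
```

As an illustration, the stamp could be the last commit that touched the Kokkos snapshot, for example `wipe_if_changed ./BUILD "$(git -C Trilinos log -1 --format=%H -- packages/kokkos)"`; the `./BUILD` path and the choice of `packages/kokkos` as the trigger are assumptions made for this example. This only works around the symptom and does not explain why the incremental rebuild goes wrong.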

bartlettroscoe commented 4 years ago

I will go ahead and close this issue since it looks like the problem is resolved (until the next time Kokkos is updated or they fix #6855).