trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/

ROL w/ ROCm 5.4.3: vector/ROL_TpetraMultiVector.hpp:83:23: error: reference to __host__ function 'apply' in __host__ __device__ function #12353

Open eugeneswalker opened 9 months ago

eugeneswalker commented 9 months ago

Bug Report

@csiefer2 @nchaimov

Description

Building trilinos@master +testing +rocm amdgpu_target=gfx90a +amesos +amesos2 +anasazi +aztec +belos +boost +epetra +epetraext +ifpack +ifpack2 +intrepid +intrepid2 +isorropia +kokkos +ml +minitensor +muelu +nox +piro +phalanx +rol +rythmos +sacado +stk +shards +shylu +stokhos +stratimikos +teko +tempus +tpetra +trilinoscouplings +zoltan +zoltan2 +superlu-dist gotype=long_long fails using:

$> spack dev-build -j48 trilinos@master +testing +rocm amdgpu_target=gfx90a +amesos +amesos2 +anasazi +aztec +belos +boost +epetra +epetraext +ifpack +ifpack2 +intrepid +intrepid2 +isorropia +kokkos +ml +minitensor +muelu +nox +piro +phalanx +rol +rythmos +sacado +stk +shards +shylu +stokhos +stratimikos +teko +tempus +tpetra +trilinoscouplings +zoltan +zoltan2 +superlu-dist gotype=long_long ^superlu-dist
...
In file included from /e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/rol/adapters/tpetra/test/vector/test_01.cpp:52:
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/rol/adapters/tpetra/src/vector/ROL_TpetraMultiVector.hpp:83:23: error: reference to __host__ function 'apply' in __host__ __device__ function
        X_(i,j) = f_->apply(X_(i,j));
                      ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/kokkos/core/src/HIP/Kokkos_HIP_Parallel_Range.hpp:52:5: note: called by 'exec_range<void>'
    m_functor(i);
    ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/kokkos/core/src/HIP/Kokkos_HIP_Parallel_Range.hpp:73:22: note: called by 'operator()'
      this->template exec_range<WorkTag>(iwork);
                     ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/kokkos/core/src/HIP/Kokkos_HIP_KernelLaunch.hpp:79:3: note: called by 'hip_parallel_launch_local_memory<Kokkos::Impl::ParallelFor<ROL::TPMultiVector::unaryFunc<double>, Kokkos::RangePolicy<Kokkos::HIP>, Kokkos::HIP>, 1024U, 1U>'
  driver();
  ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/rol/src/elementwise/ROL_Elementwise_Function.hpp:59:16: note: 'apply' declared here
  virtual Real apply( const Real &x ) const = 0;
...

Steps to Reproduce

Reproducible here using Docker container image:

Spack environment: spack.yaml.txt

Concretization
 -   ssbhlz2  trilinos@master%gcc@11.4.0~adelus~adios2+amesos+amesos2+anasazi+aztec~basker+belos+boost~chaco~complex~cuda~cuda_rdc~debug~dtk+epetra+epetraext~epetraextbtf~epetraextexperimental~epetraextgraphreorderings~exodus+explicit_template_instantiation~float+fortran~gtest~hdf5~hypre+ifpack+ifpack2+intrepid+intrepid2~ipo+isorropia+kokkos~mesquite+minitensor+ml+mpi+muelu~mumps+nox~openmp~panzer+phalanx+piro~python+rocm~rocm_rdc+rol+rythmos+sacado~scorec+shards+shared+shylu+stk+stokhos+stratimikos~strumpack~suite-sparse~superlu+superlu-dist+teko+tempus+testing+thyra+tpetra+trilinoscouplings~wrapper~x11+zoltan+zoltan2 amdgpu_target=gfx90a build_system=cmake build_type=Release cxxstd=17 generator=make gotype=long_long arch=linux-ubuntu20.04-x86_64
[+]  ia5ufsp      ^boost@1.83.0%gcc@11.4.0~atomic~chrono~clanglibcpp~container~context~contract~coroutine~date_time~debug+exception~fiber~filesystem+graph~graph_parallel~icu~iostreams~json~locale~log+math+mpi+multithreaded~nowide~numpy~pic~program_options~python~random~regex~serialization+shared~signals~singlethreaded+stacktrace~system~taggedlayout~test~thread~timer~type_erasure~versionedlayout~wave build_system=generic cxxstd=17 patches=a440f96 visibility=hidden arch=linux-ubuntu20.04-x86_64
[+]  j56nveb      ^cmake@3.27.6%gcc@11.4.0~doc+ncurses+ownlibs build_system=generic build_type=Release arch=linux-ubuntu20.04-x86_64
[+]  krhswgy          ^curl@8.1.2%gcc@11.4.0~gssapi~ldap~libidn2~librtmp~libssh~libssh2+nghttp2 build_system=autotools libs=shared,static tls=openssl arch=linux-ubuntu20.04-x86_64
[+]  y72nyuk              ^nghttp2@1.52.0%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  m2nqw34              ^openssl@3.1.3%gcc@11.4.0~docs+shared build_system=generic certs=mozilla arch=linux-ubuntu20.04-x86_64
[+]  thgudgh                  ^ca-certificates-mozilla@2023-05-30%gcc@11.4.0 build_system=generic arch=linux-ubuntu20.04-x86_64
[+]  4ahtcrh          ^ncurses@6.4%gcc@11.4.0~symlinks+termlib abi=none build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  axbtmvn      ^gmake@4.4.1%gcc@11.4.0~guile build_system=autotools arch=linux-ubuntu20.04-x86_64
[e]  m4alb4r      ^hip@5.4.3%gcc@11.4.0~cuda+rocm build_system=cmake build_type=Release generator=make patches=5068750,c2ee21c,ca523f1,ddd86f0 arch=linux-ubuntu20.04-x86_64
[e]  gwdu4x7      ^hsa-rocr-dev@5.4.3%gcc@11.4.0+image+shared build_system=cmake build_type=Release generator=make patches=71e6851 arch=linux-ubuntu20.04-x86_64
[+]  ytgmnmi      ^hwloc@2.9.1%gcc@11.4.0~cairo~cuda~gl~libudev+libxml2~netloc~nvml~oneapi-level-zero~opencl+pci~rocm build_system=autotools libs=shared,static arch=linux-ubuntu20.04-x86_64
[+]  wjzjxgg          ^libpciaccess@0.17%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  3cmrrml              ^libtool@2.4.7%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  dije4dh                  ^m4@1.4.19%gcc@11.4.0+sigsegv build_system=autotools patches=9dc5fbd,bfdffa7 arch=linux-ubuntu20.04-x86_64
[+]  uxiki2z                      ^libsigsegv@2.14%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  pv4s6pa              ^util-macros@1.19.3%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  w3n2rgb          ^libxml2@2.10.3%gcc@11.4.0+pic~python+shared build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  bb6pvt7              ^libiconv@1.17%gcc@11.4.0 build_system=autotools libs=shared,static arch=linux-ubuntu20.04-x86_64
[+]  7gqmxvf              ^xz@5.4.1%gcc@11.4.0~pic build_system=autotools libs=shared,static arch=linux-ubuntu20.04-x86_64
[+]  hjpgnti          ^pkgconf@1.9.5%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[e]  6gz5p46      ^llvm-amdgpu@5.4.3%gcc@11.4.0~link_llvm_dylib~llvm_dylib~openmp+rocm-device-libs build_system=cmake build_type=Release generator=ninja patches=a08bbe1 arch=linux-ubuntu20.04-x86_64
[+]  l6dookb      ^metis@5.1.0%gcc@11.4.0~gdb~int64~ipo~real64+shared build_system=cmake build_type=Release generator=make patches=4991da9,93a7903,b1225da arch=linux-ubuntu20.04-x86_64
[e]  4vf2w2d      ^mpich@4.1.2%gcc@11.4.0~argobots~cuda+fortran~hwloc+hydra+libxml2+pci~rocm+romio~slurm~vci~verbs~wrapperrpath build_system=autotools datatype-engine=auto device=ch4 netmod=ofi pmi=pmi arch=linux-ubuntu20.04-x86_64
[+]  2n67kk5      ^openblas@0.3.24%gcc@11.4.0~bignuma~consistent_fpcsr+fortran~ilp64+locking+pic+shared build_system=makefile symbol_suffix=none threads=none arch=linux-ubuntu20.04-x86_64
[+]  ymeouv6          ^perl@5.38.0%gcc@11.4.0+cpanm+opcode+open+shared+threads build_system=generic patches=714e4d1 arch=linux-ubuntu20.04-x86_64
[+]  z5d3lru              ^berkeley-db@18.1.40%gcc@11.4.0+cxx~docs+stl build_system=autotools patches=26090f4,b231fcc arch=linux-ubuntu20.04-x86_64
[+]  mxcjdbv              ^bzip2@1.0.8%gcc@11.4.0~debug~pic+shared build_system=generic arch=linux-ubuntu20.04-x86_64
[+]  zafnh7n                  ^diffutils@3.9%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  adx2gsg              ^gdbm@1.23%gcc@11.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
[+]  bhdepb5                  ^readline@8.2%gcc@11.4.0 build_system=autotools patches=bbf97f1 arch=linux-ubuntu20.04-x86_64
[+]  wcatis4      ^parmetis@4.0.3%gcc@11.4.0~gdb~int64~ipo+shared build_system=cmake build_type=Release generator=make patches=4f89253,50ed208,704b84f arch=linux-ubuntu20.04-x86_64
[+]  ur6zd25      ^superlu-dist@develop%gcc@11.4.0~cuda~int64~ipo~openmp~rocm+shared build_system=cmake build_type=Release generator=make arch=linux-ubuntu20.04-x86_64
[+]  hpmj5kc      ^zlib-ng@2.1.3%gcc@11.4.0+compat+opt build_system=autotools patches=299b958,ae9077a,b692621 arch=linux-ubuntu20.04-x86_64
$> docker run -it ecpe4s/ubuntu20.04-runner-amd64-gcc-11.4-rocm5.4.3-mpi-base:2023.08.20
root@a6fe029d2d1d:/# git clone https://github.com/eugeneswalker/spack
root@a6fe029d2d1d:/# git -C spack checkout trilinos-ctest

root@a6fe029d2d1d:/# . spack/share/spack/setup-env.sh
root@a6fe029d2d1d:/# spack env activate -d .
root@a6fe029d2d1d:/# spack concretize -f | tee concretize.log
root@a6fe029d2d1d:/# spack install --only dependencies --include-build-deps
... OK

root@a6fe029d2d1d:/# git clone --depth=1 https://github.com/trilinos/Trilinos.git
root@a6fe029d2d1d:/# git -C Trilinos checkout 5aaae1ada6fe1ce777e671a0ff84fdc4f0779406

root@a6fe029d2d1d:/# cd Trilinos
root@a6fe029d2d1d:/# nohup bash -c "time spack dev-build -j48 trilinos@master +rocm amdgpu_target=gfx90a +amesos +amesos2 +anasazi +aztec +belos +boost +epetra +epetraext +ifpack +ifpack2 +intrepid +intrepid2 +isorropia +kokkos +ml +minitensor +muelu +nox +piro +phalanx +rol +rythmos +sacado +stk +shards +shylu +stokhos +stratimikos +teko +tempus +tpetra +trilinoscouplings +zoltan +zoltan2 +superlu-dist gotype=long_long ^superlu-dist" &

root@a6fe029d2d1d:/# tail -f nohup.out
...

In file included from /e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/rol/adapters/tpetra/test/vector/test_01.cpp:52:
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/rol/adapters/tpetra/src/vector/ROL_TpetraMultiVector.hpp:83:23: error: reference to __host__ function 'apply' in __host__ __device__ function
        X_(i,j) = f_->apply(X_(i,j));
                      ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/kokkos/core/src/HIP/Kokkos_HIP_Parallel_Range.hpp:52:5: note: called by 'exec_range<void>'
    m_functor(i);
    ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/kokkos/core/src/HIP/Kokkos_HIP_Parallel_Range.hpp:73:22: note: called by 'operator()'
      this->template exec_range<WorkTag>(iwork);
                     ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/kokkos/core/src/HIP/Kokkos_HIP_KernelLaunch.hpp:79:3: note: called by 'hip_parallel_launch_local_memory<Kokkos::Impl::ParallelFor<ROL::TPMultiVector::unaryFunc<double>, Kokkos::RangePolicy<Kokkos::HIP>, Kokkos::HIP>, 1024U, 1U>'
  driver();
  ^
/e4s-develop/collab/trilinos/ctest-rocm/Trilinos/packages/rol/src/elementwise/ROL_Elementwise_Function.hpp:59:16: note: 'apply' declared here
  virtual Real apply( const Real &x ) const = 0;
...

jhux2 commented 9 months ago

This looks like it could be an issue in the ROL adapters, actually.

jhux2 commented 9 months ago

@trilinos/rol

cgcgcg commented 9 months ago

Is this related to #12105?

jjellio commented 9 months ago

I've had ROL's tests disabled in my builds for a while

eugeneswalker commented 9 months ago

@jjellio wrote: "I've had ROL's tests disabled in my builds for a while"

Can you share how you disabled those? I would like to build the ctests but disable the ones that don't build because of various errors.

jjellio commented 9 months ago

I'm guessing you'd need to edit the Spack package and add:

 "-DROL_ENABLE_TESTS=OFF" \
 "-DROL_ENABLE_EXAMPLES=OFF" \

In Spack, that would be something like:

spack edit trilinos

Then put the quoted disables in:
                define_trilinos_enable("ROL"),
                "-DROL_ENABLE_TESTS:BOOL=OFF",
                "-DROL_ENABLE_EXAMPLES:BOOL=OFF",
                define_trilinos_enable("Rythmos"),

dridzal commented 9 months ago

Is this related to https://github.com/trilinos/Trilinos/pull/12105?

The ROL team is meeting today to evaluate a rewrite of elementwise vector operations, which currently don't work with HIP.
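For readers following along: the compiler error above comes from calling a host-only virtual member (`apply`) inside a functor that the HIP backend compiles as device code. One common way to avoid that pattern is to template the kernel functor on a concrete operation type, so the call is resolved at compile time with no virtual dispatch. The sketch below is only an illustration of that idea, not the design the ROL team chose. `SKETCH_FUNCTION`, `Scale`, `ApplyUnary`, `serial_for`, and `scale_in_place` are hypothetical names, and `serial_for` is a plain host loop standing in for a Kokkos parallel dispatch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macro for this sketch: host+device annotation when compiled
// by hipcc or nvcc, nothing otherwise (so the file also builds as plain C++).
#if defined(__HIPCC__) || defined(__CUDACC__)
#define SKETCH_FUNCTION __host__ __device__
#else
#define SKETCH_FUNCTION
#endif

// The failing pattern, for contrast (simplified from the error message):
//   struct UnaryFunction { virtual Real apply(const Real& x) const = 0; };
//   X_(i,j) = f_->apply(X_(i,j));   // f_ points to a host-only virtual class

// A concrete operation with a non-virtual, device-callable call operator.
template <class Real>
struct Scale {
  Real alpha;
  SKETCH_FUNCTION Real operator()(const Real& x) const { return alpha * x; }
};

// Kernel functor templated on the operation type: op(...) is resolved at
// compile time, so no host-only virtual call appears in the kernel body.
template <class Real, class Op>
struct ApplyUnary {
  Real* data;
  Op op;
  SKETCH_FUNCTION void operator()(std::size_t i) const { data[i] = op(data[i]); }
};

// Host-only stand-in for a parallel dispatch; assumed for illustration.
template <class Functor>
void serial_for(std::size_t n, const Functor& f) {
  for (std::size_t i = 0; i < n; ++i) f(i);
}

// Applies x[i] = alpha * x[i] to a copy of the input and returns it.
std::vector<double> scale_in_place(std::vector<double> x, double alpha) {
  ApplyUnary<double, Scale<double>> kernel{x.data(), Scale<double>{alpha}};
  serial_for(x.size(), kernel);
  return x;
}
```

The trade-off is that each operation becomes a distinct template instantiation instead of a runtime-polymorphic object, which may not fit every use of ROL's elementwise interface. In a real device build the data pointer would also have to refer to device-accessible memory, which this host-only sketch does not address.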

dridzal commented 9 months ago

@gregvw, @dpkouri, here are additional references to ROCm/HIP problems.

dridzal commented 9 months ago

All: In the HIP documentation we found the sentence:

HIP allows coding in a single-source C++ programming language including features such as templates, C++11 lambdas, classes, namespaces, and more.

Is HIP limited to C++11?

jjellio commented 9 months ago

You should be able to use any legal Kokkos code with HIP: if the lambda you want to use is legal with Kokkos, it should compile under HIP. (AMD has been pretty good about keeping Kokkos passing its regression tests.) I regularly build with C++17. I can't say whether higher standards work yet.