STEllAR-GROUP / octotiger

Astrophysics program simulating the evolution of star systems based on the fast multipole method on adaptive Octrees
http://octotiger.stellar-group.org/
Boost Software License 1.0

PR #487 breaks compilation on Perlmutter #490

Closed JiakunYan closed 3 months ago

JiakunYan commented 4 months ago

Commit dd8418005396e28fc0c9bd827e8e5c288171878b builds successfully, but commit 3e4e38215e000335f95170219fcb89f18d93fbfc does not. The build fails with many errors such as:

     5266    [ 90%] Building CXX object CMakeFiles/octolib.dir/src/unitiger/hydro_impl/hydro_boundary_exchange_vc.cpp.o
     5267    /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/kokkos-4.0.01-r74dvxdouxmynzuo6ifcbmid7hsmffde/bin/kokkos_launch_compiler /
             global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/kokkos-4.0.01-r74dvxdouxmynzuo6ifcbmid7hsmffde/bin/nvcc_wrapper /global/u1/j
             /jackyan/workspace/spack/lib/spack/env/gcc/g++ /global/u1/j/jackyan/workspace/spack/lib/spack/env/gcc/g++ -DCPPUDDLE_HAVE_HPX -DCPPUDDLE_HAVE_HPX_AWARE
             _ALLOCATORS -DCPPUDDLE_HAVE_MAX_NUMBER_GPUS=4 -DCPPUDDLE_HAVE_NUMBER_BUCKETS=128 -DH5_BUILT_AS_DYNAMIC_LIB -DHPX_KOKKOS_CUDA_FUTURE_TYPE=0 -DHPX_KOKKOS
             _SYCL_FUTURE_TYPE=0 -DHPX_LIBRARY_EXPORTS -DHPX_WITH_CUDA -DKOKKOS_DEPENDENCE -DOCTOTIGER_GRIDDIM=8 -DOCTOTIGER_HAVE_BLAST_TEST -DOCTOTIGER_HAVE_CUDA -
             DOCTOTIGER_HAVE_KOKKOS -DOCTOTIGER_HAVE_QUADMATH -DOCTOTIGER_HAVE_VC -DOCTOTIGER_KOKKOS_HYDRO_TASKS=1 -DOCTOTIGER_KOKKOS_MONOPOLE_TASKS=1 -DOCTOTIGER_K
             OKKOS_MULTIPOLE_TASKS=1 -DOCTOTIGER_KOKKOS_SIMD_AUTOMATIC_DISCOVERY -DOCTOTIGER_MAX_NUMBER_FIELDS=15 -DOCTOTIGER_THETA_MINIMUM=0.34 -D_FILE_OFFSET_BITS
             =64 -D_GNU_SOURCE -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -D_POSIX_C_SOURCE=200809L -Doctolib_EXPORTS -I/tmp/jackyan/spack-stage/spack-stage-octotige
             r-git.3e4e38215e000335f95170219fcb89f18d93fbfc=0.10.0-git.150-vz7l6uo76a3v7f6uhybmayxmwgrx5v4o/spack-src -I/tmp/jackyan/spack-stage/spack-stage-octotig
             er-git.3e4e38215e000335f95170219fcb89f18d93fbfc=0.10.0-git.150-vz7l6uo76a3v7f6uhybmayxmwgrx5v4o/spack-src/spack-build/_deps/kokkossimd-src -isystem /gl
             obal/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/hpx-master-xjidl722ivi4id4unf2lnr5kpuakynpp/include -isystem /opt/nvidia/hpc_s
             dk/Linux_x86_64/23.9/cuda/12.2/include -isystem /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/include -isystem /global/u1/j/jackyan/workspace/spack/o
             pt/spack/linux-sles15-zen3/gcc-12.3.0/boost-1.82.0-m7ctnqudgeuajopl7rzh4bnpe2t2zfrf/include -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/lin
             ux-sles15-zen3/gcc-12.3.0/asio-1.21.0-2uxagrjirzugppuuito6y3gw62f2vify/include -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen
             3/gcc-12.3.0/silo-4.11-r3tpfqq3klr6bn5uzu5uid2u33jgnchl/include -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/hd
             f5-1.14.1-2-k4r6uijq2qg32lull65pauugqfwaxfig/include -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/zlib-1.2.13-2
             swgzmmeg3yzdtaikdxsp5ukrsj4drsy/include -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/cppuddle-master-t2crgm6ydq
             wbh76x2p5i2olur7n5dxja/include -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/vc-1.4.1-b2oixjujurkio3cbeb4gk6nd2u
             5cn22z/include -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/kokkos-4.0.01-r74dvxdouxmynzuo6ifcbmid7hsmffde/incl
             ude -isystem /global/u1/j/jackyan/workspace/spack/opt/spack/linux-sles15-zen3/gcc-12.3.0/hpx-kokkos-0.4.0-riqiungduvh64bh4xbs3ehxsvw52kqjb/include -Wno
             -cpp -fPIC -O3 -DNDEBUG -fPIC -pthread -march=znver3 -mtune=znver3 -expt-extended-lambda -Wext-lambda-captures-this -expt-relaxed-constexpr -arch=sm_80
              -std=c++17  -isystem /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/targets/x86_64-linux/include --host-only -MD -MT CMakeFiles/octolib.dir/src/uniti
             ger/hydro_impl/hydro_boundary_exchange_vc.cpp.o -MF CMakeFiles/octolib.dir/src/unitiger/hydro_impl/hydro_boundary_exchange_vc.cpp.o.d -o CMakeFiles/oct
             olib.dir/src/unitiger/hydro_impl/hydro_boundary_exchange_vc.cpp.o -c /tmp/jackyan/spack-stage/spack-stage-octotiger-git.3e4e38215e000335f95170219fcb89f
             18d93fbfc=0.10.0-git.150-vz7l6uo76a3v7f6uhybmayxmwgrx5v4o/spack-src/src/unitiger/hydro_impl/hydro_boundary_exchange_vc.cpp
  >> 5268    /usr/lib64/gcc/x86_64-suse-linux/12/include/avx512fp16intrin.h(38): error: vector_size attribute requires an arithmetic or enum type
     5269      typedef __half __v8hf __attribute__ ((__vector_size__ (16)));
     5270                                            ^
     5271    
  >> 5272    /usr/lib64/gcc/x86_64-suse-linux/12/include/avx512fp16intrin.h(39): error: vector_size attribute requires an arithmetic or enum type
     5273      typedef __half __v16hf __attribute__ ((__vector_size__ (32)));
     5274                                             ^
     5275    
  >> 5276    /usr/lib64/gcc/x86_64-suse-linux/12/include/avx512fp16intrin.h(40): error: vector_size attribute requires an arithmetic or enum type
     5277      typedef __half __v32hf __attribute__ ((__vector_size__ (64)));
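
For anyone triaging a long log like this one, the lines marked with `>>` can be pulled out with a short script. This is only a sketch: the two regular expressions are assumptions inferred from the excerpt above (a `>>` marker, the log line number, then a `file(line): error: message` diagnostic), not a documented Spack format, and `extract_errors` is a hypothetical helper name.

```python
import re

# Assumed layout, inferred from the excerpt above: ">>" marker,
# the log line number, then the rest of the line.
ERROR_LINE = re.compile(r"^\s*>>\s*(\d+)\s+(.*)$")
# Assumed diagnostic shape: path(line): error: message
DIAGNOSTIC = re.compile(
    r"^(?P<file>[^()\s]+)\((?P<line>\d+)\): error: (?P<msg>.*)$"
)


def extract_errors(log_text):
    """Return (log_line_no, file, source_line, message) per marked error."""
    errors = []
    for raw in log_text.splitlines():
        marked = ERROR_LINE.match(raw)
        if not marked:
            continue  # unmarked lines are context, not errors
        diag = DIAGNOSTIC.match(marked.group(2).strip())
        if diag:
            errors.append((
                int(marked.group(1)),
                diag.group("file"),
                int(diag.group("line")),
                diag.group("msg"),
            ))
    return errors


# A few lines copied from the log above, shortened to fit.
sample = """\
     5266    [ 90%] Building CXX object CMakeFiles/octolib.dir/src/unitiger/hydro_impl/hydro_boundary_exchange_vc.cpp.o
  >> 5268    /usr/lib64/gcc/x86_64-suse-linux/12/include/avx512fp16intrin.h(38): error: vector_size attribute requires an arithmetic or enum type
     5269      typedef __half __v8hf __attribute__ ((__vector_size__ (16)));
  >> 5272    /usr/lib64/gcc/x86_64-suse-linux/12/include/avx512fp16intrin.h(39): error: vector_size attribute requires an arithmetic or enum type
"""

for entry in extract_errors(sample):
    print(entry)
```

Grouping the results by file and message makes it easier to see that every error in the excerpt comes from the same header, `avx512fp16intrin.h`.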

Spack spec:

Input spec
--------------------------------
octotiger@git.3e4e38215e000335f95170219fcb89f18d93fbfc=0.10.0-git.150%gcc@12.3.0 cppflags="-L/opt/cray/pe/mpich/8.1.28/gtl/lib -lmpi_gtl_cuda" +cuda+kokkos cuda_arch=80
    ^cppuddle@master max_number_gpus=4
    ^cray-mpich
    ^hpx@master max_cpu_count=256 networking=lci,mpi
    ^lci@master+benchmarks+examples+papi+tests default-pm=cray fabric=ofi
    ^silo~mpi

Concretized
--------------------------------
octotiger@git.3e4e38215e000335f95170219fcb89f18d93fbfc=0.10.0-git.150%gcc@12.3.0 cppflags="-L/opt/cray/pe/mpich/8.1.28/gtl/lib -lmpi_gtl_cuda" +cuda~fast_fp_contract~ipo+kokkos~kokkos_hpx_kernels~rocm~sycl build_system=cmake build_type=Release cuda_arch=80 generator=make griddim=8 hydro_host_tasks=1 monopole_host_tasks=1 multipole_host_tasks=1 simd_extension=DISCOVER simd_library=KOKKOS theta_minimum=0.34 arch=linux-sles15-zen3
    ^boost@1.82.0%gcc@12.3.0+atomic+chrono~clanglibcpp~container~context~contract~coroutine+date_time~debug+exception~fiber+filesystem+graph~graph_parallel~icu+iostreams~json+locale+log+math~mpi+multithreaded~nowide~numpy~pic+program_options~python+random+regex+serialization+shared+signals~singlethreaded~stacktrace+system~taggedlayout+test+thread+timer~type_erasure~versionedlayout+wave build_system=generic cxxstd=17 patches=a440f96,a7c807f visibility=hidden arch=linux-sles15-zen3
        ^bzip2@1.0.8%gcc@12.3.0~debug~pic+shared build_system=generic arch=linux-sles15-zen3
            ^diffutils@3.6%gcc@12.3.0 build_system=autotools arch=linux-sles15-zen3
        ^xz@5.4.1%gcc@12.3.0~pic build_system=autotools libs=shared,static arch=linux-sles15-zen3
        ^zlib@1.2.13%gcc@12.3.0+optimize+pic+shared build_system=makefile arch=linux-sles15-zen3
        ^zstd@1.5.5%gcc@12.3.0~programs build_system=makefile libs=shared,static arch=linux-sles15-zen3
    ^cmake@3.22.0%gcc@12.3.0~doc+ncurses+ownlibs~qt build_system=generic build_type=Release arch=linux-sles15-zen3
    ^cppuddle@master%gcc@12.3.0~allocator_counters+buffer_content_recycling+buffer_recycling~enable_gpu_tests+executor_recycling+hpx~ipo build_system=cmake build_type=Release generator=make max_number_gpus=4 number_buffer_buckets=128 arch=linux-sles15-zen3
    ^cuda@12.2%gcc@12.3.0~allow-unsupported-compilers~dev build_system=generic arch=linux-sles15-zen3
    ^gmake@4.2.1%gcc@12.3.0~guile build_system=autotools patches=ca60bd9,fe5b60d arch=linux-sles15-zen3
    ^hdf5@1.14.1-2%gcc@12.3.0~cxx~fortran+hl~ipo~java~map~mpi+shared+szip+threadsafe+tools api=default build_system=cmake build_type=Release generator=make arch=linux-sles15-zen3
        ^libaec@1.0.6%gcc@12.3.0~ipo+shared build_system=cmake build_type=Release generator=make arch=linux-sles15-zen3
        ^pkgconf@1.9.5%gcc@12.3.0 build_system=autotools arch=linux-sles15-zen3
    ^hpx@master%gcc@12.3.0+async_cuda+async_gpu_futures~async_mpi+cuda~examples~generic_coroutines~ipo~lci_pp_log~lci_pp_pcounter~rocm~sycl~tools build_system=cmake build_type=Release cuda_arch=80 cxxstd=17 generator=ninja instrumentation=none malloc=tcmalloc max_cpu_count=256 networking=lci,mpi sycl_target_arch=none arch=linux-sles15-zen3
        ^asio@1.21.0%gcc@12.3.0~boost_coroutine~boost_regex~separate_compilation build_system=autotools cxxstd=17 arch=linux-sles15-zen3
        ^cray-mpich@8.1.28%gcc@12.3.0+wrappers build_system=generic arch=linux-sles15-zen3
        ^git@2.35.3%gcc@12.3.0+man+nls+perl+subtree~svn~tcltk build_system=autotools arch=linux-sles15-zen3
        ^gperftools@2.10%gcc@12.3.0+debugalloc~dynamic_sized_delete_support+libunwind~sized_delete build_system=autotools arch=linux-sles15-zen3
            ^libunwind@1.6.2%gcc@12.3.0~block_signals~conservative_checks~cxx_exceptions~debug~debug_frame+docs~pic+tests+weak_backtrace~xz~zlib build_system=autotools components=none libs=shared,static arch=linux-sles15-zen3
        ^hwloc@2.8.0%gcc@12.3.0~cairo~cuda~gl~libudev+libxml2~netloc~nvml~oneapi-level-zero~opencl+pci~rocm build_system=autotools libs=shared,static arch=linux-sles15-zen3
        ^lci@master%gcc@12.3.0+aligned+benchmarks~debug~debug-slow~docs+examples~gprof+ibv-td~inline-cq~ipo+multithread-progress+native+papi~pcounter+shared+tests+vector build_system=cmake build_type=Release cache-line=auto completion=am,cq,sync default-dreg=auto default-max-cqe=auto default-max-recvs=auto default-max-sends=auto default-packet-size=auto default-packets=auto default-pm=cray enable-pmix=auto fabric=ofi generator=ninja arch=linux-sles15-zen3
            ^cray-pmi@6.1.13%gcc@12.3.0 build_system=generic arch=linux-sles15-zen3
            ^libfabric@1.15.2%gcc@12.3.0~debug~kdreg build_system=autotools fabrics=cxi,sockets,tcp,udp arch=linux-sles15-zen3
            ^papi@7.0.1.2%gcc@12.3.0~cuda+example~infiniband~lmsensors~nvml~powercap~rapl~rocm~rocm_smi~sde+shared~static_tools build_system=autotools arch=linux-sles15-zen3
        ^ninja@1.10.0%gcc@12.3.0+re2c build_system=generic arch=linux-sles15-zen3
        ^python@3.9.13%gcc@12.3.0+bz2+crypt+ctypes+dbm~debug+libxml2+lzma+nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl+tix+tkinter+uuid+zlib build_system=generic patches=0d98e93,f2fd060 arch=linux-sles15-zen3
    ^hpx-kokkos@0.4.0%gcc@12.3.0+cuda~ipo~rocm~sycl build_system=cmake build_type=Release cuda_arch=80 cxxstd=17 future_type=polling generator=make arch=linux-sles15-zen3
    ^kokkos@4.0.01%gcc@12.3.0+aggressive_vectorization~compiler_warnings+cuda+cuda_constexpr+cuda_lambda~cuda_ldg_intrinsic~cuda_relocatable_device_code~cuda_uvm~debug~debug_bounds_check~debug_dualview_modify_check~deprecated_code~examples+hpx+hpx_async_dispatch~hwloc~ipo~memkind~numactl~openmp~openmptarget~pic~rocm+serial+shared~sycl~tests~threads~tuning+wrapper build_system=cmake build_type=Release cuda_arch=80 generator=make intel_gpu_arch=none patches=5e61580,b26a011 std=17 use_unsupported_sycl_arch=none arch=linux-sles15-zen3
        ^kokkos-nvcc-wrapper@4.0.01%gcc@12.3.0 build_system=generic arch=linux-sles15-zen3
    ^silo@4.11%gcc@12.3.0+fortran+fpzip+hdf5+hzip~mpi+pic+shared~silex build_system=autotools patches=451c4c5,a081263,eb2a3a0,fa050e0 arch=linux-sles15-zen3
        ^autoconf@2.69%gcc@12.3.0 build_system=autotools patches=7793209 arch=linux-sles15-zen3
        ^autoconf-archive@2023.02.20%gcc@12.3.0 build_system=autotools arch=linux-sles15-zen3
        ^automake@1.15.1%gcc@12.3.0 build_system=autotools arch=linux-sles15-zen3
        ^libtool@2.4.6%gcc@12.3.0 build_system=autotools arch=linux-sles15-zen3
        ^m4@1.4.18%gcc@12.3.0+sigsegv build_system=autotools patches=3877ab5,fc9b616 arch=linux-sles15-zen3
        ^readline@8.2%gcc@12.3.0 build_system=autotools patches=bbf97f1 arch=linux-sles15-zen3
            ^ncurses@6.4%gcc@12.3.0~symlinks+termlib abi=none build_system=autotools arch=linux-sles15-zen3
    ^vc@1.4.1%gcc@12.3.0~ipo build_system=cmake build_type=Release generator=make arch=linux-sles15-zen3

JiakunYan commented 3 months ago

This has been fixed by #491.