PDoakORNL closed 3 years ago
For documentation purposes: on Summit, gcc/8.1.1 and cuda/10 do not work well together; I got the following error. The error goes away when I update CUDA to 11 and use gcc/8.1.1 with magma/2.5.4-cuda11.1.
CMake Error at /autofs/nccs-svm1_sw/summit/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.18.2-cirtl5oah4d6bequfcoji6jbetertrna/share/cmake-3.18/Modules/CMakeTestCUDACompiler.cmake:52 (message):
The CUDA compiler
"/sw/summit/cuda/10.1.243/bin/nvcc"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: /gpfs/alpine/proj-shared/cph102/weile/dev/src/adios2/DCA-2/build/CMakeFiles/CMakeTmp
Run Build Command(s):/usr/bin/gmake cmTC_a16eb/fast && /usr/bin/gmake -f CMakeFiles/cmTC_a16eb.dir/build.make CMakeFiles/cmTC_a16eb.dir/build
gmake[1]: Entering directory `/gpfs/alpine/cph102/proj-shared/weile/dev/src/adios2/DCA-2/build/CMakeFiles/CMakeTmp'
Building CUDA object CMakeFiles/cmTC_a16eb.dir/main.cu.o
/sw/summit/cuda/10.1.243/bin/nvcc -c /gpfs/alpine/proj-shared/cph102/weile/dev/src/adios2/DCA-2/build/CMakeFiles/CMakeTmp/main.cu -o CMakeFiles/cmTC_a16eb.dir/main.cu.o
/autofs/nccs-svm1_sw/summit/gcc/8.1.1/include/c++/8.1.1/type_traits(347): error: identifier "__ieee128" is undefined
/autofs/nccs-svm1_sw/summit/gcc/8.1.1/include/c++/8.1.1/bits/std_abs.h(101): error: identifier "__ieee128" is undefined
/autofs/nccs-svm1_sw/summit/gcc/8.1.1/include/c++/8.1.1/bits/std_abs.h(102): error: identifier "__ieee128" is undefined
A similar error has been reported here: https://github.com/LLNL/blt/issues/341
The version of DCA with adios2 support compiles and runs.
However, how do I view G4 through adios2? More documentation is needed.
I want to compare the distributed G4 with the non-distributed one to verify correctness.
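For the comparison itself, something like the sketch below might be enough once both G4 arrays are available as flat sequences of complex numbers. This is only a sketch under assumptions: it does not cover getting the data out of the ADIOS2 file (ADIOS2 ships a `bpls` utility for inspecting output files, which may help there), it assumes the distributed pieces have already been gathered into the same element order as the non-distributed array, and `compare_g4` and its tolerances are hypothetical names and values, not part of DCA++ or ADIOS2.

```python
def compare_g4(reference, candidate, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two flattened G4 arrays element by element.

    Hypothetical helper: `reference` would be the non-distributed G4 and
    `candidate` the distributed G4 after gathering, both flattened in the
    same order. Returns (ok, max_abs_diff, index_of_max_diff).
    """
    if len(reference) != len(candidate):
        raise ValueError(
            "size mismatch: %d vs %d" % (len(reference), len(candidate))
        )

    ok = True
    max_diff = 0.0
    worst_index = -1  # stays -1 when the arrays are identical

    for i, (a, b) in enumerate(zip(reference, candidate)):
        a = complex(a)
        b = complex(b)
        diff = abs(a - b)
        if diff > max_diff:
            max_diff = diff
            worst_index = i
        # Mixed absolute/relative tolerance; the values are placeholders.
        if diff > abs_tol + rel_tol * abs(a):
            ok = False

    return ok, max_diff, worst_index


if __name__ == "__main__":
    # Made-up values, only to show the call.
    g4_single = [1.0 + 0.5j, -0.25j, 2.0]
    g4_distributed = [1.0 + 0.5j, -0.25j, 2.0 + 1e-14]
    print(compare_g4(g4_single, g4_distributed))
```

Reporting the index of the largest deviation should make it easier to tell whether a mismatch is rounding noise or a block that ended up on the wrong rank.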
tp_accumulator_particle_hole_test build failed:
[ 59%] Built target tp_accumulator_gpu_test
[ 59%] Linking CXX executable tp_accumulator_particle_hole_test
/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-8.1.1/libpng-1.6.34-whgrengqivmmm75oeeiwgsczqddqoh7i/lib/libpng16.so.16: undefined reference to `inflateValidate@ZLIB_1.2.9'
/usr/bin/ld: link errors found, deleting executable `tp_accumulator_particle_hole_test'
collect2: error: ld returned 1 exit status
make[2]: *** [test/unit/phys/dca_step/cluster_solver/shared_tools/accumulation/tp/tp_accumulator_particle_hole_test] Error 1
make[1]: *** [test/unit/phys/dca_step/cluster_solver/shared_tools/accumulation/tp/CMakeFiles/tp_accumulator_particle_hole_test.dir/all] Error 2
make: *** [all] Error 2
Working on this again today.
test this please
test this please
This brings relatively large changes to the way functions distributed over ranks are treated. It also brings limited ADIOS2 support to master.
This brings adios2 up to date with current master, but I expect it won't build on Daint; I'm working that out.