3dem / relion

Image-processing software for cryo-electron microscopy
https://relion.readthedocs.io/en/latest/
GNU General Public License v2.0

Error when building CPU-accelerated Relion4.0 (stable, build 138b9c7) #938

Closed by andreanans 1 year ago

andreanans commented 1 year ago

Hi,

I'm having a problem building the CPU-accelerated stable version of Relion 4.0 (commit 138b9c7). I had previous success with an older commit (e3afcf9), so I'm not sure why it's failing now. Here is the displayed error:

Scanning dependencies of target copy_scripts
Scanning dependencies of target class_ranker_model_file
[ 0%] Built target class_ranker_model_file
Scanning dependencies of target relion_lib
[ 0%] Built target copy_scripts
[ 0%] Building CXX object src/apps/CMakeFiles/relion_lib.dir//CPlot2D.cpp.o
[ 0%] Building CXX object src/apps/CMakeFiles/relion_lib.dir///macros.cpp.o
[ 0%] Building CXX object src/apps/CMakeFiles/relion_lib.dir//acc/cpu/cpu_backprojector.cpp.o
[ 1%] Building CXX object src/apps/CMakeFiles/relion_lib.dir/__/acc/cpu/cpu_helper_functions.cpp.o
[ 1%] Building CXX object src/apps/CMakeFiles/relion_lib.dir/__/acc/cpu/cpu_kernels/helper.cpp.o
/camp/apps/misc/stp/sbstp/relion-4.0_cpu_test/src/acc/cpu/cpu_kernels/diff2.h(547): error: syntax error in omp clause
      #pragma omp simd reduction(+:diffi[:eulers_per_block])
                                                              ^
      detected during instantiation of "void AccUtilities::diff2_coarse<REF3D,DATA3D,block_sz,eulers_per_block,prefetch_fraction>(unsigned long, int, float *, float *, float *, float *, float *, float *, AccProjectorKernel, float *, float *, unsigned long, unsigned long, cudaStream_t={float}) [with REF3D=true, DATA3D=true, block_sz=64, eulers_per_block=32, prefetch_fraction=4]" at line 880 of "/camp/apps/misc/stp/sbstp/relion-4.0_cpu_test/src/acc/acc_helper_functions_impl.h"

compilation aborted for /camp/apps/misc/stp/sbstp/relion-4.0_cpu_test/src/acc/cpu/cpu_helper_functions.cpp (code 2)
make[2]: [src/apps/CMakeFiles/relion_lib.dir/__/acc/cpu/cpu_helper_functions.cpp.o] Error 2
make[2]: Waiting for unfinished jobs....
make[1]: *** [src/apps/CMakeFiles/relion_lib.dir/all] Error 2

I've tried rebuilding the older version (e3afcf9) in the same environment and it works fine. See below for my setup, cmake command, and cmake log. Any help would be appreciated. Thanks!

Andrea

ml intel/2018b
ml icc/2018.3.222-GCC-7.3.0-2.30
ml CMake/3.12.1-GCCcore-7.3.0
ml imkl/2018.3.222-iimpi-2018b

source mpivars.sh intel64
source compilervars.sh intel64
source mklvars.sh intel64

export CC=mpiicc
export CXX=mpiicpc

cmake -DCUDA=OFF -DALTCPU=ON -DCudaTexture=OFF -DMKLFFT=ON \
  -DMPI_INCLUDE_PATH=/camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/include \
  -DMPI_LIBRARIES=/camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/libmpi.so \
  -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc \
  -D CMAKE_C_FLAGS="-O3 -ip -g -xCORE-AVX2 -march=haswell -restrict " \
  -D CMAKE_CXX_FLAGS="-O3 -ip -g -xCORE-AVX2 -march=haswell -restrict " \
  -DTIFF_LIBRARY=/camp/apps/eb/software/LibTIFF/4.0.9-GCCcore-6.4.0/lib/libtiff.so.5 \
  -DTIFF_INCLUDE_DIR=/camp/apps/eb/software/LibTIFF/4.0.9-GCCcore-6.4.0/include/ \
  -DCUDA_ARCH=70 \
  -DJPEG_LIBRARY=/camp/apps/eb/software/libjpeg-turbo/2.0.0-GCCcore-7.3.0/lib64/libjpeg.so.8 \
  -DJPEG_INCLUDE_DIR=/camp/apps/eb/software/libjpeg-turbo/2.0.0-GCCcore-7.3.0/include \
  -DPNG_LIBRARY=/camp/apps/eb/software/libpng/1.6.34-GCCcore-7.3.0/lib/libpng16.so \
  -DPNG_INCLUDE_DIR=/camp/apps/eb/software/libpng/1.6.34-GCCcore-7.3.0/include/ \
  ..

-- The C compiler identification is Intel 18.0.3.20180410
-- The CXX compiler identification is Intel 18.0.3.20180410
-- Check for working C compiler: /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/bin/intel64/icc
-- Check for working C compiler: /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/bin/intel64/icc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/bin/intel64/icpc
-- Check for working CXX compiler: /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/bin/intel64/icpc -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- BUILD TYPE set to the default type: 'Release'
-- Using provided CUDA_ARCH=70
-- ALTCPU enabled - Building CPU-accelerated version of RELION
-- Setting cpu precision to double
-- Setting accelerated code precision to single
-- Using non-cuda compilation....
-- Found TBB: /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/tbb/include (found version "2018.0")
-- TBB_FOUND : TRUE
-- TBB_INCLUDE_DIRS : /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/tbb/include
-- TBB_VERSION : 2018.0
-- TBB_LIBRARIES : /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/tbb/lib/intel64/gcc4.4/libtbb.so
-- Found MPI_C: /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/release_mt/libmpi.so (found version "3.1")
-- Found MPI_CXX: /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/libmpicxx.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- MPI_INCLUDE_PATH : /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/include
-- MPI_LIBRARIES : /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/libmpicxx.so;/camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/release_mt/libmpi.so;/camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/libmpigi.a;/lib64/libdl.so;/lib64/librt.so;/lib64/libpthread.so
-- MPI_CXX_INCLUDE_PATH : /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/include
-- MPI_CXX_LIBRARIES : /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/libmpicxx.so;/camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/release_mt/libmpi.so;/camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/lib/libmpigi.a;/lib64/libdl.so;/lib64/librt.so;/lib64/libpthread.so
-- CMAKE_C_COMPILER : /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/bin/intel64/icc
-- CMAKE_CXX_COMPILER : /camp/apps/eb/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/bin/intel64/icpc
-- MPI_C_COMPILER : /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/bin/mpiicc
-- MPI_CXX_COMPILER : /camp/apps/eb/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/bin/mpiicpc
-- CMAKE_CXX_COMPILER_ID : Intel
-- MKL FFTW wrapper header files: /camp/apps/eb/software/imkl/2018.3.222-iimpi-2018b/compilers_and_libraries_2018.3.222/linux/mkl/include/fftw
-- Texture interpolation is omitted.
-- Looking for XOpenDisplay in /usr/lib64/libX11.so;/usr/lib64/libXext.so
-- Looking for XOpenDisplay in /usr/lib64/libX11.so;/usr/lib64/libXext.so - found
-- Looking for gethostbyname
-- Looking for gethostbyname - found
-- Looking for connect
-- Looking for connect - found
-- Looking for remove
-- Looking for remove - found
-- Looking for shmat
-- Looking for shmat - found
-- Found X11: /usr/lib64/libX11.so
-- Could NOT find FLTK (missing: FLTK_LIBRARIES FLTK_INCLUDE_DIR)
-- No FLTK installation was found
-- Found previously built non-system FLTK libraries that will be used.
-- FLTK_INCLUDE_DIR: /camp/apps/misc/stp/sbstp/relion-4.0_cpu_test/external/fltk/include
-- FLTK_LIBRARIES: /camp/apps/misc/stp/sbstp/relion-4.0_cpu_test/external/fltk/lib/libfltk.so
-- Looking for sincos
-- Looking for sincos - found
-- Looking for sincos
-- Looking for sincos - not found
-- Found TIFF: /camp/apps/eb/software/LibTIFF/4.0.9-GCCcore-6.4.0/lib/libtiff.so.5 (found version "4.0.9")
-- Found ZLIB: /usr/lib64/libz.so (found version "1.2.7")
-- Found PNG: /camp/apps/eb/software/libpng/1.6.34-GCCcore-7.3.0/lib/libpng16.so (found version "1.5.13")
-- Checking class ranker model file...
-- Found local copy of class ranker model
BUILD_SHARED_LIBS = OFF
-- Building static libs (larger build size and binaries)
Running apps/CMakeLists.txt...
-- CMAKE_BINARY_DIR:/camp/apps/misc/stp/sbstp/relion-4.0_cpu_test/build
-- Git commit ID: 138b9c71b41a3c8782c991053a28bfe074bffdd9
PNG FOUND
-- Found JPEG: /camp/apps/eb/software/libjpeg-turbo/2.0.0-GCCcore-7.3.0/lib64/libjpeg.so.8 (found version "")
JPEG FOUND
-- Found OpenMP_C: -qopenmp (found version "5.0")
-- Found OpenMP_CXX: -qopenmp (found version "5.0")
-- Found OpenMP: TRUE (found version "5.0")
-- Configuring done
-- Generating done
-- Build files have been written to: /camp/apps/misc/stp/sbstp/relion-4.0_cpu_test/build

biochem-fan commented 1 year ago

> ml intel/2018b
> ml icc/2018.3.222-GCC-7.3.0-2.30

Can you try a newer Intel compiler or GCC?

The latest (classic) Intel compiler is freely available as part of the Intel oneAPI HPC Toolkit at https://www.intel.com/content/www/us/en/developer/tools/oneapi/hpc-toolkit-download.html.

do-jason commented 1 year ago

@andreanans It looks like the compiler you used is too old to understand the OpenMP simd directive with an array reduction. I think you have two options to try.

  1. Use a more recent compiler, as mentioned by @biochem-fan.
  2. Replace the line with "#pragma omp simd reduction(+:diffi)", or remove it. Removing it may degrade performance considerably if you have an AVX512-capable host.
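
For reference, the clause the old compiler rejected is an OpenMP array-section reduction, a feature added in OpenMP 4.5. Below is a minimal C++ sketch of that construct next to a fallback that avoids it. It is not RELION's kernel: `kEulersPerBlock`, `column_sums_portable`, `column_sums_array_section`, and the `USE_ARRAY_SECTION_REDUCTION` macro are hypothetical names chosen for illustration. An explicit macro is used because the cmake log above reports OpenMP "5.0" for the Intel 18 compiler, so a check on the reported OpenMP version alone would probably not have caught this case.

```cpp
#include <cstddef>

// Hypothetical stand-in for the eulers_per_block template parameter.
constexpr int kEulersPerBlock = 4;

// Fallback: only the inner loop is vectorized, so no array reduction
// clause is needed. `data` is row-major with kEulersPerBlock columns.
void column_sums_portable(const float* data, std::size_t n_rows, float* out) {
    for (int j = 0; j < kEulersPerBlock; ++j) out[j] = 0.0f;
    for (std::size_t i = 0; i < n_rows; ++i) {
        #pragma omp simd
        for (int j = 0; j < kEulersPerBlock; ++j) {
            out[j] += data[i * kEulersPerBlock + j];
        }
    }
}

// Array-section form: when the (hypothetical) macro is defined, each SIMD
// lane accumulates into a private copy of diffi[0..kEulersPerBlock) and the
// copies are summed when the loop ends. Without the macro this is a plain loop.
void column_sums_array_section(const float* data, std::size_t n_rows, float* out) {
    float diffi[kEulersPerBlock] = {0.0f};
#ifdef USE_ARRAY_SECTION_REDUCTION
    #pragma omp simd reduction(+:diffi[:kEulersPerBlock])
#endif
    for (std::size_t i = 0; i < n_rows; ++i) {
        for (int j = 0; j < kEulersPerBlock; ++j) {
            diffi[j] += data[i * kEulersPerBlock + j];
        }
    }
    for (int j = 0; j < kEulersPerBlock; ++j) out[j] = diffi[j];
}
```

Both functions compute the same column sums. With the reduction enabled, the order of floating-point additions may differ, so results can vary in the last bits for non-trivial inputs.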

andreanans commented 1 year ago

I tried with intel/2019.3 and it seems to work! However, now I'm getting "icpc: command line warning overriding -xCORE-AVX2 with -march=haswell" errors. I will try the more recent Intel oneAPI compilers...

UPDATE: Even with the 2022 Intel oneAPI toolkit, I'm still getting the "overriding -xCORE-AVX2 with -march=haswell" errors. Any advice? Do I just remove -xCORE-AVX2? I don't recall ever getting this error before.

UPDATE2: I've removed -xCORE-AVX2 and am using -march=native. The errors have gone away. Was this the correct thing to do?

do-jason commented 1 year ago

@andreanans "-x" enables optimizations specific to Intel processors, while "-march" targets a processor family more broadly. You do not need to specify both. If you will only run on Intel processors, use "-x" only. If you want to run on Intel and other AVX2-capable processors, use "-march" only.

andreanans commented 1 year ago

Ah, got it! Thanks for the info. It compiled properly w/o any errors.