if CUDA is found REG_TEST_CPLUSPLUS fails

paskino commented 4 years ago

If CUDA is found and NIFTYREG is built with CUDA, in compilation of REG_TEST_CPLUSPLUS we are missing -lcuda -lcudart -L/dir/to/cuda

CUDA_LIBRARIES is not the appropriate variable. Which one would be ok?

paskino commented 4 years ago

Having fixed somehow the above error, there comes the next:

make[5]: Entering directory '/mnt/data/PETMR/buildVM_202002/builds/SIRF/build'
[ 48%] Linking CXX shared module _pyreg.so
cd /home/edo/GitHub/PETMR/buildVM/builds/SIRF/build/src/Registration/pReg && /home/edo/miniconda3/envs/cilsirf_operator/bin/cmake -E cmake_link_script CMakeFiles/_pyreg.dir/link.txt --verbose=1
/opt/rh/devtoolset-7/root/usr/bin/c++ -fPIC -O3 -DNDEBUG  -shared  -o _pyreg.so CMakeFiles/_pyreg.dir/CMakeFiles/_pyreg.dir/pyregPYTHON_wrap.cxx.o  -Wl,-rpath,/home/edo/miniconda3/envs/cilsirf_operator/lib:/mnt/data/PETMR/install/lib: ../cReg/libReg.a /home/edo/miniconda3/envs/cilsirf_operator/lib/libpython3.6m.so ../../common/libcsirf.a ../../iUtilities/libiutilities.a ../NiftyMoMo/libNiftyMoMo.a /mnt/data/PETMR/install/lib/libreg_nifti.a /mnt/data/PETMR/install/lib/libz.a /mnt/data/PETMR/install/lib/libreg_png.a /mnt/data/PETMR/install/lib/lib_reg_ReadWriteImage.a /mnt/data/PETMR/install/lib/lib_reg_common_cuda.a /mnt/data/PETMR/install/lib/lib_reg_cuda_kernels.a /mnt/data/PETMR/install/lib/lib_reg_cudainfo.a /mnt/data/PETMR/install/lib/lib_reg_maths.a /mnt/data/PETMR/install/lib/lib_reg_tools.a /mnt/data/PETMR/install/lib/lib_reg_globalTrans.a /mnt/data/PETMR/install/lib/lib_reg_localTrans.a /mnt/data/PETMR/install/lib/lib_reg_measure.a /mnt/data/PETMR/install/lib/lib_reg_resampling.a /mnt/data/PETMR/install/lib/lib_reg_blockMatching.a /mnt/data/PETMR/install/lib/lib_reg_femTrans.a /mnt/data/PETMR/install/lib/lib_reg_aladin.a /mnt/data/PETMR/install/lib/lib_reg_f3d.a /mnt/data/PETMR/install/lib/libreg_nifti.a /mnt/data/PETMR/install/lib/libz.a /mnt/data/PETMR/install/lib/libreg_png.a /mnt/data/PETMR/install/lib/lib_reg_ReadWriteImage.a /mnt/data/PETMR/install/lib/lib_reg_common_cuda.a /mnt/data/PETMR/install/lib/lib_reg_cuda_kernels.a /mnt/data/PETMR/install/lib/lib_reg_cudainfo.a /mnt/data/PETMR/install/lib/lib_reg_maths.a /mnt/data/PETMR/install/lib/lib_reg_tools.a /mnt/data/PETMR/install/lib/lib_reg_globalTrans.a /mnt/data/PETMR/install/lib/lib_reg_localTrans.a /mnt/data/PETMR/install/lib/lib_reg_measure.a /mnt/data/PETMR/install/lib/lib_reg_resampling.a /mnt/data/PETMR/install/lib/lib_reg_blockMatching.a /mnt/data/PETMR/install/lib/lib_reg_femTrans.a /mnt/data/PETMR/install/lib/lib_reg_aladin.a /mnt/data/PETMR/install/lib/lib_reg_f3d.a -lgomp -lpthread /mnt/data/PETMR/install/lib/libboost_filesystem.so /mnt/data/PETMR/install/lib/libboost_system.so 
/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/ld: /mnt/data/PETMR/install/lib/lib_reg_cuda_kernels.a(_reg_cuda_kernels_generated_affineDeformationKernel.cu.o): relocation R_X86_64_32 against symbol `_Z12affineKernelPfS_Pi5uint3mb' can not be used when making a shared object; recompile with -fPIC
/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/ld: /mnt/data/PETMR/install/lib/lib_reg_cuda_kernels.a(_reg_cuda_kernels_generated_blockMatchingKernel.cu.o): relocation R_X86_64_32 against symbol `_Z21blockMatchingKernel2DPfS_PiS_Pj' can not be used when making a shared object; recompile with -fPIC
/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/ld: /mnt/data/PETMR/install/lib/lib_reg_cuda_kernels.a(_reg_cuda_kernels_generated_resampleKernel.cu.o): relocation R_X86_64_32 against symbol `_Z15ResampleImage2DPfS_S_PiS_6ulong25uint35uint2fi' can not be used when making a shared object; recompile with -fPIC
/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/ld: /mnt/data/PETMR/install/lib/lib_reg_common_cuda.a(_reg_common_cuda_generated__reg_common_cuda.cu.o): relocation R_X86_64_32 against `.rodata._Z40cudaCommon_transferNiftiToNiftiOnDevice1IfEiPP11nifti_imageS1_.str1.8' can not be used when making a shared object; recompile with -fPIC
/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status

So NIFTYREG should be built with set(POSITION_INDEPENDENT_CODE ON).

paskino commented 4 years ago

Actually I added manually set(CMAKE_POSITION_INDEPENDENT_CODE ON) in NIFTYREG CMakeLists.txt but it didn't help.

So currently I had it working with cmake ../SIRF-SuperBuild -DNIFTYREG_USE_CUDA=OFF

rijobro commented 4 years ago

I've pushed a fix to master, can you let me know if that works. I still can't get _pyreg.so to compile. Error:

/usr/bin/ld: /home/rich/Documents/Code/SIRF-SuperBuild/Install/lib/lib_reg_cuda_kernels.a
(_reg_cuda_kernels_generated_affineDeformationKernel.cu.o): 
relocation R_X86_64_PC32 against symbol `_Z12affineKernelPfS_Pi5uint3mb' 
can not be used when making a shared object; 
recompile with -fPIC

I've seen online that: "the error means that you can't use a static library to be linked w/ a dynamic one". But if I try to switch shared libraries on in NiftyReg, I get the warning: "CUDA is not compatible with shared libraries. Forcing BUILD_SHARED_LIBS to OFF". And I don't think that swig will build static libraries. This means that, as far as I'm aware, we'll have static CUDA libraries, and dynamic SIRF python libraries, which according to this error are incompatible. Not sure what the solution there is...

KrisThielemans commented 4 years ago

On linux, -fPIC is required for shared libraries, but you can have static libraries compiled with -fPIC. This is necessary for the shared library to call the code.

Edo's set(CMAKE_POSITION_INDEPENDENT_CODE ON) is the CMake way to get this to work with any compiler.

I don't know why it didn't work. @paskino, can you post some errors?

rijobro commented 4 years ago

Same as @paskino , I added set(CMAKE_POSITION_INDEPENDENT_CODE ON) to NiftyReg and rebuilt it. No errors there.

Then I rebuilt SIRF, which hits the same error as above when linking the first dynamic library that builds against NiftyReg. Fails to link against lib_reg_cuda_kernels.a. Normally Reg is static, so it fails for _pyreg, but if I set Reg to be dynamic, it fails for that one instead.

Could the problem be something specific to CUDA?

rijobro commented 4 years ago

Ah, the CMAKE_POSITION_INDEPENDENT_CODE option doesn't get passed to the NVCC compiler. Modifying this line: https://github.com/KCL-BMEIS/niftyreg/blob/99d584e2b8ea0bffe7e65e40c8dc818751782d92/reg-lib/cuda/CMakeLists.txt#L47 to:

set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_${CAPABILITY_CODE},code=sm_${CAPABILITY_CODE} --compiler-options -fPIC")

solves the problem.

How do you think it would be best to add this as a PR to NiftyReg? Have an option, which if true, appends the -fPIC to the NVCC compiler? What would you call the option, and what should the default be?

KrisThielemans commented 4 years ago

The root cause here seems to be that NiftyReg's CMakeLists.txt is using the outdated FindCUDA support. Since CMake 3.8, CUDA is a "first class language". See also https://devblogs.nvidia.com/building-cuda-applications-cmake/ In particular, that states

CMake 3.8 supports the POSITION_INDEPENDENT_CODE property for CUDA compilation, and builds all host-side code as relocatable when requested.

However, I guess there will be little enthusiasm to revise that file completely. A work-around seems to be something like

 if(CMAKE_POSITION_INDEPENDENT_CODE)
       # add (undocumented) CMake flag that should tell the host compiler to generate position independent code
        set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} ${CMAKE_C_COMPILE_OPTIONS_PIC}")
endif()

Not sure if this will work on Windows.

rijobro commented 4 years ago

Not sure if this will work on Windows.

I created an issue https://github.com/KCL-BMEIS/niftyreg/issues/66, suggesting to put:

if independent code
  if linux
   nvcc flags
  endif
endif

KrisThielemans commented 2 years ago

should be fixed if update our NiftyReg version: https://github.com/KCL-BMEIS/niftyreg/pull/67

SyneRBI / SIRF

if CUDA is found REG_TEST_CPLUSPLUS fails #551