src-d / kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

Any idea why the following errors occur when make? #25

Closed yxchng closed 6 years ago

yxchng commented 6 years ago

-- Found OpenMP_C: -fopenmp
-- Found OpenMP_CXX: -fopenmp
-- Could NOT find R (missing: R_EXECUTABLE R_INCLUDE_DIR R_LIBRARY)
-- Configuring done
You have changed variables that require your cache to be deleted.
Configure will be re-run and you may have to reset some variables.
The following variables have changed:
CMAKE_C_COMPILER= /usr/bin/gcc-4.8
CMAKE_CXX_COMPILER= /usr/bin/g++-4.8

-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Check for working C compiler: /usr/bin/gcc-4.8
-- Check for working C compiler: /usr/bin/gcc-4.8 -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/g++-4.8
-- Check for working CXX compiler: /usr/bin/g++-4.8 -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "3.1")
-- Found OpenMP_CXX: -fopenmp (found version "3.1")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.6.2", minimum required is "3")
-- Found PythonLibs: /usr/lib/libpython3.6m.so (found suitable version "3.6.2", minimum required is "3")
-- Could NOT find R (missing: R_EXECUTABLE R_INCLUDE_DIR R_LIBRARY)
-- Configuring done
-- Generating done
-- Build files have been written to: /home/yxchng/git/kmcuda/src
[ 16%] Building NVCC (Device) object CMakeFiles/KMCUDA.dir/KMCUDA_generated_transpose.cu.o
/home/yxchng/git/kmcuda/src/private.h(280): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(79): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(79): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(86): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(89): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(89): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(92): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(92): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(106): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(115): error: identifier "printf" is undefined

/home/yxchng/git/kmcuda/src/transpose.cu(115): error: identifier "printf" is undefined

11 errors detected in the compilation of "/tmp/tmpxft_00000581_00000000-5_transpose.cpp4.ii".
CMake Error at KMCUDA_generated_transpose.cu.o.cmake:282 (message):
Error generating file /home/yxchng/git/kmcuda/src/CMakeFiles/KMCUDA.dir//./KMCUDA_generated_transpose.cu.o

vmarkovtsev commented 6 years ago

Hi! Let's investigate.

What's your NVCC version / CUDA SDK version?

Try to put #include <stdio.h> in private.h and see if it helps.
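
As a rough illustration of that suggestion (not the actual contents of private.h), a header that calls printf should include the standard I/O header itself instead of relying on some other include to pull it in. In the sketch below, the macro `EXAMPLE_HOST_DEVICE` and the function `report_dims` are hypothetical names invented for this example; only `__CUDACC__`, `__host__` and `__device__` are real CUDA identifiers.

```cpp
// Hypothetical header-style snippet: NOT taken from kmcuda's private.h.
#include <cstdio>  // declares printf; without it the compiler reports
                   // 'identifier "printf" is undefined'

// __CUDACC__ is defined when nvcc compiles the file, so the same code
// also builds with a plain host C++ compiler.
#ifdef __CUDACC__
#define EXAMPLE_HOST_DEVICE __host__ __device__
#else
#define EXAMPLE_HOST_DEVICE
#endif

// Prints the matrix dimensions and returns whatever printf returns.
EXAMPLE_HOST_DEVICE inline int report_dims(unsigned rows, unsigned cols) {
  return printf("rows=%u cols=%u\n", rows, cols);
}
```

With the include in place, the declaration of printf is visible wherever the header is used, which is what the undefined-identifier errors above were missing.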

yxchng commented 6 years ago

I am using cuda 8.0 and cudnn v5.1.

After adding #include <stdio.h>, I face a new problem:

[yxchng@work src]$ cmake -DCMAKE_C_COMPILER=/usr/bin/gcc-4.8 -DCMAKE_C_FLAGS=-fopenmp -DCMAKE_CXX_COMPILER=/usr/bin/g++-4.8 .
-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Check for working C compiler: /usr/bin/gcc-4.8
-- Check for working C compiler: /usr/bin/gcc-4.8 -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/g++-4.8
-- Check for working CXX compiler: /usr/bin/g++-4.8 -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "3.1") 
-- Found OpenMP_CXX: -fopenmp (found version "3.1") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.6.2", minimum required is "3") 
-- Found PythonLibs: /usr/lib/libpython3.6m.so (found suitable version "3.6.2", minimum required is "3") 
-- Could NOT find R (missing: R_EXECUTABLE R_INCLUDE_DIR R_LIBRARY) 
-- Configuring done
-- Generating done
-- Build files have been written to: /home/yxchng/git/kmcuda/src
[yxchng@work src]$ make
[ 16%] Building NVCC (Device) object CMakeFiles/KMCUDA.dir/KMCUDA_generated_transpose.cu.o
ptxas info    : 16 bytes gmem
ptxas info    : Compiling entry function '_Z9transposeILb1EEvPKfjjPf' for 'sm_61'
ptxas info    : Function properties for _Z9transposeILb1EEvPKfjjPf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 20 registers, 4224 bytes smem, 344 bytes cmem[0]
ptxas info    : Compiling entry function '_Z13copy_sample_tjjtPKfPf' for 'sm_61'
ptxas info    : Function properties for _Z13copy_sample_tjjtPKfPf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 6 registers, 352 bytes cmem[0]
ptxas info    : Compiling entry function '_Z9transposeILb0EEvPKfjjPf' for 'sm_61'
ptxas info    : Function properties for _Z9transposeILb0EEvPKfjjPf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 20 registers, 4224 bytes smem, 344 bytes cmem[0]
[ 33%] Building NVCC (Device) object CMakeFiles/KMCUDA.dir/KMCUDA_generated_kmeans.cu.o
ptxas info    : 77793 bytes gmem, 96 bytes cmem[3]
ptxas info    : Compiling entry function '_Z19kmeans_assign_lloydIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PjS4_' for 'sm_61'
ptxas info    : Function properties for _Z19kmeans_assign_lloydIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PjS4_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 360 bytes cmem[0]
ptxas info    : Compiling entry function '_Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric0EfEvjjjPKT0_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric0EfEvjjjPKT0_PfPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 360 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PKjPKfS5_PjPfS8_' for 'sm_61'
ptxas info    : Function properties for _Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PKjPKfS5_PjPfS8_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 392 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z14kmeans_yy_initIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PKjS5_Pf' for 'sm_61'
ptxas info    : Function properties for _Z14kmeans_yy_initIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PKjS5_Pf
    24 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 368 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS3_Pf' for 'sm_61'
ptxas info    : Function properties for _Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS3_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 27 registers, 360 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PKjPd' for 'sm_61'
ptxas info    : Function properties for _Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PKjPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 360 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PjS4_' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric0EfEvjjPKT0_S3_PjS4_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 360 bytes cmem[0]
ptxas info    : Compiling entry function '_Z16kmeans_plus_plusIL20KMCUDADistanceMetric0EfEvjjjPKT0_S3_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z16kmeans_plus_plusIL20KMCUDADistanceMetric0EfEvjjjPKT0_S3_PfPd
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 27 registers, 368 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric0EfEvjjPKT0_PS1_' for 'sm_61'
ptxas info    : Function properties for _Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric0EfEvjjPKT0_PS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 23 registers, 344 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z13kmeans_adjustIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS5_PS1_Pj' for 'sm_61'
ptxas info    : Function properties for _Z13kmeans_adjustIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS5_PS1_Pj
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 368 bytes cmem[0], 20 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS3_Pf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS3_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 360 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS3_S5_PKfPjPf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric0EfEvjjPKT0_PKjS3_S5_PKfPjPf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 384 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PjS4_' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PjS4_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 360 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z16kmeans_plus_plusIL20KMCUDADistanceMetric1EfEvjjjPKT0_S3_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z16kmeans_plus_plusIL20KMCUDADistanceMetric1EfEvjjjPKT0_S3_PfPd
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 27 registers, 368 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric1EfEvjjPKT0_PS1_' for 'sm_61'
ptxas info    : Function properties for _Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric1EfEvjjPKT0_PS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 25 registers, 344 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z13kmeans_adjustIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS5_PS1_Pj' for 'sm_61'
ptxas info    : Function properties for _Z13kmeans_adjustIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS5_PS1_Pj
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 38 registers, 368 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS3_Pf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS3_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 29 registers, 360 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS3_S5_PKfPjPf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS3_S5_PKfPjPf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 384 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z19kmeans_assign_lloydIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PjS4_' for 'sm_61'
ptxas info    : Function properties for _Z19kmeans_assign_lloydIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PjS4_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 360 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric1EfEvjjjPKT0_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric1EfEvjjjPKT0_PfPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 360 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PKjPKfS5_PjPfS8_' for 'sm_61'
ptxas info    : Function properties for _Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PKjPKfS5_PjPfS8_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 392 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS3_Pf' for 'sm_61'
ptxas info    : Function properties for _Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric1EfEvjjPKT0_PKjS3_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 360 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PKjPd' for 'sm_61'
ptxas info    : Function properties for _Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PKjPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 360 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z25kmeans_afkmc2_random_stepjmmPKfPjPf' for 'sm_61'
ptxas info    : Function properties for _Z25kmeans_afkmc2_random_stepjmmPKfPjPf
    6448 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 32768 bytes smem, 368 bytes cmem[0], 8 bytes cmem[2]
ptxas info    : Compiling entry function '_Z14kmeans_yy_initIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PKjS5_Pf' for 'sm_61'
ptxas info    : Function properties for _Z14kmeans_yy_initIL20KMCUDADistanceMetric1EfEvjjPKT0_S3_PKjS5_Pf
    24 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 368 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PjS5_' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PjS5_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 360 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z16kmeans_plus_plusIL20KMCUDADistanceMetric0E7__half2EvjjjPKT0_S4_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z16kmeans_plus_plusIL20KMCUDADistanceMetric0E7__half2EvjjjPKT0_S4_PfPd
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 368 bytes cmem[0], 8 bytes cmem[2]
ptxas info    : Compiling entry function '_Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PS2_' for 'sm_61'
ptxas info    : Function properties for _Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PS2_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 24 registers, 344 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z13kmeans_adjustIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS6_PS2_Pj' for 'sm_61'
ptxas info    : Function properties for _Z13kmeans_adjustIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS6_PS2_Pj
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 39 registers, 368 bytes cmem[0], 16 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS4_Pf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS4_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 360 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS4_S6_PKfPjPf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS4_S6_PKfPjPf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 39 registers, 384 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z19kmeans_assign_lloydIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PjS5_' for 'sm_61'
ptxas info    : Function properties for _Z19kmeans_assign_lloydIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PjS5_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 360 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric0E7__half2EvjjjPKT0_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric0E7__half2EvjjjPKT0_PfPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 360 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PKjPKfS6_PjPfS9_' for 'sm_61'
ptxas info    : Function properties for _Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PKjPKfS6_PjPfS9_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 392 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS4_Pf' for 'sm_61'
ptxas info    : Function properties for _Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_PKjS4_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 360 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PKjPd' for 'sm_61'
ptxas info    : Function properties for _Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PKjPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 360 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z20kmeans_afkmc2_calc_qjjfPf' for 'sm_61'
ptxas info    : Function properties for _Z20kmeans_afkmc2_calc_qjjfPf
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 12 registers, 344 bytes cmem[0], 20 bytes cmem[2]
ptxas info    : Compiling entry function '_Z14kmeans_yy_initIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PKjS6_Pf' for 'sm_61'
ptxas info    : Function properties for _Z14kmeans_yy_initIL20KMCUDADistanceMetric0E7__half2EvjjPKT0_S4_PKjS6_Pf
    24 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 368 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PjS5_' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_assign_lloyd_smallcIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PjS5_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 360 bytes cmem[0], 28 bytes cmem[2]
ptxas info    : Compiling entry function '_Z16kmeans_plus_plusIL20KMCUDADistanceMetric1E7__half2EvjjjPKT0_S4_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z16kmeans_plus_plusIL20KMCUDADistanceMetric1E7__half2EvjjjPKT0_S4_PfPd
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 368 bytes cmem[0], 28 bytes cmem[2]
ptxas info    : Compiling entry function '_Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PS2_' for 'sm_61'
ptxas info    : Function properties for _Z21kmeans_yy_calc_driftsIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PS2_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 26 registers, 344 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z13kmeans_adjustIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS6_PS2_Pj' for 'sm_61'
ptxas info    : Function properties for _Z13kmeans_adjustIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS6_PS2_Pj
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 39 registers, 368 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS4_Pf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_afkmc2_min_distIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS4_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 360 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS4_S6_PKfPjPf' for 'sm_61'
ptxas info    : Function properties for _Z22kmeans_yy_local_filterIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS4_S6_PKfPjPf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 39 registers, 384 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z19kmeans_assign_lloydIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PjS5_' for 'sm_61'
ptxas info    : Function properties for _Z19kmeans_assign_lloydIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PjS5_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 360 bytes cmem[0], 28 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric1E7__half2EvjjjPKT0_PfPd' for 'sm_61'
ptxas info    : Function properties for _Z26kmeans_afkmc2_calc_q_distsIL20KMCUDADistanceMetric1E7__half2EvjjjPKT0_PfPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 360 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PKjPKfS6_PjPfS9_' for 'sm_61'
ptxas info    : Function properties for _Z23kmeans_yy_global_filterIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PKjPKfS6_PjPfS9_
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 392 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS4_Pf' for 'sm_61'
ptxas info    : Function properties for _Z33kmeans_afkmc2_min_dist_transposedIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_PKjS4_Pf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 360 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PKjPd' for 'sm_61'
ptxas info    : Function properties for _Z28kmeans_calc_average_distanceIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PKjPd
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 360 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z31kmeans_yy_find_group_max_driftsjjPKjPf' for 'sm_61'
ptxas info    : Function properties for _Z31kmeans_yy_find_group_max_driftsjjPKjPf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 18 registers, 344 bytes cmem[0]
ptxas info    : Compiling entry function '_Z14kmeans_yy_initIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PKjS6_Pf' for 'sm_61'
ptxas info    : Function properties for _Z14kmeans_yy_initIL20KMCUDADistanceMetric1E7__half2EvjjPKT0_S4_PKjS6_Pf
    24 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 368 bytes cmem[0], 32 bytes cmem[2]
[ 50%] Building NVCC (Device) object CMakeFiles/KMCUDA.dir/KMCUDA_generated_knn.cu.o
ptxas info    : 8 bytes gmem, 12 bytes cmem[3]
ptxas info    : Compiling entry function '_Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric0EfEvjPKT0_Pf' for 'sm_61'
ptxas info    : Function properties for _Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric0EfEvjPKT0_Pf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 36 registers, 49152 bytes smem, 344 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z16knn_assign_shmemIL20KMCUDADistanceMetric0EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj' for 'sm_61'
ptxas info    : Function properties for _Z16knn_assign_shmemIL20KMCUDADistanceMetric0EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric0EfEvjjPKjS2_PKT0_S5_PfS6_' for 'sm_61'
ptxas info    : Function properties for _Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric0EfEvjjPKjS2_PKT0_S5_PfS6_
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 27 registers, 32768 bytes smem, 376 bytes cmem[0], 16 bytes cmem[2]
ptxas info    : Compiling entry function '_Z15knn_assign_gmemIL20KMCUDADistanceMetric0EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj' for 'sm_61'
ptxas info    : Function properties for _Z15knn_assign_gmemIL20KMCUDADistanceMetric0EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric1EfEvjjPKjS2_PKT0_S5_PfS6_' for 'sm_61'
ptxas info    : Function properties for _Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric1EfEvjjPKjS2_PKT0_S5_PfS6_
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 27 registers, 32768 bytes smem, 376 bytes cmem[0], 36 bytes cmem[2]
ptxas info    : Compiling entry function '_Z15knn_assign_gmemIL20KMCUDADistanceMetric1EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj' for 'sm_61'
ptxas info    : Function properties for _Z15knn_assign_gmemIL20KMCUDADistanceMetric1EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric1EfEvjPKT0_Pf' for 'sm_61'
ptxas info    : Function properties for _Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric1EfEvjPKT0_Pf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 39 registers, 49152 bytes smem, 344 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z16knn_assign_shmemIL20KMCUDADistanceMetric1EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj' for 'sm_61'
ptxas info    : Function properties for _Z16knn_assign_shmemIL20KMCUDADistanceMetric1EfEvjjtPKfS2_PKT0_S5_PKjS7_S7_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z29knn_assign_gmem_deinterleave1jtPj' for 'sm_61'
ptxas info    : Function properties for _Z29knn_assign_gmem_deinterleave1jtPj
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 32 registers, 336 bytes cmem[0], 8 bytes cmem[2]
ptxas info    : Compiling entry function '_Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric0E7__half2EvjjPKjS3_PKT0_S6_PfS7_' for 'sm_61'
ptxas info    : Function properties for _Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric0E7__half2EvjjPKjS3_PKT0_S6_PfS7_
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 30 registers, 32768 bytes smem, 376 bytes cmem[0], 16 bytes cmem[2]
ptxas info    : Compiling entry function '_Z15knn_assign_gmemIL20KMCUDADistanceMetric0E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj' for 'sm_61'
ptxas info    : Function properties for _Z15knn_assign_gmemIL20KMCUDADistanceMetric0E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric0E7__half2EvjPKT0_Pf' for 'sm_61'
ptxas info    : Function properties for _Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric0E7__half2EvjPKT0_Pf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 49152 bytes smem, 344 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z16knn_assign_shmemIL20KMCUDADistanceMetric0E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj' for 'sm_61'
ptxas info    : Function properties for _Z16knn_assign_shmemIL20KMCUDADistanceMetric0E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 12 bytes cmem[2]
ptxas info    : Compiling entry function '_Z28knn_mirror_cluster_distancesPf' for 'sm_61'
ptxas info    : Function properties for _Z28knn_mirror_cluster_distancesPf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 27 registers, 328 bytes cmem[0], 4 bytes cmem[2]
ptxas info    : Compiling entry function '_Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric1E7__half2EvjjPKjS3_PKT0_S6_PfS7_' for 'sm_61'
ptxas info    : Function properties for _Z25knn_calc_cluster_radiusesIL20KMCUDADistanceMetric1E7__half2EvjjPKjS3_PKT0_S6_PfS7_
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 32768 bytes smem, 376 bytes cmem[0], 36 bytes cmem[2]
ptxas info    : Compiling entry function '_Z15knn_assign_gmemIL20KMCUDADistanceMetric1E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj' for 'sm_61'
ptxas info    : Function properties for _Z15knn_assign_gmemIL20KMCUDADistanceMetric1E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric1E7__half2EvjPKT0_Pf' for 'sm_61'
ptxas info    : Function properties for _Z26knn_calc_cluster_distancesIL20KMCUDADistanceMetric1E7__half2EvjPKT0_Pf
    16 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 49152 bytes smem, 344 bytes cmem[0], 24 bytes cmem[2]
ptxas info    : Compiling entry function '_Z16knn_assign_shmemIL20KMCUDADistanceMetric1E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj' for 'sm_61'
ptxas info    : Function properties for _Z16knn_assign_shmemIL20KMCUDADistanceMetric1E7__half2EvjjtPKfS3_PKT0_S6_PKjS8_S8_Pj
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 40 registers, 400 bytes cmem[0], 32 bytes cmem[2]
ptxas info    : Compiling entry function '_Z29knn_assign_gmem_deinterleave2jtPj' for 'sm_61'
ptxas info    : Function properties for _Z29knn_assign_gmem_deinterleave2jtPj
    8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 336 bytes cmem[0], 8 bytes cmem[2]
Scanning dependencies of target KMCUDA
[ 66%] Building CXX object CMakeFiles/KMCUDA.dir/kmcuda.cc.o
/home/yxchng/git/kmcuda/src/kmcuda.cc:206:0: error: ignoring #pragma omp simd [-Werror=unknown-pragmas]
       #pragma omp simd
 ^
/home/yxchng/git/kmcuda/src/kmcuda.cc:310:0: error: ignoring #pragma omp simd [-Werror=unknown-pragmas]
           #pragma omp simd reduction(+:dist_sum2)
 ^
cc1plus: all warnings being treated as errors
make[2]: *** [CMakeFiles/KMCUDA.dir/build.make:84: CMakeFiles/KMCUDA.dir/kmcuda.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:68: CMakeFiles/KMCUDA.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
vmarkovtsev commented 6 years ago

Right, so the CUDA files compiled successfully, but the C++ ones did not: gcc 4.8 is too old and does not support OpenMP 4.0, which #pragma omp simd requires (really, OpenMP 4.0 was released more than 4 years ago). As a workaround, you can comment those #pragma-s out.
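
If you would rather not delete the pragmas, one alternative is to guard them on the OpenMP version the compiler reports. This is only a sketch, not how kmcuda.cc is actually written: the function name `sum_squares` is hypothetical, and only the `dist_sum2` variable and the pragma text come from the log above. `_OPENMP` is the standard macro holding the release date (yyyymm) of the implemented OpenMP spec, and 201307 corresponds to OpenMP 4.0.

```cpp
#include <cstddef>

// Hypothetical stand-in for a loop like the one at kmcuda.cc:310.
// The simd pragma is only emitted when the compiler implements
// OpenMP >= 4.0 (_OPENMP >= 201307); older compilers such as gcc 4.8
// (OpenMP 3.1) never see it, so -Werror=unknown-pragmas does not fire.
float sum_squares(const float* v, std::size_t n) {
  float dist_sum2 = 0.0f;
#if defined(_OPENMP) && _OPENMP >= 201307
  #pragma omp simd reduction(+:dist_sum2)
#endif
  for (std::size_t i = 0; i < n; i++) {
    dist_sum2 += v[i] * v[i];
  }
  return dist_sum2;
}
```

The loop gives the same result either way; the pragma is only a vectorization hint, so dropping it on an old compiler costs some speed but not correctness.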

yxchng commented 6 years ago

I switched to gcc 7.2 and got this:

[yxchng@work src]$ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/usr/bin/gcc -DCMAKE_CXX_COMPILER=/usr/bin/g++ . && make
-- Found OpenMP_C: -fopenmp  
-- Found OpenMP_CXX: -fopenmp  
-- Could NOT find R (missing: R_EXECUTABLE R_INCLUDE_DIR R_LIBRARY) 
-- Configuring done
You have changed variables that require your cache to be deleted.
Configure will be re-run and you may have to reset some variables.
The following variables have changed:
CMAKE_C_COMPILER= /usr/bin/gcc
CMAKE_CXX_COMPILER= /usr/bin/g++

-- The C compiler identification is GNU 7.2.0
-- The CXX compiler identification is GNU 7.2.0
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/g++
-- Check for working CXX compiler: /usr/bin/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.6.2", minimum required is "3") 
-- Found PythonLibs: /usr/lib/libpython3.6m.so (found suitable version "3.6.2", minimum required is "3") 
-- Could NOT find R (missing: R_EXECUTABLE R_INCLUDE_DIR R_LIBRARY) 
-- Configuring done
-- Generating done
-- Build files have been written to: /home/yxchng/git/kmcuda/src
[ 16%] Building NVCC (Device) object CMakeFiles/KMCUDA.dir/KMCUDA_generated_transpose.cu.o
In file included from /opt/cuda/include/cuda_runtime.h:78:0,
                 from <command-line>:0:
/opt/cuda/include/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 5 are not supported!
 #error -- unsupported GNU version! gcc versions later than 5 are not supported!
  ^~~~~
CMake Error at KMCUDA_generated_transpose.cu.o.cmake:222 (message):
  Error generating
  /home/yxchng/git/kmcuda/src/CMakeFiles/KMCUDA.dir//./KMCUDA_generated_transpose.cu.o

make[2]: *** [CMakeFiles/KMCUDA.dir/build.make:79: CMakeFiles/KMCUDA.dir/KMCUDA_generated_transpose.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:67: CMakeFiles/KMCUDA.dir/all] Error 2
make: *** [Makefile:84: all] Error 2

That's why I reverted to 4.8. I have now tried 5.4 and it compiles. Weird.

vmarkovtsev commented 6 years ago

Yep, CUDA 8.0 does not support the newest gcc versions (anything later than 5, per the host_config.h error above). I think gcc 4.9 will be OK too.
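
Putting the two constraints from this thread together, the usable gcc range can be sketched as a small compile-time check. The helper name `gcc_in_supported_window` is hypothetical and not part of kmcuda. The lower bound assumes gcc 4.9 as the first release with OpenMP 4.0 support; the upper bound mirrors the "gcc versions later than 5 are not supported" error from CUDA 8.0's host_config.h.

```cpp
// Hypothetical helper: true when a gcc version satisfies both
// constraints discussed in this thread.
constexpr bool gcc_in_supported_window(int major, int minor) {
  return (major > 4 || (major == 4 && minor >= 9))  // needs OpenMP 4.0 for "omp simd"
         && major <= 5;                             // CUDA 8.0 rejects gcc later than 5
}

// The three compilers tried above:
static_assert(!gcc_in_supported_window(4, 8), "4.8: no OpenMP 4.0");
static_assert(gcc_in_supported_window(5, 4), "5.4: compiles");
static_assert(!gcc_in_supported_window(7, 2), "7.2: rejected by CUDA 8.0");
```

To check the compiler actually in use, the same function could be called with `__GNUC__` and `__GNUC_MINOR__` inside a `static_assert`.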