GPUPeople / spECK

Efficient SpGEMM on GPU using CUDA and CSR
MIT License

Error during building, with atomicAdd(uint64_t *, unsigned long), on V100 #7

Closed simple86 closed 2 years ago

simple86 commented 2 years ago

Experimental setup:

CUDA: 11.0
cmake: 3.18.6
gpu: V100
os: ubuntu 18.04

I've set COMPUTE_CAPABILITY="CC70" in linuxsetup.sh, set option(CUDA_BUILD_CC70 "Build with compute capability 7.0 support" TRUE) in CMakeLists.txt, and set spECK_DYNAMIC_MEM_PER_BLOCK = 98304 in include/Multiply.h. However, the build fails with the following errors:

...
spECK/include/GPU/spECK_HashSpGEMM.cuh(1252): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (uint64_t *, unsigned long)
          detected during:
            instantiation of "void iterateMatrixDenseNumeric<INDEX_TYPE,VALUE_TYPE,MAX_ELEMENTS_BLOCK,SUPPORT_GLOBAL,THREADS,SHIFT,useRowOffsets>(INDEX_TYPE, INDEX_TYPE, uint32_t, INDEX_TYPE *, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, const INDEX_TYPE *, const INDEX_TYPE *, const INDEX_TYPE *, const VALUE_TYPE *, const VALUE_TYPE *, INDEX_TYPE *, VALUE_TYPE *, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE *, INDEX_TYPE *, INDEX_TYPE *, VALUE_TYPE *, void *) [with INDEX_TYPE=IndexType, VALUE_TYPE=uint64_t, MAX_ELEMENTS_BLOCK=2641U, SUPPORT_GLOBAL=false, THREADS=512U, SHIFT=9U, useRowOffsets=true]" 
(1400): here
            instantiation of "void denseSpGEMMNumericImplementation<INDEX_TYPE,VALUE_TYPE,GlobalRowOffsetsMap,SHARED_MEM_SIZE,SUPPORT_GLOBAL,THREADS>(INDEX_TYPE, INDEX_TYPE, const INDEX_TYPE *, const INDEX_TYPE *, const INDEX_TYPE *, const VALUE_TYPE *, const VALUE_TYPE *, GlobalRowOffsetsMap *, INDEX_TYPE, INDEX_TYPE *, VALUE_TYPE *, const INDEX_TYPE *, uint32_t *, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, __nv_bool, uint32_t) [with INDEX_TYPE=IndexType, VALUE_TYPE=uint64_t, GlobalRowOffsetsMap=HashMapNoValue<uint32_t, 1UL>, SHARED_MEM_SIZE=24192U, SUPPORT_GLOBAL=false, THREADS=512U]" 
(1772): here
            instantiation of "void spGEMMNumericLauncher<INDEX_TYPE,VALUE_TYPE,GlobalHashMap,GlobalRowOffsetMap,SHARED_MEM_SIZE,SUPPORT_GLOBAL,THREADS>(INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, INDEX_TYPE, const INDEX_TYPE *, const INDEX_TYPE *, const INDEX_TYPE *, const INDEX_TYPE *, const VALUE_TYPE *, const VALUE_TYPE *, GlobalHashMap *, INDEX_TYPE, GlobalRowOffsetMap *, INDEX_TYPE, INDEX_TYPE *, VALUE_TYPE *, const INDEX_TYPE *, INDEX_TYPE *, const INDEX_TYPE *, Config::SortModes, uint32_t, const INDEX_TYPE *, uint32_t *, uint32_t, __nv_bool, uint32_t) [with INDEX_TYPE=IndexType, VALUE_TYPE=uint64_t, GlobalHashMap=HashMap<uint32_t, uint64_t>, GlobalRowOffsetMap=HashMapNoValue<uint32_t, 1UL>, SHARED_MEM_SIZE=24192U, SUPPORT_GLOBAL=false, THREADS=512U]" 
(2026): here
            instantiation of "void spECKKernels::h_SpGEMMNumericLauncher<INDEX_TYPE,VALUE_TYPE,GlobalHashMap,GlobalRowOffsetMap,SHARED_HASH_SIZE,SUPPORT_GLOBAL,THREADS>(dCSRNoDealloc<VALUE_TYPE>, dCSRNoDealloc<VALUE_TYPE>, dCSRNoDealloc<VALUE_TYPE>, GlobalHashMap *, INDEX_TYPE, GlobalRowOffsetMap *, INDEX_TYPE, INDEX_TYPE *, INDEX_TYPE *, Config::SortModes, uint32_t, const INDEX_TYPE *, INDEX_TYPE *, uint32_t, __nv_bool, uint32_t) [with INDEX_TYPE=IndexType, VALUE_TYPE=uint64_t, GlobalHashMap=HashMap<uint32_t, uint64_t>, GlobalRowOffsetMap=HashMapNoValue<uint32_t, 1UL>, SHARED_HASH_SIZE=24192U, SUPPORT_GLOBAL=false, THREADS=512U]" 
/root/spamm_exp/spECK/source/GPU/Multiply.cu(891): here
            instantiation of "void spECK::MultiplyspECKImplementation<DataType,BLOCKS_PER_SM,THREADS_PER_BLOCK,MAX_DYNAMIC_SHARED,MAX_STATIC_SHARED>(const dCSR<DataType> &, const dCSR<DataType> &, dCSR<DataType> &, spECK::spECKConfig &, Timings &) [with DataType=uint64_t, BLOCKS_PER_SM=4, THREADS_PER_BLOCK=1024, MAX_DYNAMIC_SHARED=98304, MAX_STATIC_SHARED=49152]" 
/root/spamm_exp/spECK/source/GPU/Multiply.cu(1127): here
            instantiation of "void spECK::MultiplyspECK<DataType,BLOCKS_PER_SM,THREADS_PER_BLOCK,MAX_DYNAMIC_SHARED,MAX_STATIC_SHARED>(const dCSR<DataType> &, const dCSR<DataType> &, dCSR<DataType> &, spECK::spECKConfig &, Timings &) [with DataType=uint64_t, BLOCKS_PER_SM=4, THREADS_PER_BLOCK=1024, MAX_DYNAMIC_SHARED=98304, MAX_STATIC_SHARED=49152]" 
/root/spamm_exp/spECK/source/GPU/Multiply.cu(1132): here

Error limit reached.
100 errors detected in the compilation of "/root/spamm_exp/spECK/source/GPU/Multiply.cu".
Compilation terminated.
CMakeFiles/spECKLib.dir/build.make:172: recipe for target 'CMakeFiles/spECKLib.dir/source/GPU/Multiply.cu.o' failed
make[2]: *** [CMakeFiles/spECKLib.dir/source/GPU/Multiply.cu.o] Error 1
CMakeFiles/Makefile2:123: recipe for target 'CMakeFiles/spECKLib.dir/all' failed
make[1]: *** [CMakeFiles/spECKLib.dir/all] Error 2
Makefile:102: recipe for target 'all' failed
make: *** [all] Error 2

So, what seems to be the problem?

dabeschte commented 2 years ago

I disabled uint64_t values as a hotfix for now, since they seem to be causing the problem here. I honestly have no clue why this does not compile on Linux; the code seems correct at first sight. If you actually want to use uint64_t values, please let me know and I will take a look at it. But I guess that most people use it for float/double value multiplications.

Please let me know if compilation works now.
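
Note on the likely cause: CUDA provides a 64-bit unsigned atomicAdd only for unsigned long long int. On 64-bit Linux, uint64_t is normally a typedef for unsigned long, which is a distinct type, so a uint64_t * argument matches no overload. On 64-bit Windows, uint64_t is unsigned long long, which would explain why the same code builds there. The sketch below checks which case applies and shows one common workaround. The helper atomicAddU64 and the two constants are hypothetical names for illustration, not part of spECK, and the device code is only compiled under nvcc.

```cpp
#include <cstdint>
#include <type_traits>

// Which built-in type does uint64_t alias on this platform?
// Typical 64-bit Linux: unsigned long. Typical 64-bit Windows: unsigned long long.
constexpr bool kU64IsUnsignedLong =
    std::is_same<std::uint64_t, unsigned long>::value;
constexpr bool kU64IsUnsignedLongLong =
    std::is_same<std::uint64_t, unsigned long long>::value;

// The pointer cast below is only meaningful if both types have the same width.
static_assert(sizeof(std::uint64_t) == sizeof(unsigned long long),
              "uint64_t and unsigned long long must have the same size");

#ifdef __CUDACC__
// Hypothetical helper (not spECK code): forwards a uint64_t atomic add to
// CUDA's atomicAdd(unsigned long long*, unsigned long long) overload.
__device__ inline std::uint64_t atomicAddU64(std::uint64_t* address,
                                             std::uint64_t value)
{
    return atomicAdd(reinterpret_cast<unsigned long long*>(address),
                     static_cast<unsigned long long>(value));
}
#endif
```

If kU64IsUnsignedLong is true on the build machine, a direct atomicAdd call on a uint64_t * is expected to fail as in the log above, while a call through a wrapper of this kind should compile.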

simple86 commented 2 years ago

Thanks for your reply! I tried just now, but there are still errors from the convert function.

spECK/source/dCSR.cpp:111:15: error: template-id ‘convert<>’ for ‘void convert(dCSR<long unsigned int>&, const CSR<double>&, unsigned int)’ does not match any template declaration
 template void convert(dCSR<uint64_t>& dcsr, const CSR<double>& csr, unsigned int);
spECK/source/dCSR.cpp:115:15: error: template-id ‘convert<>’ for ‘void convert(CSR<long unsigned int>&, const dCSR<double>&, unsigned int)’ does not match any template declaration
 template void convert(CSR<uint64_t>& csr, const dCSR<double>& dcsr, unsigned int padding);
spECK/source/dCSR.cpp:119:15: error: template-id ‘convert<>’ for ‘void convert(dCSR<long unsigned int>&, const dCSR<double>&, unsigned int)’ does not match any template declaration
 template void convert(dCSR<uint64_t>& dcsr, const dCSR<double>& csr, unsigned int);
spECK/source/dCSR.cpp:123:15: error: template-id ‘convert<>’ for ‘void convert(CSR<long unsigned int>&, const CSR<double>&, unsigned int)’ does not match any template declaration
 template void convert(CSR<uint64_t>& csr, const CSR<double>& dcsr, unsigned int padding);

dabeschte commented 2 years ago

I am really surprised that this did not trigger an error on my system. But I think it is solved now. Please let me know.

simple86 commented 2 years ago

I commented out the four explicit instantiations of the convert function shown above, and it builds fine now. Thanks a lot!
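
Note on the convert errors: GCC reports "does not match any template declaration" when an explicit instantiation names argument types that no visible template can produce. The sketch below reproduces that situation under the assumption that convert takes a single value type shared by source and destination. The actual declaration in spECK is not shown in this thread, and the CSR and dCSR structs here are simplified stand-ins.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-ins for spECK's containers (hypothetical layout).
template <typename T> struct CSR  { std::vector<T> data; };
template <typename T> struct dCSR { std::vector<T> data; };

// Assumed shape: one value type T for both source and destination.
template <typename T>
void convert(dCSR<T>& dst, const CSR<T>& src, unsigned int /*padding*/)
{
    dst.data = src.data;
}

// Matches the template: T is double for both parameters.
template void convert(dCSR<double>&, const CSR<double>&, unsigned int);

// Would not match: T cannot be both uint64_t and double at the same time.
// template void convert(dCSR<std::uint64_t>&, const CSR<double>&, unsigned int);
```

Under that assumption, the mixed uint64_t/double instantiations can only compile if a two-type overload of convert is declared, so removing them (as done above) or restoring such an overload would both resolve the error.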