NVIDIA / CUDALibrarySamples

CUDA Library Samples

compile error cusparselt #186

Closed yazdanbakhsh closed 2 months ago

yazdanbakhsh commented 3 months ago

Cuda compilation tools, release 12.4, V12.4.131
20.04.1-Ubuntu
cuSPARSELt 0.6.1
NVIDIA A100-SXM4-40GB
Driver Version: 550.54.15

I am trying to compile the code in matmul, but keep getting the following error:

matmul_example.cpp: In function ‘int main()’:
matmul_example.cpp:308:67: error: ambiguous overload for ‘operator*’ (operand types are ‘float’ and ‘__half’)
  308 |             hC_result[posC] = static_cast<C_t>(alpha * sum + beta * hC[posC]);  // [i][j]
      |                                                              ~~~~ ^ ~~~~~~~~
      |                                                              |             |
      |                                                              float         __half
matmul_example.cpp:308:67: note: candidate: ‘operator*(float, int)’ <built-in>
  308 |             hC_result[posC] = static_cast<C_t>(alpha * sum + beta * hC[posC]);  // [i][j]
      |                                                              ~~~~~^~~~~~~~~~
matmul_example.cpp:308:67: note: candidate: ‘operator*(float, long long unsigned int)’ <built-in>
matmul_example.cpp:308:67: note: candidate: ‘operator*(float, long long int)’ <built-in>
matmul_example.cpp:308:67: note: candidate: ‘operator*(float, long unsigned int)’ <built-in>
matmul_example.cpp:308:67: note: candidate: ‘operator*(float, long int)’ <built-in>
matmul_example.cpp:308:67: note: candidate: ‘operator*(float, unsigned int)’ <built-in>
matmul_example.cpp:308:67: note: candidate: ‘operator*(float, float)’ <built-in>
In file included from /usr/local/cuda/include/cuda_fp16.h:4813,
                 from /usr/include/cusparse.h:53,
                 from /usr/include/cusparseLt.h:13,
                 from matmul_example.cpp:12:
/usr/local/cuda/include/cuda_fp16.hpp:220:54: note: candidate: ‘__half operator*(const __half&, const __half&)’
  220 | __CUDA_HOSTDEVICE__ __CUDA_FP16_FORCEINLINE__ __half operator*(const __half &lh, const __half &rh) { return __hmul(lh, rh); }
      |                                                      ^~~~~~~~
make: *** [Makefile:26: matmul_example] Error 1

yazdanbakhsh commented 3 months ago

For now, I removed the host comparison so I can run the CUDA part. Only the static build works, though. Why is that the case?

./matmul_example -> ERROR

 ** On entry to cusparseLtMatmulDescriptorInit(): matrix type/compute type combination is not supported, current: IN=CUDA_R_16BF, OUT=CUDA_R_16BF, COMPUTE=COMPUTE_TF32

CUSPARSE API failed at line 206 with error: operation not supported (10)

./matmul_example_static -> WORKED (still removing host comparison part)

fbusato commented 3 months ago

@yazdanbakhsh could you please share the command line that you used?

j4yan commented 3 months ago

@yazdanbakhsh I was able to reproduce the compile error, and we will push a fix soon. Thanks for catching it. The fix is simply to change line 308 to:

hC_result[posC] = static_cast<C_t>(alpha * sum + beta * static_cast<float>(hC[posC])); // [i][j]
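For anyone wondering why the cast is needed, here is a minimal sketch of the ambiguity that compiles without CUDA. FakeHalf and blend are hypothetical names made up for illustration. FakeHalf is only a simplified stand-in for __half, assuming it is implicitly convertible both to and from float. With conversions available in both directions, the compiler cannot choose between the built-in float multiplication and the user-defined operator*, which looks like the same situation the error message above describes.

```cpp
#include <cassert>

// Hypothetical stand-in for __half (NOT the real type): it stores a float,
// but offers implicit conversions in both directions, which is the
// assumption this sketch relies on.
struct FakeHalf {
    float v;
    FakeHalf(float f) : v(f) {}            // implicit float -> FakeHalf
    operator float() const { return v; }   // implicit FakeHalf -> float
};

// User-defined multiply, analogous to __half operator*(const __half&, const __half&).
FakeHalf operator*(const FakeHalf& a, const FakeHalf& b) {
    return FakeHalf(a.v * b.v);
}

// Hypothetical helper mirroring the expression on line 308.
float blend(float alpha, float sum, float beta, FakeHalf c) {
    // return alpha * sum + beta * c;
    //   ^ ambiguous: the compiler could convert c to float and use the
    //     built-in operator*, or convert beta to FakeHalf and use the
    //     overload above. Neither choice is better than the other.
    return alpha * sum + beta * static_cast<float>(c);  // explicit cast picks float * float
}
```

Calling blend(2.0f, 3.0f, 0.5f, FakeHalf(4.0f)) evaluates 2 * 3 + 0.5 * 4 entirely in float. Uncommenting the first return statement should reproduce an "ambiguous overload" diagnostic similar to the one in the report.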

Regarding ./matmul_example -> ERROR, could you set the environment variable CUSPARSELT_LOG_LEVEL=5 and rerun it?