Open chenfucn opened 2 years ago
From the official documentation:
The library is thread safe and its functions can be called from multiple host threads, even with the same handle. When multiple threads share the same handle, extreme care needs to be taken when the handle configuration is changed because that change will affect potentially subsequent cuBLAS calls in all threads. It is even more true for the destruction of the handle. So it is not recommended that multiple thread share the same cuBLAS handle.
Thread safety is only guaranteed when threads don't share a handle.
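To illustrate the point about shared handle configuration, here is a minimal sketch. It does not use cuBLAS: `FakeHandle`, `SharedHandleWrapper`, `gemm_on_stream` and `count_mismatches` are hypothetical names made up for this example, and `stream_id` only stands in for per-handle state such as the stream set with `cublasSetStream`. It is not taken from FasterTransformer. The idea is that changing the configuration and issuing the call happen under one lock, so another thread cannot change the handle in between.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a library handle with mutable per-handle state.
struct FakeHandle {
    int stream_id = 0;  // plays the role of the stream bound to a real handle
};

// Hypothetical wrapper that shares one handle between threads.
class SharedHandleWrapper {
public:
    // Changes the handle configuration and runs the "GEMM" under a single
    // lock, so no other thread can reconfigure the handle in between.
    int gemm_on_stream(int stream_id) {
        std::lock_guard<std::mutex> lock(mu_);
        handle_.stream_id = stream_id;
        return fake_gemm();
    }

private:
    // Pretend GEMM: reports which stream the handle was using when it ran.
    int fake_gemm() const { return handle_.stream_id; }

    FakeHandle handle_;
    std::mutex mu_;
};

// Starts n_threads threads that each call the wrapper iters times with their
// own stream id. Returns how many calls ran on a stream other than the one
// the calling thread had just set.
int count_mismatches(int n_threads, int iters) {
    SharedHandleWrapper wrapper;
    std::vector<int> mismatches(n_threads, 0);
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t) {
        threads.emplace_back([&wrapper, &mismatches, t, iters] {
            for (int i = 0; i < iters; ++i) {
                if (wrapper.gemm_on_stream(t) != t) {
                    ++mismatches[t];  // each thread writes only its own slot
                }
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    int total = 0;
    for (int m : mismatches) {
        total += m;
    }
    return total;
}
```

With the lock in place, `count_mismatches(4, 1000)` returns 0. Without it, the write to `stream_id` and the read in `fake_gemm` would race across threads. The alternative suggested by the documentation is to give each thread its own handle, which avoids the shared state altogether.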
Thanks! But why are some of the cuSPARSELt calls protected while others are not?
All GEMMs are protected by mutexes except the one at https://github.com/NVIDIA/FasterTransformer/blob/main/src/fastertransformer/utils/cublasMMWrapper.cc#L358, which is not currently used. We will fix it ASAP.
It seems to me that all cuBLAS and cuBLASLt GEMM and matmul operations are protected by a mutex. The cuBLAS documentation says the library is thread safe. Why is it necessary to protect these operations with a mutex?