OpenBMB / BMInf

Efficient Inference for Big Models
Apache License 2.0

[BUG] RuntimeError: cublas error: CUBLAS_STATUS_NOT_SUPPORTED #52

Open cuishibin opened 2 years ago

cuishibin commented 2 years ago

Describe the bug
Using the Docker environment, all three demos report the following error in the background:

File "/usr/local/lib/python3.6/dist-packages/bminf/arch/t5/model.py", line 238, in encode
    True
File "/usr/local/lib/python3.6/dist-packages/bminf/layers/transformer_block.py", line 42, in forward
    x = self.self_attention.forward(allocator, x, attention_mask, self_attn_position_bias)
File "/usr/local/lib/python3.6/dist-packages/bminf/layers/attention.py", line 63, in forward
    qkv_i32
File "/usr/local/lib/python3.6/dist-packages/bminf/functions/gemm.py", line 86, in igemm
    _igemm(allocator, a, aT, b, bT, c, device, stream)
File "/usr/local/lib/python3.6/dist-packages/bminf/functions/gemm.py", line 265, in _igemm
    stream.ptr
File "/usr/local/lib/python3.6/dist-packages/bminf/backend/cublaslt.py", line 101, in checkCublasStatus
    raise RuntimeError("cublas error: %s" % cublas_errors[cublas_status])
RuntimeError: cublas error: CUBLAS_STATUS_NOT_SUPPORTED

What is the cause of this? Is a particular version at fault?

Environment: CUDA 10.1, model: EVA-int8, GPU memory: 12 GB
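
For reference, CUBLAS_STATUS_NOT_SUPPORTED from the int8 GEMM path usually means the GPU or the CUDA/cuBLAS combination cannot run the requested cublasLt operation. Below is a minimal diagnostic sketch, not part of BMInf, that only queries the device; it assumes PyTorch happens to be available in the same container and is used solely for the lookup:

```python
# Diagnostic sketch (assumption: PyTorch is installed in the container; it is used
# here only to query the device). CUBLAS_STATUS_NOT_SUPPORTED from an int8 igemm
# typically points to a GPU/CUDA combination that cannot run the requested
# cublasLt operation, e.g. a GPU below compute capability 6.1 (no DP4A int8
# support) or a cuBLAS build missing the needed cublasLt routine.
import torch

name = torch.cuda.get_device_name(0)
major, minor = torch.cuda.get_device_capability(0)
print(f"GPU: {name}, compute capability {major}.{minor}")
print(f"CUDA runtime seen by PyTorch: {torch.version.cuda}")

if (major, minor) < (6, 1):
    print("This GPU lacks int8 (DP4A) GEMM support; int8 models such as "
          "EVA-int8 would be expected to fail inside cublasLt.")
```

If the compute capability is 6.1 or higher, the mismatch is more likely between the CUDA 10.1 toolkit inside the image and the driver/cuBLAS version on the host, which would be worth checking next.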