CUDA error during inference

blumenstiel commented 8 months ago

I tried to run the demo code but get a CUDA error from the following call:

streamer = chat.stream_answer(conv=chat_state,
                              img_list=img_list,
                              temperature=temperature,
                              max_new_tokens=500,
                              max_length=2000)

This is the error:

...
  File "venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/transformers/models/clip/modeling_clip.py", line 385, in forward
    hidden_states, attn_weights = self.self_attn(
                                  ^^^^^^^^^^^^^^^
  File "venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "venv/lib/python3.11/site-packages/transformers/models/clip/modeling_clip.py", line 324, in forward
    attn_output = torch.bmm(attn_probs, value_states)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmStridedBatchedExFix( handle, opa, opb, m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, (void*)(&fbeta), c, CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`

I assume that this error is related to the CUDA and/or torch version. These are the relevant packages and versions I installed (torch 2.0.1 with CUDA 11.7):

nvidia-cublas-cu11        11.10.3.66
nvidia-cuda-cupti-cu11    11.7.101
nvidia-cuda-nvrtc-cu11    11.7.99
nvidia-cuda-runtime-cu11  11.7.99
nvidia-cudnn-cu11         8.5.0.96
nvidia-cufft-cu11         10.9.0.58
nvidia-curand-cu11        10.2.10.91
nvidia-cusolver-cu11      11.4.0.1
nvidia-cusparse-cu11      11.7.4.91
nvidia-nccl-cu11          2.14.3
nvidia-nvtx-cu11          11.7.91
torch                     2.0.1
torchvision               0.15.2
transformers              4.31.0

Can you share the versions you are using? I tested three different versions but always got errors.
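
For anyone comparing environments, a version list like the one above could be collected with a short script. This is only a sketch: the package names are copied from the list in this post, and the helper name `installed_versions` is made up for illustration.

```python
from importlib import metadata

# Distribution names taken from the list in this issue; adjust as needed.
PACKAGES = [
    "nvidia-cublas-cu11",
    "nvidia-cuda-runtime-cu11",
    "nvidia-cudnn-cu11",
    "torch",
    "torchvision",
    "transformers",
]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<26}{version or 'not installed'}")
```

Running this in both environments and diffing the output should show quickly whether any package version actually differs.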

KjAeRsTuIsK commented 8 months ago

nvidia-cublas-cu11        11.10.3.66              
nvidia-cuda-cupti-cu11    11.7.101                 
nvidia-cuda-nvrtc-cu11    11.7.99                  
nvidia-cuda-runtime-cu11  11.7.99               
nvidia-cudnn-cu11         8.5.0.96               
nvidia-cufft-cu11         10.9.0.58         
nvidia-curand-cu11        10.2.10.91         
nvidia-cusolver-cu11      11.4.0.1              
nvidia-cusparse-cu11      11.7.4.91            
nvidia-nccl-cu11          2.14.3         
nvidia-nvtx-cu11          11.7.91         
torch                     2.0.1              
torchvision               0.15.2   
transformers              4.31.0

Hi @blumenstiel, these are the versions it runs with on our system, and they are identical to yours. This error can also occur during inference because of a dimension mismatch. Could you please check that?
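
One way to check for a dimension mismatch, sketched here under assumptions: the failing call in the traceback is `torch.bmm(attn_probs, value_states)`, which expects two 3-D tensors of shape (B, N, M) and (B, M, P). The helper `check_bmm_shapes` below is hypothetical, not part of GeoChat or transformers, and works on plain shape tuples so it can be tried without a GPU.

```python
def check_bmm_shapes(a_shape, b_shape):
    """Validate shapes for a batched matmul (B, N, M) @ (B, M, P).

    Returns the expected output shape (B, N, P), or raises ValueError
    with a message describing the mismatch.
    """
    if len(a_shape) != 3 or len(b_shape) != 3:
        raise ValueError(f"expected 3-D shapes, got {a_shape} and {b_shape}")
    if a_shape[0] != b_shape[0]:
        raise ValueError(f"batch sizes differ: {a_shape[0]} vs {b_shape[0]}")
    if a_shape[2] != b_shape[1]:
        raise ValueError(f"inner dimensions differ: {a_shape[2]} vs {b_shape[1]}")
    return (a_shape[0], a_shape[1], b_shape[2])


# The shapes below are made-up examples, not values from the failing run.
print(check_bmm_shapes((12, 257, 257), (12, 257, 64)))
```

In a debugging session you could call it as `check_bmm_shapes(tuple(attn_probs.shape), tuple(value_states.shape))` just before the failing line. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before launching also makes CUDA kernel launches synchronous, which can help the traceback point at the operation that actually failed.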

wybert commented 6 months ago

I had the same issue. Once every package is at the correct version, it works well.

mbzuai-oryx / GeoChat
