IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
https://arxiv.org/abs/2401.14159
Apache License 2.0
14.85k stars, 1.37k forks

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasLtMatmul( ltHandle, computeDesc.descriptor(), &alpha_val, mat1_ptr, Adesc.descriptor(), mat2_ptr, Bdesc.descriptor(), &beta_val, result_ptr, Cdesc.descriptor(), result_ptr, Cdesc.descriptor(), &heuristicResult.algo, workspace.data_ptr(), workspaceSize, at::cuda::getCurrentCUDAStream())` #341

Closed (jo-dean closed 1 year ago)

jo-dean commented 1 year ago

The error occurs when running grounding_dino_demo.py.

[Screenshot: 2023-07-20 17-20-09]

Using cuda:0 works. After changing to cuda:1, the following error occurs:

Traceback (most recent call last):
  File "grounding_dino_demo.py", line 20, in <module>
    boxes, logits, phrases = predict(
  File "/home//AI/Grounded-Segment-Anything/GroundingDINO/groundingdino/util/inference.py", line 67, in predict
    outputs = model(image[None], captions=[caption])
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//AI/Grounded-Segment-Anything/GroundingDINO/groundingdino/models/GroundingDINO/groundingdino.py", line 313, in forward
    hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//AI/Grounded-Segment-Anything/GroundingDINO/groundingdino/models/GroundingDINO/transformer.py", line 258, in forward
    memory, memory_text = self.encoder(
  File "/home/anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//AI/Grounded-Segment-Anything/GroundingDINO/groundingdino/models/GroundingDINO/transformer.py", line 576, in forward
    output = checkpoint.checkpoint(
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//AI/Grounded-Segment-Anything/GroundingDINO/groundingdino/models/GroundingDINO/transformer.py", line 785, in forward
    src2 = self.self_attn(
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//AI/Grounded-Segment-Anything/GroundingDINO/groundingdino/models/GroundingDINO/ms_deform_attn.py", line 354, in forward
    output = self.output_proj(output)
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//anaconda3/envs/sam_env/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasLtMatmul( ltHandle, computeDesc.descriptor(), &alpha_val, mat1_ptr, Adesc.descriptor(), mat2_ptr, Bdesc.descriptor(), &beta_val, result_ptr, Cdesc.descriptor(), result_ptr, Cdesc.descriptor(), &heuristicResult.algo, workspace.data_ptr(), workspaceSize, at::cuda::getCurrentCUDAStream())`

rentainhe commented 1 year ago

Hello, this is a bug in the Deformable-Attention operator: it cannot be assigned to a specific GPU device with cuda:1. Please try using CUDA_VISIBLE_DEVICES instead.
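
A minimal sketch of the CUDA_VISIBLE_DEVICES workaround, for readers who prefer to set it from inside the script. The helper name `pin_to_physical_gpu` is hypothetical and not part of the repository. The idea is to hide every GPU except the wanted one before CUDA is initialized, so the process sees that GPU as its first (and only) device.

```python
import os


def pin_to_physical_gpu(index: int) -> str:
    """Hypothetical helper: restrict this process to one physical GPU.

    This must run before torch (or any other library) initializes CUDA.
    Afterwards the chosen GPU is the only visible device, so it is
    addressed as cuda:0 inside the process.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return "cuda:0"


# Physical GPU 1 becomes the only device this process can see.
device = pin_to_physical_gpu(1)

# Import torch and the demo code only after this point, then pass `device`
# wherever the script previously used "cuda:1".
```

The same effect is available without touching the code by launching the demo as `CUDA_VISIBLE_DEVICES=1 python grounding_dino_demo.py` and leaving the device string at its default.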

jo-dean commented 1 year ago

Thanks, it's OK now.

rentainhe commented 1 year ago

> Thanks, it's OK now.

You're welcome. The problem lies in the custom operator, so you may want to contact its author to get it fixed.

I'm going to close this issue now; feel free to reopen it if necessary.