comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

CUDA error on ZLUDA: CUBLAS_STATUS_NOT_SUPPORTED when calling 'cublasSgemm()' #4132

Open avachon100510 opened 1 month ago

avachon100510 commented 1 month ago

Expected Behavior

When I run ComfyUI with CUDA through ZLUDA, the prompt should execute normally. I am using an AMD Radeon Vega 8 Graphics GPU with an AMD Ryzen 5 3500U CPU. It should just work... if it weren't for...

Actual Behavior

...this. Queuing a prompt fails with the exception below; the full console output and traceback are under Debug Logs.

RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

This is a CUDA error: cuBLAS returns CUBLAS_STATUS_NOT_SUPPORTED from cublasSgemm, which means the library does not support the requested operation. So why is this happening?

I am using Python 3.10.11, with PyTorch 2.0.0+cu118 and ZLUDA. And yes, I did apply the --disable-all-custom-nodes flag, to no avail.
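
The traceback in this report ends in torch.nn.functional.linear, so one way to check whether the failure depends on ComfyUI at all is to run a single float32 linear op on the GPU by itself. The following is only a minimal sketch, not part of ComfyUI: the helper name try_linear and the tensor shapes are hypothetical, and it falls back to the CPU when no CUDA device is visible.

```python
try:
    import torch
except ImportError:  # report a missing install instead of crashing
    torch = None


def try_linear(device=None, dtype=None):
    """Run one small linear op and report (succeeded, details).

    Hypothetical diagnostic helper; the shapes are arbitrary.
    """
    if torch is None:
        return False, "torch is not installed"
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = dtype or torch.float32
    x = torch.randn(4, 8, dtype=dtype, device=device)
    weight = torch.randn(16, 8, dtype=dtype, device=device)
    bias = torch.randn(16, dtype=dtype, device=device)
    try:
        out = torch.nn.functional.linear(x, weight, bias)
        if str(device).startswith("cuda"):
            # CUDA work is asynchronous; synchronize so errors surface here.
            torch.cuda.synchronize()
        return True, tuple(out.shape)
    except RuntimeError as exc:
        return False, str(exc)


if __name__ == "__main__":
    print(try_linear())
```

If this small call fails with the same CUBLAS_STATUS_NOT_SUPPORTED message, the problem likely sits in the PyTorch/ZLUDA layer rather than in ComfyUI or a particular model.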

Steps to Reproduce

I assume this issue is specific to my setup, but here is how it happened: select a model, enter the prompts, tweak the settings, and click 'Queue Prompt'. After a few seconds, the error occurs.

Debug Logs

FETCH DATA from: C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json [DONE]
got prompt
model_type EPS
Using split attention in VAE
Using split attention in VAE
loaded straight to GPU
Requested to load BaseModel
Loading 1 new model
Requested to load SD1ClipModel
Loading 1 new model
!!! Exception during processing!!! CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
Traceback (most recent call last):
  File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "C:\ComfyUI_windows_portable\ComfyUI\nodes.py", line 58, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 115, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 567, in encode_token_weights
    out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 228, in encode
    return self(tokens)
  File "C:\Users\taren\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 200, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state)
  File "C:\Users\taren\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\clip_model.py", line 134, in forward
    x = self.text_model(*args, **kwargs)
  File "C:\Users\taren\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\clip_model.py", line 109, in forward
    x, i = self.encoder(x, mask=mask, intermediate_output=intermediate_output)
  File "C:\Users\taren\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\clip_model.py", line 68, in forward
    x = l(x, mask, optimized_attention)
  File "C:\Users\taren\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\clip_model.py", line 49, in forward
    x += self.self_attn(self.layer_norm1(x), mask, optimized_attention)
  File "C:\Users\taren\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\clip_model.py", line 16, in forward
    q = self.q_proj(x)
  File "C:\Users\taren\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 50, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "C:\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 46, in forward_comfy_cast_weights
    return torch.nn.functional.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

Other

No response

robinjhuang commented 1 month ago

Which version of CUDA do you have?

import torch
print(torch.version.cuda)
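
A slightly fuller version of the check above can be handy when reporting a setup like this one. This is a sketch only: describe_torch is a hypothetical helper name, and it relies on nothing beyond torch.__version__, torch.version.cuda, torch.cuda.is_available() and torch.cuda.get_device_name().

```python
try:
    import torch
except ImportError:  # still produce a report if torch is missing
    torch = None


def describe_torch():
    """Collect basic PyTorch/CUDA details as a dict (hypothetical helper)."""
    if torch is None:
        return {"torch": None}
    info = {
        "torch": torch.__version__,
        # CUDA version this PyTorch build was compiled against (None on CPU builds)
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```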

robinjhuang commented 1 month ago

Okay, never mind, you said you are using CUDA 11.8. Did you change anything about your setup recently?

Try reinstalling PyTorch. If you want to stay on PyTorch 2.0.0, try this:

pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118

Otherwise, try upgrading PyTorch to the latest stable release:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

yoopyman commented 1 month ago

Try to set the env variable: DISABLE_ADDMM_CUDA_LT=1
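
One way to apply this from Python is to set the variable before torch is first imported, for example at the top of a small launcher script. This is a minimal sketch under the assumption that PyTorch reads the variable from the process environment; child_env is a hypothetical helper, and setting the variable in the shell before starting ComfyUI is the more conservative route.

```python
import os

# Assumption: the variable has to be present before torch initializes its
# CUDA BLAS path, so set it ahead of the first `import torch`.
os.environ["DISABLE_ADDMM_CUDA_LT"] = "1"


def child_env(extra=None):
    """Return a copy of the current environment for a subprocess launch.

    Hypothetical helper; `extra` is an optional dict of additional variables.
    """
    env = dict(os.environ)
    env.update(extra or {})
    return env


if __name__ == "__main__":
    # The copy can be passed as `env=` to subprocess.run when starting ComfyUI.
    print(child_env()["DISABLE_ADDMM_CUDA_LT"])
```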

Tanglinling commented 1 month ago

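Over the past two days I have tried "segment anything" nodes from several different authors, but without exception they raise "torch.cuda.OutOfMemoryError: Allocation on device" in any workflow that is even slightly complex. When I use these nodes on their own, they often work fine. I am not sure whether the problem I am running into is similar to this error.

Allocation on device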

  File "D:\ComfyUI-aki-v1.3\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "D:\ComfyUI-aki-v1.3\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "D:\ComfyUI-aki-v1.3\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "D:\ComfyUI-aki-v1.3\custom_nodes\comfyui_segment_anything\node.py", line 317, in main
    boxes = groundingdino_predict(
  File "D:\ComfyUI-aki-v1.3\custom_nodes\comfyui_segment_anything\node.py", line 182, in groundingdino_predict
    boxes_filt = get_grounding_output(
  File "D:\ComfyUI-aki-v1.3\custom_nodes\comfyui_segment_anything\node.py", line 170, in get_grounding_output
    outputs = model(image[None], captions=[caption])
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\custom_nodes\ComfyUI_LayerStyle\py\local_groundingdino\models\GroundingDINO\groundingdino.py", line 303, in forward
    hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\custom_nodes\ComfyUI_LayerStyle\py\local_groundingdino\models\GroundingDINO\transformer.py", line 258, in forward
    memory, memory_text = self.encoder(
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\custom_nodes\ComfyUI_LayerStyle\py\local_groundingdino\models\GroundingDINO\transformer.py", line 576, in forward
    output = checkpoint.checkpoint(
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\_dynamo\eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\_dynamo\external_utils.py", line 36, in inner
    return fn(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\utils\checkpoint.py", line 487, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\autograd\function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\utils\checkpoint.py", line 262, in forward
    outputs = run_function(*args)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\custom_nodes\ComfyUI_LayerStyle\py\local_groundingdino\models\GroundingDINO\transformer.py", line 785, in forward
    src2 = self.self_attn(
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki-v1.3\custom_nodes\ComfyUI_LayerStyle\py\local_groundingdino\models\GroundingDINO\ms_deform_attn.py", line 271, in forward
    output = multi_scale_deformable_attn_pytorch(
  File "D:\ComfyUI-aki-v1.3\custom_nodes\ComfyUI_LayerStyle\py\local_groundingdino\models\GroundingDINO\ms_deform_attn.py", line 70, in multi_scale_deformable_attn_pytorch
    (torch.stack(sampling_value_list, dim=-2).flatten(-2) * attention_weights)