[Open] otoTree opened this issue 3 weeks ago
Thank you for reaching out. Please first check the dtype and shape of `q`, `k`, and `v`, and ensure that `triton>=2.3.0` and `torch>=2.3.0` are installed.
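The version floor mentioned above can be checked programmatically. Below is a minimal, dependency-free sketch; the function name and the three-component comparison are my own, while the `>=2.3.0` minimums come from the reply above. String comparison (`"2.10.0" < "2.3.0"`) gets this wrong, hence the numeric parse.

```python
def meets_minimum(version: str, minimum: str = "2.3.0") -> bool:
    """Numerically compare dotted version strings, so '2.10.0' > '2.3.0'.

    Local suffixes such as '2.3.0+cu121' (common in torch builds) are ignored.
    """
    def parse(v: str):
        return tuple(int(p) for p in v.split("+")[0].split(".")[:3])
    return parse(version) >= parse(minimum)

# In practice you would pass torch.__version__ and triton.__version__ here.
```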
same problem
@windring what is your platform, torch version and triton version?
@jason-huang03 torch, and I found that my model is fp32
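The thread suggests fp32 inputs are what trip the int8 kernel, so an fp32 model needs its `q`/`k`/`v` cast to half precision before the attention call. A hedged sketch follows; the helper, the tensor shapes, and the call site are illustrative, not taken from SageAttention itself.

```python
import torch

def cast_qkv_to_half(*tensors):
    """Cast any fp32 tensors to fp16, leaving fp16/bf16 tensors untouched."""
    return tuple(t.half() if t.dtype == torch.float32 else t for t in tensors)

# Illustrative shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 128, 64)   # float32 by default
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)
q, k, v = cast_qkv_to_half(q, k, v)
# q.dtype is now torch.float16; sageattn(q, k, v, ...) would follow here
```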
Hi, can you try the latest code?
On Tue, Nov 12, 2024 at 6:56 PM gomoku wrote:

> @jason-huang03 I'm sorry to bother you again. When I fix the dtype, it tells me `PassManager::run failed` again:
>
> ```
> .conda/lib/python3.10/site-packages/sageattention/attn_qk_int8_per_block_hd64_causal.py:98:63: error: mismatching kWidth between A and B operands
> ```
>
> Screenshots: https://github.com/user-attachments/assets/c4163914-e8db-41f2-a7e5-2a055a850212 and https://github.com/user-attachments/assets/d9bbe52e-82cc-411a-a8c5-c7f46cbcd885
Error:

```
Traceback (most recent call last):
  File "/root/musc/examples/musc_main_clone.py", line 90, in <module>
    model.main()
  File "/root/musc/models/musc_clone.py", line 237, in main
    self.make_category_data(category="category", )
  File "/root/musc/models/musc_clone.py", line 142, in make_category_data
    patch_tokens = self.dino_model.get_intermediate_layers(x=input_image,
  File "/root/musc/./models/backbone/dinov2/models/vision_transformer.py", line 311, in get_intermediate_layers
    outputs = self._get_intermediate_layers_not_chunked(x, n)
  File "/root/musc/./models/backbone/dinov2/models/vision_transformer.py", line 280, in _get_intermediate_layers_not_chunked
    x = blk(x)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/musc/./models/backbone/dinov2/layers/block.py", line 254, in forward
    return super().forward(x_or_x_list)
  File "/root/musc/./models/backbone/dinov2/layers/block.py", line 112, in forward
    x = x + attn_residual_func(x)
  File "/root/musc/./models/backbone/dinov2/layers/block.py", line 91, in attn_residual_func
    return self.ls1(self.attn(self.norm1(x)))
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/musc/./models/backbone/dinov2/layers/attention.py", line 79, in forward
    return super().forward(x)
  File "/root/musc/./models/backbone/dinov2/layers/attention.py", line 63, in forward
    attn = sageattn(q, k, v, is_causal=False, smooth_k=True)
  File "/root/miniconda3/lib/python3.10/site-packages/sageattention/core.py", line 45, in sageattn
    o = attn_h64_false(q_int8, k_int8, v, q_scale, k_scale)
  File "/root/miniconda3/lib/python3.10/site-packages/sageattention/attn_qk_int8_per_block_h64.py", line 97, in forward
    _attn_fwd[grid](
  File "<string>", line 63, in _attn_fwd
  File "/root/miniconda3/lib/python3.10/site-packages/triton/compiler/compiler.py", line 476, in compile
    next_module = compile_kernel(module)
  File "/root/miniconda3/lib/python3.10/site-packages/triton/compiler/compiler.py", line 383, in <lambda>
    lambda src: optimize_ttgir(ttir_to_ttgir(src, num_warps), num_stages, arch))
  File "/root/miniconda3/lib/python3.10/site-packages/triton/compiler/compiler.py", line 91, in optimize_ttgir
    pm.run(mod)
RuntimeError: PassManager::run failed
```
GPU: RTX 3090

```
root@autodl-container-12f34dabc5-6cbd4932:~/musc# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
```