Open · synbol opened this issue 3 weeks ago
```python
self.attention = RWKV6Attention(hidden_size=config.dim, num_heads=config.n_head, layer_idx=layer_id)
```
Hi, can you provide some runnable code for reproduction? It works normally for me when running this:

```bash
python benchmark_training_throughput.py --name rwkv6
```
Hi, thanks for reporting it. What is your GPU model?
Describe the bug
Error message:

```
python: /project/lib/Analysis/Allocation.cpp:47: std::pair<llvm::SmallVector, llvm::SmallVector > mlir::triton::getCvtOrder(mlir::Attribute, mlir::Attribute): Assertion `!(srcMmaLayout && dstMmaLayout && !srcMmaLayout.isAmpere()) && "mma -> mma layout conversion is only supported on Ampere"' failed.
```
Steps to reproduce the bug
Calling process:

```python
from fla.layers.rwkv6 import RWKV6Attention

self.attention = RWKV6Attention(hidden_size=config.dim, num_heads=config.n_head)
o, _, past_key_values = self.attention(self.attention_norm(x), attention_mask=mask, past_key_values=past_key_values)
```
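For a standalone run outside the surrounding module, a minimal sketch along the lines of the snippet above might look like the following. The hidden size, head count, batch shape, bfloat16 dtype, and `layer_idx=0` are assumptions for illustration, not values from the original report:

```python
import torch
from fla.layers.rwkv6 import RWKV6Attention

hidden_size, num_heads = 512, 4  # assumed values, not from the report
attn = RWKV6Attention(hidden_size=hidden_size, num_heads=num_heads, layer_idx=0)
attn = attn.cuda().to(torch.bfloat16)

# assumed batch of 2 sequences of length 64, with an all-ones (no padding) mask
x = torch.randn(2, 64, hidden_size, device="cuda", dtype=torch.bfloat16)
mask = torch.ones(2, 64, device="cuda", dtype=torch.bool)

# forward call mirrors the one shown in the report above
o, _, past_key_values = attn(x, attention_mask=mask, past_key_values=None)
print(o.shape)
```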
Expected behavior
None
Environment info
- torch: 2.4.1
- triton: 3.0.0
- einops: 0.8.0
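Since the assertion says the mma -> mma layout conversion is only supported on Ampere and the GPU model is not listed above, it may help to include it; a quick way to print it (plain PyTorch, nothing specific to this repo):

```python
import torch

# Print the GPU model and its CUDA compute capability;
# Ampere-generation GPUs report a major version of 8.
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_capability(0))
```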