Open songxujay opened 5 months ago
`headdim` defaults to 64, which means the SSM dimension needs at least 64 dimensions to form a single head and must be divisible by `headdim`. Either decrease the number of dimensions per head by passing a smaller `headdim`, or increase the dimensionality of your model.
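To make the divisibility requirement concrete, here is a minimal sketch in plain Python. It assumes the inner SSM width is `expand * d_model` with `expand=2`; that default is my assumption, not something stated in this thread. The helper names `ssm_width`, `headdim_ok`, and `valid_headdims` are hypothetical and are not part of `mamba_ssm`. The sketch only mirrors the `d_ssm % headdim == 0` assertion; the kernels may impose further constraints that it does not check.

```python
def ssm_width(d_model: int, expand: int = 2) -> int:
    """Assumed inner width: expand * d_model (an assumption, see the note above)."""
    return expand * d_model


def headdim_ok(d_model: int, headdim: int = 64, expand: int = 2) -> bool:
    """Mirror of the failing assertion: the width must split into whole heads."""
    return ssm_width(d_model, expand) % headdim == 0


def valid_headdims(d_model: int, expand: int = 2) -> list:
    """Every head size that divides the assumed inner width exactly."""
    width = ssm_width(d_model, expand)
    return [h for h in range(1, width + 1) if width % h == 0]


print(headdim_ok(16))              # False: 32 % 64 != 0
print(headdim_ok(16, headdim=16))  # True: 32 % 16 == 0
print(headdim_ok(32))              # True: 64 % 64 == 0
print(valid_headdims(16))          # [1, 2, 4, 8, 16, 32]
```

Under these assumptions, a model with `d_model=16` either needs a `headdim` taken from the printed list or a `d_model` of at least 32.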
Thanks, all. After making these changes I hit a new error:
'''
causal_conv1d_cuda.causal_conv1d_fwd(rearrange(xBC, "b s d -> b d s"),
TypeError: causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:
'''
Look at #257
After decreasing `headdim`, I also encountered another issue. Here is the error message I received:
File /data/yzeng58/anaconda3/envs/mamba2/lib/python3.10/site-packages/mamba_ssm/ops/triton/ssd_combined.py:761, in MambaSplitConv1dScanCombinedFn.forward(ctx, zxbcdt, conv1d_weight, conv1d_bias, dt_bias, A, D, chunk_size, initial_states, seq_idx, dt_limit, return_final_states, activation, rmsnorm_weight, rmsnorm_eps, outproj_weight, outproj_bias, headdim, ngroups, norm_before_gate)
    758 zx0, z, xBC, dt = torch.split(zxbcdt, [2 * d_nonssm, dim, dim + ngroups * dstate * 2, nheads], dim=-1)
    759 seq_idx = seq_idx.contiguous() if seq_idx is not None else None
    760 xBC_conv = rearrange(
--> 761     causal_conv1d_cuda.causal_conv1d_fwd(rearrange(xBC, "b s d -> b d s"),
    762     conv1d_weight, conv1d_bias, seq_idx, None, None, activation in ["silu", "swish"]),
    763     "b d s -> b s d"
    764 )
    765 x, B, C = torch.split(xBC_conv, [dim, ngroups * dstate, ngroups * dstate], dim=-1)
    766 x = rearrange(x, "b l (h p) -> b l h p", h=nheads)
AttributeError: 'NoneType' object has no attribute 'causal_conv1d_fwd'
I would try uninstalling everything and then rebuilding mamba. Make sure that your GPU's compute capability is added to setup.py. Take a look at #257.
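Before rebuilding, it may help to confirm two things: whether the compiled `causal_conv1d_cuda` extension imports at all, and which compute capability the GPU reports. The sketch below is only a diagnostic aid. The helpers `can_import` and `gpu_arch` are hypothetical names, and reading the `'NoneType'` error as a failed optional import is my interpretation, not something confirmed in this thread. `torch` is loaded first on the assumption that the extension links against PyTorch's shared libraries.

```python
import importlib


def can_import(module_name: str, preload=()) -> bool:
    """Return True if every preload module and then module_name import cleanly."""
    try:
        for name in (*preload, module_name):
            importlib.import_module(name)
        return True
    except Exception:
        # A missing or broken compiled extension ends up here.
        return False


def gpu_arch():
    """Return a string such as 'sm_80' for the current GPU, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability()
    return f"sm_{major}{minor}"


print("causal_conv1d_cuda importable:", can_import("causal_conv1d_cuda", preload=("torch",)))
print("GPU architecture:", gpu_arch())
```

If the first line prints `False`, the extension was probably not built or not built for this environment. The second line gives the architecture string to compare against the ones listed in setup.py.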
Have you solved this problem? It doesn't happen when I use Mamba, but when I use Mamba2 I get: TypeError: causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:
- (arg0: at::Tensor, arg1: at::Tensor, arg2: Optional[at::Tensor], arg3: bool) -> at::Tensor

My causal_conv1d version is 1.0.0 and my mamba-ssm version is 1.0.1.
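The supported signature listed in the error takes four arguments, while the `causal_conv1d_fwd` call in the traceback quoted in this thread passes seven. That kind of mismatch suggests the installed `causal_conv1d` build and the `mamba_ssm` code calling it come from incompatible releases. A small sketch for reporting what is actually installed follows. The distribution names `causal-conv1d` and `mamba-ssm` are assumed to match the names used at install time, and `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them if the packages were installed differently.
for dist in ("causal-conv1d", "mamba-ssm"):
    print(dist, installed_version(dist) or "not installed")
```

Including this output when reporting the error makes it easier for others to tell whether the two packages were built to work together.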
Hi, I hit the same error when using Mamba2. Have you solved it?
Has anyone run into this problem before? Thank you!
'''
import torch
from mamba_ssm.modules import Mamba2

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")
model = Mamba2(
    # This module uses roughly 3 * expand * d_model^2 parameters
    d_model=dim,  # the remaining keyword arguments were not preserved in this post
).to("cuda")
y = model(x)
assert y.shape == x.shape
'''
'''
Traceback (most recent call last):
  File "/home/songxu/fujitsu/mamba/mamba/test_mamba2.py", line 8, in <module>
    model = Mamba2(
  File "/home/songxu/fujitsu/mamba/mamba/mamba_ssm/modules/mamba2.py", line 77, in __init__
    assert self.d_ssm % self.headdim == 0
AssertionError
'''