state-spaces / mamba

Mamba SSM architecture
Apache License 2.0

Possible bug when running evaluation with self.use_mem_eff_path=False #451

Open Sawyer117 opened 2 months ago

Sawyer117 commented 2 months ago

Hi, when I run the evaluation as below:

python lm_harness_eval.py --model mamba_ssm --model_args pretrained=state-spaces/mamba2-130m --tasks boolq,piqa,hellaswag,winogrande,arc_easy,arc_challenge,openbookqa,race,truthfulqa_mc2 --device cuda --batch_size 64

By default it uses use_mem_eff_path=True and runs fine, but if I manually set use_mem_eff_path=False, the error below shows up:

out = causal_conv1d_cuda.causal_conv1d_fwd(x, weight, bias, seq_idx, ctx.activation)
TypeError: causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:

  1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: Optional[torch.Tensor], arg3: Optional[torch.Tensor], arg4: Optional[torch.Tensor], arg5: Optional[torch.Tensor], arg6: bool) -> torch.Tensor

Invoked with: tensor([[[ 0.1909, -0.3875, -0.4731, ..., 0.0366, 0.2490, -0.0460], [-0.2214, -0.1570, -0.5879, ..., -0.2810, -0.1702, 0.0384], [ 0.1755, -0.2076, 0.7930, ..., -0.4204, -0.0614, 0.2781],
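The traceback shows a call with five arguments against a compiled binding that lists seven. One plausible cause, not confirmed anywhere in this thread, is that the Python code and the compiled causal_conv1d extension come from different releases. Below is a minimal sketch for reporting the installed versions so they can be included in the report. The distribution names "mamba-ssm" and "causal-conv1d" are assumptions based on the usual pip names, and installed_versions is a hypothetical helper, not part of either library.

```python
from importlib import metadata


def installed_versions(names=("mamba-ssm", "causal-conv1d", "torch")):
    """Return {distribution name: version string, or None if not installed}.

    The default names are assumed pip distribution names; adjust them if
    your environment installs these packages under different names.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the traceback makes it easier to tell whether the five-argument call and the seven-argument binding belong to the same release.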

It seems the error comes from the causal_conv1d library, so I then set causal_conv1d_fn=None to bypass causal_conv1d and fall back to the PyTorch conv1d:

xBC = self.act(
    self.conv1d(xBC.transpose(1, 2)).transpose(1, 2)
)  # (B, L, self.d_ssm + 2 * ngroups * d_state)
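One thing worth checking in this fallback is the sequence length of the result. A stride-1 convolution with kernel size k and symmetric zero padding of k - 1 (the padding a causal conv layer is commonly built with, which is an assumption about self.conv1d here) returns L + k - 1 positions, not L. The extra k - 1 trailing positions have to be trimmed before the result matches the (B, L, ...) shape in the comment. The contents of the second screenshot are not available, so whether this is the error reported below is unknown. The pure-Python sketch below only illustrates the length arithmetic on a single channel. conv1d_padded and causal_conv1d_ref are hypothetical names, not library functions.

```python
def conv1d_padded(x, w, padding):
    """Stride-1 cross-correlation with symmetric zero padding on one channel.

    Output length is len(x) + 2 * padding - (len(w) - 1).
    """
    k = len(w)
    xp = [0.0] * padding + list(x) + [0.0] * padding
    return [
        sum(w[j] * xp[i + j] for j in range(k))
        for i in range(len(xp) - k + 1)
    ]


def causal_conv1d_ref(x, w):
    """Causal convolution: output[i] depends only on x[i - k + 1 .. i]."""
    k = len(w)
    full = conv1d_padded(x, w, padding=k - 1)  # length len(x) + k - 1
    return full[: len(x)]                      # drop the trailing k - 1 outputs


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    w = [1.0, 1.0, 1.0]
    print(len(conv1d_padded(x, w, padding=len(w) - 1)))  # untrimmed length
    print(causal_conv1d_ref(x, w))                       # trimmed to len(x)
```

In tensor form, the same trim corresponds to slicing the sequence dimension back to L after the second transpose.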

But I still got this error: [screenshot]

dragonBrother1 commented 1 month ago

Hello, I ran into some problems as well and would like to ask you about them: [image: 微信图片_20240809181619]