OpenGVLab / VideoMamba

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
https://arxiv.org/abs/2403.06977
Apache License 2.0

Error "causal_conv1d_fwd(): incompatible function arguments" #95

Open mr17m opened 1 week ago

mr17m commented 1 week ago

Hello,

Thank you for your interesting work. When I tried to use VideoMamba_middle_mask_16frame as the encoder in my network, I ran into the following error:


Exception has occurred: TypeError
causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:
    1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: Optional[torch.Tensor], arg3: Optional[torch.Tensor], arg4: Optional[torch.Tensor], arg5: Optional[torch.Tensor], arg6: bool) -> torch.Tensor

Invoked with: tensor([[[ 1.0577, -0.4986, -0.2766,  ...,  0.8521, -0.4315,  1.2651],
         [ 0.4277,  2.8302,  2.1279,  ...,  0.2504,  0.7036,  0.5841],
         [ 3.1013,  0.3740,  0.8949,  ...,  0.4121,  0.3269,  0.7011],
         ...,
         [ 0.6695, -0.0142,  0.0830,  ...,  0.2503, -0.1194, -0.5641],
         [-0.9099,  0.5441, -0.1383,  ...,  1.2155,  1.7423, -0.1691],
         [-0.0655, -0.0671,  0.1619,  ..., -0.0608,  0.8674,  0.2476]]],
       device='cuda:0', requires_grad=True), tensor([[ 1.3809e-02, -3.8543e-03,  1.0760e-02, -4.4659e-01],
        [ 4.8837e-03,  8.3645e-03,  6.7215e-04,  3.2914e-02],
        [ 8.5364e-03,  5.1495e-03,  1.4346e-02,  8.4561e-02],
        ...,
        [ 3.0055e-03, -1.9203e-02, -1.4898e-02,  6.3762e-01],
        [ 2.6319e-03,  1.3055e-02,  6.9637e-03, -8.8667e-02],
        [ 5.4781e-03, -1.2733e-04, -5.6090e-03, -3.0454e-03]], device='cuda:0',
       requires_grad=True), Parameter containing:
tensor([-0.1571,  0.2025, -0.3328,  ..., -0.1991, -0.1689, -0.4190],
       device='cuda:0', requires_grad=True), True
  File "/home/user/VidMamba/video_mamba/mamba/mamba_ssm/ops/selective_scan_interface.py", line 177, in forward
    conv1d_out = causal_conv1d_cuda.causal_conv1d_fwd(x, conv1d_weight, conv1d_bias, True)
  File "/home/user/VidMamba/video_mamba/mamba/mamba_ssm/ops/selective_scan_interface.py", line 632, in mamba_inner_fn_no_out_proj
    return MambaInnerFnNoOutProj.apply(xz, conv1d_weight, conv1d_bias, x_proj_weight, delta_proj_weight,
  File "/home/user/VidMamba/video_mamba/mamba/mamba_ssm/modules/mamba_simple.py", line 185, in forward
    out = mamba_inner_fn_no_out_proj(
  File "/home/user/VidMamba/video_mamba/videomamba/video_sm/models/videomamba.py", line 94, in forward
    hidden_states = self.mixer(hidden_states, inference_params=inference_params)
  File "/home/user/VidMamba/video_mamba/videomamba/video_sm/models/videomamba.py", line 339, in forward_features
    hidden_states, residual = layer(
  File "/home/user/VidMamba/video_mamba/videomamba/video_sm/models/videomamba.py", line 366, in forward
    x = self.forward_features(x, inference_params)
  File "/home/user/VidMamba/SalFoM.py", line 76, in forward
    y  = self.encoder(x)
  File "/home/user/VidMamba/train_SalFoM.py", line 103, in train
    z0 = model(img_clips)
  File "/home/user/VidMamba/train_SalFoM.py", line 169, in <module>
    loss = train(model, optimizer, train_loader, epoch, device, args, writer)
TypeError: causal_conv1d_fwd(): incompatible function arguments.

The error is raised at the following line: conv1d_out = causal_conv1d_cuda.causal_conv1d_fwd(x, conv1d_weight, conv1d_bias, True)

The type of "x" is <class 'torch.Tensor'>, the type of "conv1d_weight" is <class 'torch.Tensor'>, and the type of "conv1d_bias" is <class 'torch.nn.parameter.Parameter'>.
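
Comparing that call with the signature in the error message: the bundled mamba_ssm code passes four positional arguments, while the installed causal_conv1d_cuda binding advertises seven (four of them Optional tensors, plus a trailing bool). A small probe like the following shows what the installed binding expects (the commented call shapes are only my reading of the error message above, not something taken from this repository):

import causal_conv1d_cuda

# Print the signature advertised by the installed compiled extension.
# For the build I have installed, it lists seven positional arguments.
print(causal_conv1d_cuda.causal_conv1d_fwd.__doc__)

# What mamba_ssm/ops/selective_scan_interface.py calls (four arguments):
#   causal_conv1d_cuda.causal_conv1d_fwd(x, conv1d_weight, conv1d_bias, True)
# What the installed binding appears to expect (seven arguments):
#   causal_conv1d_fwd(x, weight, Optional[bias], Optional, Optional, Optional, bool)

So the causal-conv1d build I have installed does not seem to match the version this repository's mamba code was written against.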

I have loaded the videomamba_middle_mask model with 16 frames into my network as below:


from videomamba.video_sm.models.videomamba import videomamba_middle 

self.encoder = videomamba_middle(num_classes=400, num_frames=16)

The input tensor to my model has shape [B=1, C=3, T=16, H=224, W=224], and the input to the encoder is a <class 'torch.Tensor'>.
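
For reference, a minimal sketch of how the encoder is called (the rest of my SalFoM model is omitted and a random tensor stands in for the real video clips):

import torch
from videomamba.video_sm.models.videomamba import videomamba_middle

encoder = videomamba_middle(num_classes=400, num_frames=16).cuda()
clip = torch.randn(1, 3, 16, 224, 224, device='cuda')  # [B, C, T, H, W], as described above
features = encoder(clip)  # raises the TypeError above inside the mixer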

Please let me know how to fix the problem.

Thank you

Andy1621 commented 1 day ago

Sorry for the late response. How did you install causal-conv1d? Please install it via pip install -e causal-conv1d
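
For example, from the repository root (assuming the bundled causal-conv1d directory is there, and after removing any copy previously installed from PyPI so that the editable build is the one Python imports):

pip uninstall causal-conv1d
pip install -e causal-conv1d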

mr17m commented 1 day ago

Thank you for your answer. I had installed it from PyPI with pip install causal-conv1d; installing the bundled version with pip install -e causal-conv1d as you suggested solved my problem.
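
For anyone else hitting this, a quick generic check to confirm which build Python actually imports after reinstalling (the paths should point into the VideoMamba checkout rather than into site-packages):

import causal_conv1d
import causal_conv1d_cuda

# After `pip install -e causal-conv1d`, both modules should resolve to the
# editable checkout rather than the wheel installed from PyPI.
print(causal_conv1d.__file__)
print(causal_conv1d_cuda.__file__)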