hustvl / Vim

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

How does the activation operator gate the forward and backward SSM? #62

Open abc5z7 opened 2 months ago

abc5z7 commented 2 months ago

I don't see the SiLU function that should be here. Did I get the wrong file? Could anyone explain it? Thanks so much! This is from vim/models_mamba.py:

    # Forward SSM branch: processes the sequence in its original order.
    hidden_states_f, residual_f = self.layers[i * 2](
        hidden_states, residual, inference_params=inference_params
    )

    # Backward SSM branch: flip the sequence, process it, and flip the outputs back below.
    hidden_states_b, residual_b = self.layers[i * 2 + 1](
        hidden_states.flip([1]), None if residual == None else residual.flip([1]), inference_params=inference_params
    )
    # Sum the forward outputs with the re-flipped backward outputs.
    hidden_states = hidden_states_f + hidden_states_b.flip([1])
    residual = residual_f + residual_b.flip([1])
SahilNawale commented 2 months ago

In vim/models_mamba.py, each block wraps the Mamba class:

    self.layers = nn.ModuleList([ create_block(...) ... ])

    mixer_cls = partial(Mamba, layer_idx=layer_idx, bimamba_type=bimamba_type,
                        if_devide_out=if_devide_out, init_layer_scale=init_layer_scale,
                        **ssm_cfg, **factory_kwargs)

    from mamba_ssm.modules.mamba_simple import Mamba

The imported Mamba module is where the SiLU is implemented; you can find it here: https://github.com/hustvl/Vim/blob/main/mamba-1p1p1/mamba_ssm/modules/mamba_simple.py
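
To make the gating concrete, here is a rough sketch of what happens inside that Mamba module for the bidirectional case. This is simplified pseudocode, not the repository's exact code: the names `in_proj`, `conv1d_f`, `conv1d_b`, `ssm_scan_f`, `ssm_scan_b`, and `out_proj` are placeholders, and in the actual implementation the multiplication by SiLU(z) is fused into the selective-scan CUDA kernel (the `z` argument), which is why no standalone `nn.SiLU()` call shows up in models_mamba.py:

    import torch
    import torch.nn.functional as F

    def bimamba_mixer_sketch(hidden_states, in_proj, conv1d_f, conv1d_b,
                             ssm_scan_f, ssm_scan_b, out_proj):
        # A single input projection produces both the SSM stream x and the gate z.
        x, z = in_proj(hidden_states).chunk(2, dim=-1)   # each (B, L, d_inner)

        # Forward direction: causal conv + SiLU, then the selective scan.
        y_f = ssm_scan_f(F.silu(conv1d_f(x)))

        # Backward direction: same computation on the time-reversed sequence,
        # then flip the result back to the original order.
        y_b = ssm_scan_b(F.silu(conv1d_b(x.flip([1])))).flip([1])

        # One shared SiLU(z) gate modulates BOTH directional SSM outputs.
        gate = F.silu(z)
        return out_proj(y_f * gate + y_b * gate)

If I read the code correctly, this corresponds to the `bimamba_type == "v2"` branch of `Mamba.forward` in mamba_simple.py, where the forward call and the flipped backward call both receive the same `z` half of the input projection, so a single activation gates both paths.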

ZHUXIUJINChris commented 1 day ago

Thank you for the excellent work.

I have the same question. In the paper, a single activation function gates both the forward and backward paths. How does simply importing Mamba achieve this gating of both paths? Could you point out the specific location in the code?

Thank you very much.