LeapLabTHU / MLLA

Official repository of MLLA (NeurIPS 2024)

Is it possible to use selective scan in MLLA? #4

Closed: IceClear closed this issue 3 months ago

IceClear commented 3 months ago

Hi, thanks for your wonderful work. I have a small question about the implementation of MLLA. If my understanding is correct, MLLA can also be viewed as a kind of Mamba, since they share a unified formulation. So I guess some of the efficient implementations used by Mamba should also work for MLLA? Specifically, the selective scan used by Mamba greatly improves its efficiency, so it should help MLLA as well? Would it be possible to support it in MLLA? Thanks, and I look forward to your reply.

tian-qing001 commented 3 months ago

Hi @IceClear, thanks for your interest in our work. As analyzed in our paper, Mamba has to rely on recurrent computation, which unavoidably reduces model throughput. To address this, Mamba introduces a hardware-aware algorithm (the selective scan) to speed up that recurrence. In contrast, our MLLA preserves parallelizable computation, i.e. plain matrix multiplication, so it naturally enjoys fast inference and does not need Mamba's specialized implementations.
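
To make the contrast concrete, here is a minimal, hypothetical sketch (not code from this repository; the tensor names, shapes, and the omission of feature maps and normalization are my own simplifications) of the two computation modes being compared: a Mamba-style recurrent update that must walk the sequence token by token, versus the fully parallel matrix-multiplication form that linear attention, and hence MLLA, can use.

```python
import torch

def recurrent_form(q, k, v):
    """Mamba-style causal recurrence: the running state S_i = S_{i-1} + k_i v_i^T
    is updated token by token, which is why a selective-scan kernel is needed to
    make it fast on GPUs. Output i only sees tokens j <= i (autoregressive)."""
    B, N, d = q.shape
    state = q.new_zeros(B, d, d)                       # accumulated sum of k_j v_j^T
    out = torch.empty_like(v)
    for i in range(N):                                 # sequential scan over the sequence
        state = state + k[:, i].unsqueeze(-1) * v[:, i].unsqueeze(-2)
        out[:, i] = torch.einsum('bd,bde->be', q[:, i], state)
    return out

def parallel_form(q, k, v):
    """Non-causal linear-attention form: every token attends to all tokens, so the
    whole computation collapses into two batched matrix multiplications and needs
    no scan kernel. (Its outputs differ from the causal version by design.)"""
    kv = torch.einsum('bnd,bne->bde', k, v)            # K^T V aggregated once
    return torch.einsum('bnd,bde->bne', q, kv)

# Toy usage: both run, but only the parallel form is free of a sequential loop.
q = torch.randn(2, 196, 64)
k = torch.randn(2, 196, 64)
v = torch.randn(2, 196, 64)
y_recurrent = recurrent_form(q, k, v)   # O(N) sequential steps
y_parallel = parallel_form(q, k, v)     # fully parallel over the sequence
```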

IceClear commented 3 months ago

Thanks for your quick and detailed response! So, if my understanding is correct, MLLA is more like a linear transformer improved by effective designs inspired by Mamba? In other words, it does not actually follow the autoregressive paradigm.

tian-qing001 commented 3 months ago

That's right. The equivalent forget gate is what endows Mamba with its autoregressive nature. We believe such a causal mode is not well suited to vision tasks, so we replace the forget gate with proper positional encodings.
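
For concreteness, a hedged sketch of the relation being described (the notation is mine, not taken from the paper): with an input-dependent forget gate $A_i$, the gated update reads

$$
S_i = A_i \odot S_{i-1} + k_i^{\top} v_i, \qquad y_i = q_i S_i ,
$$

which must be evaluated step by step and makes token $i$ depend only on tokens $j \le i$ (the causal, autoregressive mode). Dropping the gate and the causal restriction collapses the whole sequence into

$$
y_i = q_i \sum_{j=1}^{N} k_j^{\top} v_j = q_i \,(K^{\top} V),
$$

a single parallel matrix multiplication; the positional information the gate would otherwise carry is instead supplied by explicit positional encodings.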

IceClear commented 3 months ago

Noted. Thanks for your reply!