state-spaces / mamba

Mamba SSM architecture
Apache License 2.0

Question about support for sequence parallel #176

Open zigzagcai opened 5 months ago

zigzagcai commented 5 months ago

Hi,

I recently learned about the selective SSM architecture, and it is awesome! I have a question, though. The Transformer architecture supports sequence parallelism; does Mamba, as a potential alternative to the Transformer, support sequence parallelism as well?

tridao commented 5 months ago

In general, yes. Which flavor of sequence parallelism are you referring to? The one in Megatron-LM?

zigzagcai commented 5 months ago

> In general, yes. Which flavor of sequence parallelism are you referring to? The one in Megatron-LM?

Thanks for your timely response! Yes, I am referring to the one in Megatron-LM. Does Mamba have built-in support for this kind of sequence parallelism, or do we need to implement it manually?

tridao commented 5 months ago

Nothing is built-in, but it'll be implemented in the future.

zigzagcai commented 5 months ago

Got it. Thanks!