Open zigzagcai opened 5 months ago
In general, yes. Which flavor of sequence parallelism are you referring to? The one in Megatron-LM?
Thanks for your timely response! Sure, I am referring to the one in Megatron-LM. I am wondering whether Mamba has built-in support for this kind of sequence parallelism, or whether we need to implement it manually.
Nothing is built-in, but it'll be implemented in the future.
Got it. Thanks!
Hi,
I recently learned about this selective SSM architecture, and it is awesome! But I have a question. We know that the Transformer architecture supports sequence parallelism, so does Mamba (a potential alternative to the Transformer) support sequence parallelism as well?