johnma2006 / mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
Apache License 2.0

Does mamba-minimal have GPU op optimization? #13

Closed pengzhangzhi closed 10 months ago

pengzhangzhi commented 10 months ago

Hi, thanks for the great work! I would like to know: does mamba-minimal have GPU op optimization? It seems to me that it doesn't. I want to train a large-scale Mamba and am currently deciding between the original Mamba and mamba-minimal.

pengzhangzhi commented 10 months ago

My concerns are mainly around the efficiency in training.

johnma2006 commented 10 months ago

In this case you should definitely be using the official implementation, as (1) it is heavily optimized, and (2) it has proper initialization.
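To make the efficiency point concrete: the core of the selective scan is a recurrence over time steps, which mamba-minimal evaluates with a Python-level loop, while the official implementation fuses it into a hardware-aware CUDA kernel. The sketch below is purely illustrative (the names `seq_scan`, `a`, `b`, `x` are hypothetical, not the repo's actual API); it shows the O(L) chain of dependent steps that makes the naive loop slow on GPU.

```python
def seq_scan(a, b, x):
    """Sequential SSM recurrence: h_t = a_t * h_{t-1} + b_t * x_t.

    Each step depends on the previous one, so a Python loop like this
    runs L dependent iterations. The official Mamba replaces this with
    a fused parallel-scan CUDA kernel, which is why it trains much
    faster at scale.
    """
    h = 0.0
    ys = []
    for a_t, b_t, x_t in zip(a, b, x):
        h = a_t * h + b_t * x_t  # state update (the sequential bottleneck)
        ys.append(h)
    return ys


# Tiny example: decay 0.5, unit input gain.
print(seq_scan([0.5, 0.5], [1.0, 1.0], [1.0, 1.0]))  # [1.0, 1.5]
```

The recurrence itself is the same in both implementations; only the execution strategy differs, which is why mamba-minimal is fine for reading and debugging but not for large-scale training.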

pengzhangzhi commented 10 months ago

thanks!!!