Architecture Requests for Mamba

ml-explore / mlx-examples

Examples in the MLX framework

MIT License

Architecture Requests for Mamba #1030

Open hg0428 opened 1 month ago

hg0428 commented 1 month ago

I would like support for the following architectures:

Mamba
MambaByte
Mamba-2
Mamba-hybrid (mamba + transformer)
Mamba-2-hybrid (mamba2 + transformer)

These architectures are becoming quite common now and are supported by most major LLM libraries.

awni commented 1 month ago

We have Mamba in MLX LM already and there is a PR for Mamba 2 (#1009).

As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.
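
For readers who want to try the existing Mamba support, here is a minimal sketch using the `load` and `generate` functions from the `mlx_lm` package. The checkpoint name `state-spaces/mamba-130m-hf` and the helper `generate_with_mamba` are assumptions made for illustration and do not come from this thread. Whether a given checkpoint loads depends on the installed MLX LM version.

```python
# Hypothetical example checkpoint (not from this thread): a small Mamba
# model on Hugging Face, assumed to load with the installed mlx-lm version.
EXAMPLE_REPO = "state-spaces/mamba-130m-hf"


def generate_with_mamba(prompt: str, repo: str = EXAMPLE_REPO, max_tokens: int = 64) -> str:
    """Load a checkpoint with MLX LM and return the generated text."""
    # Imported lazily so this file can be imported without mlx-lm installed;
    # the package is typically installed with `pip install mlx-lm`.
    from mlx_lm import load, generate

    # load() fetches the weights from Hugging Face on first use.
    model, tokenizer = load(repo)
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Calling `generate_with_mamba("Mamba is a state space model that")` should return a short continuation, provided the weights can be downloaded and the architecture is supported.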

hg0428 commented 1 month ago

Mamba: https://huggingface.co/tiiuae/falcon-mamba-7b
Mamba-2: https://huggingface.co/state-spaces/mamba2-2.7b
MambaByte: https://huggingface.co/JunxiongWang/MambaByte_Books
Mamba-Hybrid: https://huggingface.co/Zyphra/Zamba-7B-v1
Mamba2-Hybrid: https://huggingface.co/Zyphra/Zamba2-2.7B-instruct

hg0428 commented 1 month ago

Zamba2 7b was just released. One of the best models of its size, it outperforms Llama3.2 11b and Mistral 7b in almost every benchmark. It is a Mamba2-hybrid model. https://www.zyphra.com/post/zamba2-7b