mapmeld closed 1 month ago
@mapmeld Thanks for your contribution!
Summary
This PR adds support for Mamba models (mamba-2.8b by default), which are an alternative to attention-based LLMs.
Changes include:
[...] models/ directory, MambaEngine to the engines/ directory, and updates to config files
MambaForCausalLM
Notes:
[...] models/ directory
Checklist
Demonstrated inference in this notebook: https://colab.research.google.com/drive/1-i4xmsyppWBdwR1qt6QN21m0uiXu0guM?usp=sharing
model = BaseModel.create("mamba")
model.generate(...)
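For readers unfamiliar with the `BaseModel.create("mamba")` call above, here is a minimal sketch of how a string key could resolve to a model class through a registry. It is an illustration only, not this project's implementation: the `register` decorator, the `MambaModel` class, the `weights` parameter, and the echoing `generate` body are all hypothetical names introduced for the example. A real model would presumably wrap MambaForCausalLM and the MambaEngine mentioned in the summary.

```python
from typing import Callable, Dict, Type


class BaseModel:
    """Hypothetical stand-in for the project's BaseModel; not the real class."""

    # Maps a string key such as "mamba" to the class that implements it.
    _registry: Dict[str, Type["BaseModel"]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type["BaseModel"]], Type["BaseModel"]]:
        """Class decorator that records a subclass under `name` (assumed helper)."""
        def decorator(model_cls: Type["BaseModel"]) -> Type["BaseModel"]:
            cls._registry[name] = model_cls
            return model_cls
        return decorator

    @classmethod
    def create(cls, name: str, **kwargs) -> "BaseModel":
        """Look up `name` in the registry and instantiate the matching subclass."""
        try:
            model_cls = cls._registry[name]
        except KeyError:
            known = ", ".join(sorted(cls._registry)) or "none"
            raise ValueError(f"Unknown model {name!r}; registered: {known}") from None
        return model_cls(**kwargs)

    def generate(self, texts):
        raise NotImplementedError


@BaseModel.register("mamba")
class MambaModel(BaseModel):
    """Placeholder model used only to show the dispatch; it runs no inference."""

    def __init__(self, weights: str = "mamba-2.8b"):
        # Default name taken from the PR summary; the exact checkpoint id is not stated there.
        self.weights = weights

    def generate(self, texts):
        # Echo the inputs so the sketch has no external dependencies.
        return [f"[{self.weights}] {t}" for t in texts]


model = BaseModel.create("mamba")
print(model.generate(["Hello"]))
```

With this layout, supporting another architecture only means registering one more subclass under a new key, and callers keep using `BaseModel.create(...)` unchanged.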