Fly-Pluche closed this issue 1 month ago
Codestral Mamba is based on the Mamba architecture, not the transformer architecture, so you will have to use mistral_inference.mamba instead of mistral_inference.transformer.
You can take a look at https://github.com/mistralai/mistral-inference/blob/main/src/mistral_inference/mamba.py and at the README file!
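For anyone hitting the same error, here is a minimal sketch of picking the right module before loading. It assumes the model folder contains a `params.json` with a `model_type` field (`"mamba"` for Codestral Mamba, missing or `"transformer"` otherwise) and that both classes expose `from_folder`; the helper names `pick_module` and `load_model` are hypothetical, so check the README and mamba.py for the exact API.

```python
import importlib
import json
from pathlib import Path

# Assumed mapping from params.json "model_type" to (module, class name).
MODEL_CLASSES = {
    "mamba": ("mistral_inference.mamba", "Mamba"),
    "transformer": ("mistral_inference.transformer", "Transformer"),
}


def pick_module(model_dir: str) -> str:
    """Return the mistral_inference module name matching the model folder."""
    params = json.loads((Path(model_dir) / "params.json").read_text())
    # Assumption: folders without "model_type" hold transformer models.
    model_type = params.get("model_type", "transformer")
    return MODEL_CLASSES[model_type][0]


def load_model(model_dir: str):
    """Import the matching class lazily and load the model from the folder."""
    module_name = pick_module(model_dir)
    class_name = dict(MODEL_CLASSES.values())[module_name]
    model_cls = getattr(importlib.import_module(module_name), class_name)
    # Assumption: both Mamba and Transformer provide from_folder().
    return model_cls.from_folder(model_dir)
```

Calling `load_model` on the Codestral Mamba folder would then go through mistral_inference.mamba rather than mistral_inference.transformer.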
Thanks
Python -VV
Pip Freeze
Reproduction Steps
Expected Behavior
config.json causes an error.
Additional Context
No response
Suggested Solutions
No response