mistralai / mistral-inference

Official inference library for Mistral models
https://mistral.ai/
Apache License 2.0

Update hardcoded Mamba filenames #191

Closed Wauplin closed 3 months ago

Wauplin commented 3 months ago

mamba_ssm already hardcodes the filenames for the config and weights files in https://github.com/Wauplin/mamba/blob/33dc96c84926e58a392861d5ad9d2ee4f4f4a259/mamba_ssm/utils/hf.py. It follows the Hugging Face Hub convention, which is usually config.json for the config and model.safetensors for the weights. This PR updates mistral-inference to comply with that convention so that the model can be loaded from mamba_ssm directly. It would require the files to be renamed in the HF Hub repo.

This change would also allow the download counter to work on the model page out of the box.
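As an illustration of the filename convention discussed above, here is a minimal sketch of a loader that looks for the Hub-convention names first and falls back to other names. It is not what this PR implements. The helper names `first_existing` and `resolve_model_files` are hypothetical, and the fallback names `params.json` and `consolidated.safetensors` are an assumption about the files mistral-inference currently expects, since they are not stated in this thread.

```python
from pathlib import Path
from typing import Optional, Sequence, Tuple

# Hub-convention names quoted in this thread.
HUB_CONFIG = "config.json"
HUB_WEIGHTS = "model.safetensors"
# Assumption: the names currently shipped for Mistral models (not stated here).
LEGACY_CONFIG = "params.json"
LEGACY_WEIGHTS = "consolidated.safetensors"


def first_existing(folder: Path, candidates: Sequence[str]) -> Optional[Path]:
    """Return the first candidate file that exists in folder, or None."""
    for name in candidates:
        path = folder / name
        if path.is_file():
            return path
    return None


def resolve_model_files(folder: str) -> Tuple[Path, Path]:
    """Hypothetical helper: locate config and weights, preferring Hub names."""
    root = Path(folder)
    config = first_existing(root, [HUB_CONFIG, LEGACY_CONFIG])
    weights = first_existing(root, [HUB_WEIGHTS, LEGACY_WEIGHTS])
    if config is None or weights is None:
        raise FileNotFoundError(f"no config/weights pair found in {root}")
    return config, weights
```

Calling `resolve_model_files("/path/to/model")` returns the pair of paths to pass to whichever loader is in use. A fallback of this kind would let both naming schemes coexist, at the cost of an extra lookup.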

patrickvonplaten commented 3 months ago

We want to keep the same file names across models, both for consistency and to avoid breaking the official Mistral download.