`mamba_ssm` already hardcodes the filenames of the config and weights files in https://github.com/Wauplin/mamba/blob/33dc96c84926e58a392861d5ad9d2ee4f4f4a259/mamba_ssm/utils/hf.py. It follows the Hugging Face Hub convention, which is usually `config.json` for the config and `model.safetensors` for the weights. This PR updates `mistral-inference` to comply with that convention so that the model can be loaded from `mamba_ssm` directly. The files in the HF Hub repo would need to be renamed accordingly.
This change would also allow the download counter to work on the model page out of the box.
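
For illustration, here is a minimal sketch of what resolving the conventional filenames in a local model folder could look like. The helper `resolve_model_files`, the constant names, and the `d_model` field in the demo are hypothetical and are not taken from `mamba_ssm` or `mistral-inference`. Only the two filenames come from the description above.

```python
import json
import tempfile
from pathlib import Path

# Filenames following the Hub convention described above.
# The constant names are illustrative, not the ones used by either library.
CONFIG_NAME = "config.json"
WEIGHTS_NAME = "model.safetensors"


def resolve_model_files(model_dir):
    """Hypothetical helper: return (config dict, weights path) for a local model folder.

    Raises FileNotFoundError naming every conventional file that is absent.
    """
    root = Path(model_dir)
    config_path = root / CONFIG_NAME
    weights_path = root / WEIGHTS_NAME

    missing = [p.name for p in (config_path, weights_path) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"{root} is missing: {', '.join(missing)}")

    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return config, weights_path


if __name__ == "__main__":
    # Demo on a throwaway folder; "d_model" is a placeholder config field.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / CONFIG_NAME).write_text(json.dumps({"d_model": 8}), encoding="utf-8")
        (folder / WEIGHTS_NAME).write_bytes(b"")  # empty stand-in, not real weights
        config, weights = resolve_model_files(folder)
        print(config, weights.name)
```

A repo that still uses other filenames would fail this check with an error listing the missing files, which is why the rename on the Hub repo is needed.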