ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Bug: Can't quantize mamba-architecture Codestral to GGUF q5_k_m #8690

Open · Volko61 opened this issue 1 month ago

Volko61 commented 1 month ago

What happened?

Converting mamba-codestral-7B-v0.1 to GGUF (so it can then be quantized to q5_k_m) fails immediately: `convert_hf_to_gguf.py` raises `KeyError: 'architectures'` because the model's hparams (loaded from its config) contain no `architectures` key. The full traceback is under "Relevant log output" below.


Name and Version

latest (Hugging Face Space)

What operating system are you seeing the problem on?

Other? (Please let us know in description)

Relevant log output

```
INFO:hf-to-gguf:Loading model: mamba-codestral-7B-v0.1
Traceback (most recent call last):
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 3673, in <module>
    main()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 3645, in main
    model_architecture = hparams["architectures"][0]
KeyError: 'architectures'
```

![image](https://github.com/user-attachments/assets/6f1ab039-754d-4c6c-8bb8-274c0c99ca13)
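For anyone reproducing this outside the Space, here is a minimal sketch of the check that fails (the local `model_dir` path is hypothetical). It mirrors what `convert_hf_to_gguf.py` does at the line shown in the traceback: load the model config into `hparams`, then index `hparams["architectures"][0]` to pick a model class.

```python
import json
from pathlib import Path

# Hypothetical local snapshot of the model; adjust to wherever the
# Hugging Face files were downloaded.
model_dir = Path("mamba-codestral-7B-v0.1")

# convert_hf_to_gguf.py reads config.json into `hparams` and then
# indexes hparams["architectures"][0]. mamba-codestral-7B-v0.1 ships
# a config without that key, which is exactly the KeyError above.
hparams = json.loads((model_dir / "config.json").read_text())

arch_list = hparams.get("architectures")
if not arch_list:
    raise SystemExit("config.json has no 'architectures' entry; "
                     "the converter cannot select a model class")
print(f"architecture: {arch_list[0]}")
```
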
NextGenOP commented 1 month ago

#8519 is related to this.