ggerganov / llama.cpp

LLM inference in C/C++
MIT License

[User] How to convert Stability 3B model to ggml/gguf #3456

Closed zbruceli closed 10 months ago

zbruceli commented 11 months ago

Stability released their latest 3B model, but there is an error when executing the convert.py script:

// Model card and files on HF

https://huggingface.co/stabilityai/stablelm-3b-4e1t

// Error message

% python3 convert.py models/stablelm-3b-4e1t
Traceback (most recent call last):
  File "/Users/xxx/code/llama.cpp/convert.py", line 1208, in <module>
    main()
  File "/Users/xxx/code/llama.cpp/convert.py", line 1149, in main
    model_plus = load_some_model(args.model)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/code/llama.cpp/convert.py", line 1060, in load_some_model
    raise Exception(f"Can't find model in directory {path}")
Exception: Can't find model in directory models/stablelm-3b-4e1t

staviq commented 11 months ago

Model card mentions gptneox, try with this: convert-gptneox-hf-to-gguf.py

zbruceli commented 11 months ago

Thanks for the suggestion. I tried convert-gptneox-hf-to-gguf.py and got a new error:

 % python3 convert-gptneox-hf-to-gguf.py models/stablelm-3b-4e1t
Traceback (most recent call last):
  File "/Users/xxx/code/llama.cpp/convert-gptneox-hf-to-gguf.py", line 16, in <module>
    from transformers import AutoTokenizer  # type: ignore[import]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'transformers'

staviq commented 11 months ago

You don't have the Python transformers package installed.
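
In case it helps anyone else hitting this: the fix is simply to install the missing package into the same Python environment that runs the script, e.g.

pip install transformers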

zbruceli commented 11 months ago

Sorry, I copy-pasted the wrong error message. Here is the relevant one:

python3 convert-gptneox-hf-to-gguf.py ./models/stablelm-3b-4e1t 1
gguf: loading model stablelm-3b-4e1t
Model architecture not supported: StableLMEpochForCausalLM
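
For context: the converter scripts decide whether they can handle a model by reading the "architectures" field from the model's config.json, and StableLMEpochForCausalLM was not on the supported list at the time. You can check what a checkpoint declares (assuming the standard Hugging Face layout) with:

grep architectures models/stablelm-3b-4e1t/config.json
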
KonstantineGoudz commented 11 months ago

Any luck with this?

Green-Sky commented 11 months ago

This is a very cool model (4e1t -> 4T tokens). As of right now, I don't think we support gptneox.

rodas-j commented 11 months ago

Waiting for support here as well.

Galunid commented 11 months ago

I'm looking into this.

rozek commented 10 months ago

It might have been useful to add a note about how to convert the original model to GGUF. Here is what I did (starting from a fresh Docker container based on python:3.9.18-slim-bookworm):

apt-get update
apt-get install git -y              # git is needed to clone the repo
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt     # base Python dependencies
pip install torch transformers      # extra dependencies for convert-hf-to-gguf.py
python convert-hf-to-gguf.py /stablelm

where /stablelm was the (mounted) folder containing the model files from Hugging Face.
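
For completeness, once the conversion finishes you can build llama.cpp and try the resulting GGUF directly. The output file name below is an assumption; check what the script actually printed:

make    # needs a C/C++ toolchain, e.g. apt-get install build-essential
./main -m /stablelm/ggml-model-f16.gguf -p "Hello" -n 64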

Green-Sky commented 10 months ago

@rozek yes, torch and transformers are only needed for the convert-hf-to-gguf.py script.

rozek commented 9 months ago

After receiving written permission from Stability AI, I'm currently uploading the most relevant quantizations to Hugging Face for others to download directly.

bachittle commented 9 months ago

@rozek Thank you for these quantizations! I tried them out, and they work well on my desktop. Could you also add a q3_k_m model? I am trying to run it on a device with only 4 GB of RAM. That would be much appreciated.
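
For anyone who doesn't want to wait: you can produce a Q3_K_M file yourself from the f16 GGUF with the quantize tool that ships with llama.cpp (the file names here are assumptions):

./quantize ggml-model-f16.gguf ggml-model-q3_k_m.gguf q3_k_m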

rozek commented 9 months ago

done!