go-skynet / go-ggml-transformers.cpp

Binding to transformers in ggml
MIT License
58 stars 11 forks

Unable to load GPTNeoX model Pythia-70m-q4_0.bin #39

Open chrisbward opened 1 year ago

chrisbward commented 1 year ago
root@5dac227a29e8:~# LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD /usr/local/go/bin/go run /root/go-ggml-transformers.cpp/examples/main.go  -m "/models/pythia-70m-q4_0.bin" -t 14
gpt2_model_load: loading model from '/models/pythia-70m-q4_0.bin'
gpt2_model_load: n_vocab = 50304
gpt2_model_load: n_ctx   = 2048
gpt2_model_load: n_embd  = 512
gpt2_model_load: n_head  = 8
gpt2_model_load: n_layer = 6
gpt2_model_load: ftype   = 16
gpt2_model_load: qntvr   = 0
gpt2_model_load: invalid model file '/models/pythia-70m-q4_0.bin' (bad vocab size 1 != 50304)
gpt2_bootstrap: failed to load model from '/models/pythia-70m-q4_0.bin'
Loading the model failed: failed loading model

I've changed the line in examples/main.go to l, err := gpt2.NewGPTNeoX(model), but no luck - what am I missing here?

chrisbward commented 1 year ago

Confirmed working fine when using ggml directly; https://github.com/ggerganov/ggml/tree/master/examples/gpt-neox

Would like to get the bindings working!

(.venv) ➜  build git:(master) ✗ ./bin/gpt-neox -m /media/NAS/MLModels/02_LLMs/pythia-ggml/pythia-70m-q4_0.bin -p "I believe the meaning of life is" -t 8
main: seed = 1689235018
gpt_neox_model_load: loading model from '/media/NAS/MLModels/02_LLMs/pythia-ggml/pythia-70m-q4_0.bin' - please wait ...
gpt_neox_model_load: n_vocab = 50304
gpt_neox_model_load: n_ctx   = 2048
gpt_neox_model_load: n_embd  = 512
gpt_neox_model_load: n_head  = 8
gpt_neox_model_load: n_layer = 6
gpt_neox_model_load: n_rot   = 16
gpt_neox_model_load: par_res = 1
gpt_neox_model_load: ftype   = 2002
gpt_neox_model_load: qntvr   = 2
gpt_neox_model_load: ggml ctx size =  92.00 MB
gpt_neox_model_load: memory_size =    24.00 MB, n_mem = 12288
gpt_neox_model_load: ......... done
gpt_neox_model_load: model size =    37.91 MB / num tensors = 76
extract_tests_from_file : No test file found.
test_gpt_tokenizer : 0 tests failed out of 0 tests.
main: number of tokens in prompt = 7

I think the clue here lies in gpt2_model_load vs gpt_neox_model_load: the binding is still running the GPT-2 loader against the file. I thought changing the constructor would be enough to select the right adapter?
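For what it's worth, the two logs are consistent with a header-layout mismatch rather than a broken file. In the ggml examples, a gpt-2 style loader reads six int32 hyperparameters after the magic, while a gpt-neox file stores eight (it also has n_rot and par_res). Reading a gpt-neox file with the gpt-2 layout puts n_rot=16 into the ftype slot (matching "ftype = 16" in the failing log) and leaves par_res=1 as the next int32, which is exactly what the vocab-size check then reads ("bad vocab size 1 != 50304"). A minimal, self-contained Go sketch of that misread, assuming the header layouts from the ggml gpt-2/gpt-neox examples (the byte buffer here is synthetic, not the binding's actual code):

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// neoxHeader builds a synthetic header with the layout a gpt-neox ggml
// file uses: magic, then eight int32 hyperparameters. The values mirror
// the working gpt-neox log above (n_vocab=50304 ... ftype=2002).
func neoxHeader() []byte {
	buf := new(bytes.Buffer)
	for _, v := range []int32{
		0x67676d6c, // "ggml" magic
		50304, 2048, 512, 8, 6, // n_vocab, n_ctx, n_embd, n_head, n_layer
		16, 1, 2002, // n_rot, par_res, ftype (gpt-neox only)
	} {
		binary.Write(buf, binary.LittleEndian, v)
	}
	return buf.Bytes()
}

// parseAsGPT2 reads the header the way a gpt-2 style loader does:
// magic, six int32 hyperparameters, then one more int32 that the ggml
// gpt-2 example compares against n_vocab before reading the vocabulary.
func parseAsGPT2(data []byte) (hparams [6]int32, vocabCheck int32) {
	r := bytes.NewReader(data)
	var magic int32
	binary.Read(r, binary.LittleEndian, &magic)
	binary.Read(r, binary.LittleEndian, hparams[:])
	binary.Read(r, binary.LittleEndian, &vocabCheck)
	return hparams, vocabCheck
}

func main() {
	h, v := parseAsGPT2(neoxHeader())
	// n_rot (16) lands in the gpt-2 loader's ftype slot...
	fmt.Printf("gpt2 view: n_vocab=%d n_ctx=%d n_embd=%d n_head=%d n_layer=%d ftype=%d\n",
		h[0], h[1], h[2], h[3], h[4], h[5])
	// ...and par_res (1) is the next int32, failing the vocab-size check,
	// which reproduces "bad vocab size 1 != 50304".
	fmt.Printf("vocab-size check reads: %d (expected %d)\n", v, h[0])
}
```

So the question is why the binding still dispatches to gpt2_model_load after switching to NewGPTNeoX; that part is inside the binding and I can only guess at it from here.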