Closed: mh4ckt3mh4ckt1c4s closed this issue 1 year ago.

When I try to load the vicuna models downloaded from this page, I get an error. I do not have this problem when using the gpt4all models, and running the vicuna models with the latest version of llama.cpp works just fine.
Same for me with model Pi3141/alpaca-native-7B-ggml.
Output from llama.cpp:

```
llama.cpp: loading model from ./models/ggml-model-q5_1.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 68,20 KB
llama_model_load_internal: mem required = 6612,58 MB (+ 1026,00 MB per state)
llama_init_from_file: kv self size = 1024,00 MB
```
Output from pyllamacpp:

```
[+] Running model `models/ggml-model-q5_1.bin`
[+] LLaMA context params: `{'n_ctx': 2048}`
[+] GPT params: `{}`
llama_model_load: loading model from 'models/ggml-model-q5_1.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 2048
llama_model_load: n_embd = 4096
llama_model_load: n_mult = 256
llama_model_load: n_head = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot = 128
llama_model_load: f16 = 9
llama_model_load: n_ff = 11008
llama_model_load: n_parts = 1
llama_model_load: type = 1
llama_model_load: invalid model file 'models/ggml-model-q5_1.bin' (bad f16 value 9)
llama_init_from_file: failed to load model
Segmentation fault
```
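For anyone hitting the same thing: `bad f16 value 9` means the llama.cpp copy bundled in pyllamacpp predates the Q5_1 quantization, whose ftype code is 9 in newer llama.cpp builds, so the old loader rejects the file. If you want to check what a model file declares before loading it, here is a minimal sketch; it assumes the ggjt v1 header layout llama.cpp used at the time (magic, version, then seven uint32 hparams ending in ftype) and the enum values from `llama.h` of that era, so treat both as a snapshot rather than a stable format.

```python
import struct
import sys

# ftype values as defined in llama.h around that release (a snapshot, not a
# stable API); 9 = LLAMA_FTYPE_MOSTLY_Q5_1 is the value the old loader rejects.
FTYPES = {
    0: "all F32", 1: "mostly F16", 2: "mostly Q4_0", 3: "mostly Q4_1",
    4: "mostly Q4_1, some F16", 7: "mostly Q8_0", 8: "mostly Q5_0",
    9: "mostly Q5_1",
}
MAGICS = {0x67676D6C: "ggml (unversioned)", 0x67676D66: "ggmf", 0x67676A74: "ggjt"}

def inspect_header(path: str) -> None:
    """Print the magic, version, and ftype a GGML model file declares."""
    with open(path, "rb") as f:
        magic, = struct.unpack("<I", f.read(4))
        print("magic  :", MAGICS.get(magic, hex(magic)))
        if magic != 0x67676D6C:  # the unversioned 'ggml' format has no version field
            version, = struct.unpack("<I", f.read(4))
            print("version:", version)
        # hparams: n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype
        *_, ftype = struct.unpack("<7I", f.read(28))
        print("ftype  :", ftype, "->", FTYPES.get(ftype, "unknown to old loaders"))

if __name__ == "__main__":
    inspect_header(sys.argv[1])  # e.g. python inspect_ggml.py models/ggml-model-q5_1.bin
```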
The original page has been archived, but the links are still available here: https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-chat
Thanks @mh4ckt3mh4ckt1c4s for reporting the issue.
Maybe there are new updates on the llama.cpp side... I'll try to sync the repo once I get some time.
Hi guys, I pushed a new release, v2.2.0, could you please give it a try? I tested it with vicuna and alpaca and both seem to be working on my end.
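If you want a quick smoke test for a new release, something like the sketch below should do; it assumes the `Model` class and keyword names (`model_path`, `n_ctx`, `n_predict`) from the pyllamacpp README of that era, so adjust them to whatever your installed version actually exposes.

```python
from pyllamacpp.model import Model  # assumes the v2.x API from the project README

# Path is illustrative; n_ctx matches the context size in the logs above.
model = Model(model_path="./models/ggml-model-q5_1.bin", n_ctx=2048)

# generate() yields tokens as they are produced, so print them as a stream.
for token in model.generate("Name the planets in the solar system:", n_predict=64):
    print(token, end="", flush=True)
```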
Hello, I tested with Vicuna and it works with 2.2.0 but not with the latest 2.3.0. Is that normal?
@mh4ckt3mh4ckt1c4s Yes, it is normal: recent llama.cpp changes broke older models, so you will need to re-quantize the old models to work with the new update.
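For reference, re-quantizing here meant regenerating the file with the tools that ship with llama.cpp rather than rewriting the quantized file in place: first produce an F16 GGML file from the original weights with the convert script, then run the `quantize` binary on it. The paths below are hypothetical, and older `quantize` builds expect the numeric ftype value (9 for Q5_1) instead of the name, so this is a sketch of the workflow, not an exact command.

```python
import subprocess

# Hypothetical layout: original LLaMA/Alpaca weights live under models/7B/.
# convert.py and quantize ship with llama.cpp; run this from its checkout.
subprocess.run(["python3", "convert.py", "models/7B/"], check=True)

# The step above writes models/7B/ggml-model-f16.bin; quantize it to Q5_1.
# Newer builds accept the name "q5_1"; older ones want the enum value "9".
subprocess.run(
    ["./quantize", "models/7B/ggml-model-f16.bin",
     "models/7B/ggml-model-q5_1.bin", "q5_1"],
    check=True,
)
```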
Okay, so from my point of view this issue is closed. Thanks for your work!