ralyodio opened this issue 1 year ago
Here is the stack trace:
llama.cpp: loading model from /home/ettinger/models/ggml-model-q4_1.bin
llama_model_load_internal: format = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 6656
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 52
llama_model_load_internal: n_layer = 60
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 3 (mostly Q4_1)
llama_model_load_internal: n_ff = 17920
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 30B
error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
[2023-05-16T10:47:19Z INFO llama_node_cpp::context] AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Error: Failed to convert napi value Function into rust type `f64`
at file:///home/ettinger/src/descriptive.chat/descriptive-web/node_modules/llama-node/dist/llm/llama-cpp.js:73:39
at new Promise (<anonymous>)
at LLamaCpp.<anonymous> (file:///home/ettinger/src/descriptive.chat/descriptive-web/node_modules/llama-node/dist/llm/llama-cpp.js:71:14)
at Generator.next (<anonymous>)
at file:///home/ettinger/src/descriptive.chat/descriptive-web/node_modules/llama-node/dist/llm/llama-cpp.js:33:61
at new Promise (<anonymous>)
at __async (file:///home/ettinger/src/descriptive.chat/descriptive-web/node_modules/llama-node/dist/llm/llama-cpp.js:17:10)
at LLamaCpp.createCompletion (file:///home/ettinger/src/descriptive.chat/descriptive-web/node_modules/llama-node/dist/llm/llama-cpp.js:67:12)
at LLM.<anonymous> (/home/ettinger/src/descriptive.chat/descriptive-web/node_modules/llama-node/dist/index.cjs:56:23)
at Generator.next (<anonymous>) {
code: 'NumberExpected'
}
I upgraded to the latest version and this started happening. It works on 0.1.2. The official example no longer works either.
It looks like you are using a GGJT v1 model, and the llama.cpp backend has dropped support for it. https://github.com/ggerganov/llama.cpp/pull/1305/files#diff-150dc86746a90bad4fc2c3334aeb9b5887b3adad3cc1459446717638605348efR921 https://github.com/ggerganov/llama.cpp/blob/master/llama.cpp#L932
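If you want to confirm which container format a `.bin` file actually uses before pointing llama-node at it, you can inspect the header directly. Below is a minimal Node/TypeScript sketch (not part of llama-node); the magic values are taken from the llama.cpp source around PR #1305, and exactly which versions a given llama.cpp build still loads is an assumption you should verify against your revision:

```ts
// Read the first 8 bytes of a ggml .bin file and report its container format.
import { openSync, readSync, closeSync } from "node:fs";

const MAGICS: Record<number, string> = {
  0x67676d6c: "ggml (unversioned, legacy)",
  0x67676d66: "ggmf",
  0x67676a74: "ggjt",
};

function describeModel(path: string): string {
  const fd = openSync(path, "r");
  const buf = Buffer.alloc(8);
  readSync(fd, buf, 0, 8, 0);
  closeSync(fd);

  const magic = buf.readUInt32LE(0);
  const name = MAGICS[magic];
  if (!name) return "not a ggml model file";
  // The legacy unversioned format has no version field after the magic.
  if (magic === 0x67676d6c) return name;
  const version = buf.readUInt32LE(4);
  return `${name} v${version}`;
}

const file = process.argv[2];
if (!file) {
  console.error("usage: ts-node check-ggml.ts <model.bin>");
  process.exit(1);
}
console.log(describeModel(file));
```

A file reporting `ggjt v1` is the pre-#1405 format that the newer backend refuses to load, which matches the error in the log above.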
OK, so how do I find models on huggingface.co that are actually supported?
Do you know what the reasoning is for dropping support? I don't see many q8 models available.
@ralyodio It's hard to identify the real version of a model; this is a huge trap for the GGML ecosystem. q4_1, q5_1, etc. are just quantization formats and are not used to label the version of a model file... For llama.cpp, the latest models are mostly in q5_1. The reasons for dropping the old format are performance improvements, CUDA support, mmap loading, etc.
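To illustrate that q4_1/q5_1 labels describe only the quantization and not the file-format version, here is a small sketch that pulls the ftype field out of a ggjt header. The header layout (magic, version, then n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype as consecutive uint32s) and the ftype numbering are assumptions based on the llama.cpp source of that era, so treat it as a starting point rather than a reference:

```ts
// Read the ftype field from a ggjt model header to see its quantization type.
import { openSync, readSync, closeSync } from "node:fs";

const FTYPES: Record<number, string> = {
  0: "all F32",
  1: "mostly F16",
  2: "mostly Q4_0",
  3: "mostly Q4_1",
  7: "mostly Q8_0",
  8: "mostly Q5_0",
  9: "mostly Q5_1",
};

function readFtype(path: string): string {
  const fd = openSync(path, "r");
  // magic (4) + version (4) + 6 hparams (24) + ftype (4) = 36 bytes
  const buf = Buffer.alloc(36);
  readSync(fd, buf, 0, 36, 0);
  closeSync(fd);

  if (buf.readUInt32LE(0) !== 0x67676a74) return "not a ggjt file";
  const ftype = buf.readUInt32LE(32);
  return FTYPES[ftype] ?? `unknown ftype ${ftype}`;
}

// Example: the model mentioned later in this thread.
console.log(readFtype("WizardLM-7B-uncensored.ggml.q5_1.bin"));
```

Two files can report the same quantization (e.g. "mostly Q4_1") while one is a ggjt v1 file that the current backend rejects and the other is a re-quantized v2 file that loads fine, which is why the quantization suffix in the filename is not enough to tell compatibility.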
Getting this error:
Error: Failed to convert napi value Function into rust type `f64`
Model is:
WizardLM-7B-uncensored.ggml.q5_1.bin