LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Failing to load older ggml models #88

Closed daniandtheweb closed 1 year ago

daniandtheweb commented 1 year ago

On the latest git version on Linux the program refuses to load older ggml models like vicuna or gpt-x-alpaca. The program is linked to OpenBLAS.

Loading model: /home/daniandtheweb/Applications/chat/ggml/ggml-vicuna-13b-4bit-rev1.bin [Parts: 1, Threads: 8, SmartContext: True]

Identified as LLAMA model: (ver 3) Attempting to Load...

System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0
llama.cpp: loading model from /home/daniandtheweb/Applications/chat/koboldcpp/
error loading model: read error: Is a directory
llama_init_from_file: failed to load model
llama_load_model: error: failed to load model '/home/daniandtheweb/Applications/chat/koboldcpp/'
Load Model OK: False
Could not load model: /home/daniandtheweb/Applications/chat/ggml/ggml-vicuna-13b-4bit-rev1.bin

LostRuins commented 1 year ago

I believe it might be getting your model path wrong. The output shows llama.cpp loading from /home/daniandtheweb/Applications/chat/koboldcpp/, but your model is located at /home/daniandtheweb/Applications/chat/ggml/ggml-vicuna-13b-4bit-rev1.bin, and then it shows a read error.
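That "read error: Is a directory" is just the OS's EISDIR message - it means something tried to read() the folder path instead of the model file. You can reproduce the exact string in a shell, for example:

    $ cat /home/daniandtheweb/Applications/chat/koboldcpp/
    cat: /home/daniandtheweb/Applications/chat/koboldcpp/: Is a directory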

What arguments did you use to launch the program, and from which working directory?

daniandtheweb commented 1 year ago

I'm running the program from the directory /home/daniandtheweb/Applications/chat/koboldcpp/. I've tried loading the models both from the models folder inside the koboldcpp folder and from that other external folder, but in both cases I get the same error. I'm using --threads 8 --smartcontext as arguments.
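For reference, the full invocation is along these lines (model path passed as the positional argument):

    $ cd /home/daniandtheweb/Applications/chat/koboldcpp/
    $ python koboldcpp.py --threads 8 --smartcontext /home/daniandtheweb/Applications/chat/ggml/ggml-vicuna-13b-4bit-rev1.bin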

daniandtheweb commented 1 year ago

I tried selecting the models from the popup window that appears when launching the program, and also specifying the model path directly, but it's still the same error. The Pygmalion model and the new RWKV model load and function correctly. The error only shows up with gpt-x-alpaca and vicuna, both of which worked perfectly until yesterday, but I can't say exactly which commit introduced the issue.

LostRuins commented 1 year ago

Damn, must be a regression. I don't have vicuna, but I have tried gpt-x-alpaca 13B (Pi1314 q4_1 version) and it works fine for me.

2 things to try:

1. Launch with the --nommap flag added and see if the model loads.
2. Load the same model file with the official llama.cpp to check whether the problem is specific to koboldcpp.

daniandtheweb commented 1 year ago

The official llama.cpp works as expected, and --nommap doesn't seem to affect the issue. I'm using this version of gpt4-x-alpaca: anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g.
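Concretely, the two checks looked roughly like this (exact invocations from memory; the llama.cpp binary is the stock main example):

    $ # koboldcpp with mmap disabled - still the same "Is a directory" error
    $ python koboldcpp.py --nommap --threads 8 --smartcontext /home/daniandtheweb/Applications/chat/ggml/ggml-vicuna-13b-4bit-rev1.bin
    $ # official llama.cpp - loads and generates fine
    $ ./main -m /home/daniandtheweb/Applications/chat/ggml/ggml-vicuna-13b-4bit-rev1.bin -p "Hello" -n 32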

LostRuins commented 1 year ago

Alright, I'll see if I can repro this issue. Meanwhile, if anyone else encounters similar problems, please do report them here.

LostRuins commented 1 year ago

@DaniAndTheWeb can I confirm which OS you're using? And can you also confirm that v1.10 fails but v1.9 works?

daniandtheweb commented 1 year ago

I'm using Arch Linux. v1.9 works without any problem, while v1.10 fails to load those models.

daniandtheweb commented 1 year ago

@LostRuins I've been checking through the past commits and, strangely, after coming back to the latest commit and rebuilding the program, it no longer throws the error. It seems the build process hit some odd failure locally on my PC.
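For anyone who hits the same thing, what fixed it for me was a full clean rebuild, roughly:

    $ cd /home/daniandtheweb/Applications/chat/koboldcpp/
    $ make clean
    $ make LLAMA_OPENBLAS=1   # OpenBLAS flag since my build links against OpenBLAS; plain 'make' otherwise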

LostRuins commented 1 year ago

Hm, maybe something went wrong with your build process... anyway, I haven't checked that model myself, but I'll take your word for it - if it's confirmed resolved on the latest version, you can close this issue.