vicuna-tools / vicuna-installation-guide

The "vicuna-installation-guide" provides step-by-step instructions for installing and configuring Vicuna 13 and 7B

Error loading model: is this really a GGML file? #5

Closed · kilkujadek closed this issue 1 year ago

kilkujadek commented 1 year ago

Hello,

Using the one-line install seems to be successful (except for a few warnings):


git clone https://github.com/fredi-python/llama.cpp.git && cd llama.cpp && make -j && cd models && wget -c https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/ggml-vic13b-uncensored-q5_1.bin
Cloning into 'llama.cpp'...
remote: Enumerating objects: 2390, done.
remote: Counting objects: 100% (867/867), done.
remote: Compressing objects: 100% (77/77), done.
remote: Total 2390 (delta 815), reused 790 (delta 790), pack-reused 1523
Receiving objects: 100% (2390/2390), 2.16 MiB | 3.93 MiB/s, done.
Resolving deltas: 100% (1566/1566), done.
I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native
I LDFLAGS:  
I CC:       cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX:      g++ (Debian 10.2.1-6) 10.2.1 20210110

cc  -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native   -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c llama.cpp -o llama.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c examples/common.cpp -o common.o
llama.cpp: In function 'size_t llama_set_state_data(llama_context*, const uint8_t*)':
llama.cpp:2615:27: warning: cast from type 'const uint8_t*' {aka 'const unsigned char*'} to type 'void*' casts away qualifiers [-Wcast-qual]
 2615 |             kin3d->data = (void *) in;
      |                           ^~~~~~~~~~~
llama.cpp:2619:27: warning: cast from type 'const uint8_t*' {aka 'const unsigned char*'} to type 'void*' casts away qualifiers [-Wcast-qual]
 2619 |             vin3d->data = (void *) in;
      |                           ^~~~~~~~~~~
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native pocs/vdot/vdot.cpp ggml.o -o vdot 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/main/main.cpp ggml.o llama.o common.o -o main 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize/quantize.cpp ggml.o llama.o -o quantize 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize-stats/quantize-stats.cpp ggml.o llama.o -o quantize-stats 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/perplexity/perplexity.cpp ggml.o llama.o common.o -o perplexity 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/embedding/embedding.cpp ggml.o llama.o common.o -o embedding 

====  Run ./main -h for help.  ====

--2023-05-15 09:57:30--  https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/ggml-vic13b-uncensored-q5_1.bin
Resolving huggingface.co (huggingface.co)... 108.138.51.20, 108.138.51.95, 108.138.51.49, ...
Connecting to huggingface.co (huggingface.co)|108.138.51.20|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs.huggingface.co/repos/0a/36/0a36ee786df124a005175a3d339738ad57350a96ae625c2111bce6483acbe34a/6fc1294b722082631cd61b1bde2cfecd1533eb95b331dbbdacbebe4944ff974a?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27ggml-vic13b-uncensored-q5_1.bin%3B+filename%3D%22ggml-vic13b-uncensored-q5_1.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1684397227&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZG4tbGZzLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzBhLzM2LzBhMzZlZTc4NmRmMTI0YTAwNTE3NWEzZDMzOTczOGFkNTczNTBhOTZhZTYyNWMyMTExYmNlNjQ4M2FjYmUzNGEvNmZjMTI5NGI3MjIwODI2MzFjZDYxYjFiZGUyY2ZlY2QxNTMzZWI5NWIzMzFkYmJkYWNiZWJlNDk0NGZmOTc0YT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2ODQzOTcyMjd9fX1dfQ__&Signature=tbSCc4T3qUsBlw-mQrtcKwBQL0cbfeZe8MH3aGUv4EgOfo0JZibFFetpyqKk88LsDRKNzStyM6epwjbiB11PwEE73JT6ajJnAkArMkNDOmTO4NP6poC1rHlM-XRz3WuSdi3nY0fdDYYYL1gHb%7EAPwILghy-z4-vWRSEPldUQGTuqCZqj2knjmVtIuHSk06fShBYKOWKM7nnzb0-ENQumj6garze%7Es7n0hQjX%7EBKTGAD-HI5mMy1I5rwfA5M6eQ9zYavGHKNj104LftBPBLjpvAamO6fGS1L6KQYiKG-t68AuDgBy8TVbdIfTYJbN52vnvcfaiz3E5QB8JrvMv5uETQ__&Key-Pair-Id=KVTP0A1DKRTAX [following]
--2023-05-15 09:57:31--  https://cdn-lfs.huggingface.co/repos/0a/36/0a36ee786df124a005175a3d339738ad57350a96ae625c2111bce6483acbe34a/6fc1294b722082631cd61b1bde2cfecd1533eb95b331dbbdacbebe4944ff974a?response-content-disposition=attachment%3B+filename*%3DUTF-8''ggml-vic13b-uncensored-q5_1.bin%3B+filename%3D%22ggml-vic13b-uncensored-q5_1.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1684397227&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZG4tbGZzLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzBhLzM2LzBhMzZlZTc4NmRmMTI0YTAwNTE3NWEzZDMzOTczOGFkNTczNTBhOTZhZTYyNWMyMTExYmNlNjQ4M2FjYmUzNGEvNmZjMTI5NGI3MjIwODI2MzFjZDYxYjFiZGUyY2ZlY2QxNTMzZWI5NWIzMzFkYmJkYWNiZWJlNDk0NGZmOTc0YT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2ODQzOTcyMjd9fX1dfQ__&Signature=tbSCc4T3qUsBlw-mQrtcKwBQL0cbfeZe8MH3aGUv4EgOfo0JZibFFetpyqKk88LsDRKNzStyM6epwjbiB11PwEE73JT6ajJnAkArMkNDOmTO4NP6poC1rHlM-XRz3WuSdi3nY0fdDYYYL1gHb~APwILghy-z4-vWRSEPldUQGTuqCZqj2knjmVtIuHSk06fShBYKOWKM7nnzb0-ENQumj6garze~s7n0hQjX~BKTGAD-HI5mMy1I5rwfA5M6eQ9zYavGHKNj104LftBPBLjpvAamO6fGS1L6KQYiKG-t68AuDgBy8TVbdIfTYJbN52vnvcfaiz3E5QB8JrvMv5uETQ__&Key-Pair-Id=KVTP0A1DKRTAX
Resolving cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)... 18.244.102.114, 18.244.102.76, 18.244.102.9, ...
Connecting to cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)|18.244.102.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9763701888 (9.1G) [application/octet-stream]
Saving to: 'ggml-vic13b-uncensored-q5_1.bin'

ggml-vic13b-uncensored-q5_1.bin                  100%[=========================================================================================================>]   9.09G  2.99MB/s    in 49m 46s 

2023-05-15 10:47:17 (3.12 MB/s) - 'ggml-vic13b-uncensored-q5_1.bin' saved [9763701888/9763701888]
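
As a quick sanity check that the download completed, the size on disk can be compared against the Length wget reported; a minimal sketch, assuming GNU coreutils:

```
# wget reported Length: 9763701888 (9.1G); stat should print the same number
stat -c %s models/ggml-vic13b-uncensored-q5_1.bin
```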

But when I try to run it, it throws an error:



./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36

main: build = 523 (0737a47)
main: seed  = 1684152947
llama.cpp: loading model from models/ggml-vic13b-uncensored-q5_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000002; is this really a GGML file?
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/ggml-vic13b-uncensored-q5_1.bin'
main: error: unable to load model
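
For context, the magic 67676a74 in the error is the ASCII string "ggjt": the downloaded file uses the newer GGJT container, format version 2, which this build of llama.cpp evidently does not recognize. One way to inspect a model file's header directly, as a minimal sketch assuming xxd is installed:

```
# The file begins with a 4-byte magic and a 4-byte version, both little-endian.
# Bytes 74 6a 67 67 ("tjgg" on disk) are the magic 0x67676a74, i.e. "ggjt",
# and 02 00 00 00 is format version 2.
xxd -l 8 models/ggml-vic13b-uncensored-q5_1.bin
```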
fredi-python commented 1 year ago

Navigate to your llama.cpp folder and type: git pull

kilkujadek commented 1 year ago

> Navigate to your llama.cpp folder and type: git pull

Nope, still the same:


(lama2) kilku@debian:~/vicuna/llama.cpp$ git pull
Already up to date.
(lama2) kilku@debian:~/vicuna/llama.cpp$ ./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36
main: build = 523 (0737a47)
main: seed  = 1684157106
llama.cpp: loading model from models/ggml-vic13b-uncensored-q5_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000002; is this really a GGML file?
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/ggml-vic13b-uncensored-q5_1.bin'
main: error: unable to load model
fredi-python commented 1 year ago

Try running make -j again

kilkujadek commented 1 year ago

> Try running make -j again

Did that as well.

r3t4k3r commented 1 year ago

I have the same error. OS: Linux Mint. [image]

ziliangpeng commented 1 year ago

Same error. I just freshly cloned the repo, freshly built it, freshly downloaded the 7B model, and still see the same error.

andreibondarev commented 1 year ago

Same error with the 13B model.

fredi-python commented 1 year ago

OK, I just updated my fork of llama.cpp; it should work now! Navigate to the llama.cpp folder and type

git pull

then:

make -j

Should work!
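
For anyone landing here later, the full recovery sequence looks roughly like this; a sketch of the steps above, where the make clean is an added assumption to force a full rebuild in case stale object files linger:

```
cd llama.cpp
git pull                 # pull the updated fork
make clean && make -j    # rebuild so ./main picks up the new GGJT v2 loader
./main -m models/ggml-vic13b-uncensored-q5_1.bin \
       -f prompts/chat-with-vicuna-v1.txt -r 'User:' --temp 0.36
```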

kilkujadek commented 1 year ago

It is working now, thanks!