cocktailpeanut / dalai

The simplest way to run LLaMA on your local machine
https://cocktailpeanut.github.io/dalai

13B Alpaca seems to be available #97

Open fakana357 opened 1 year ago

fakana357 commented 1 year ago

13B Alpaca seems to be available at https://github.com/antimatter15/alpaca.cpp


abrahambone commented 1 year ago

I haven't been able to get the 13B alpaca model from this torrent working with Dalai, so I came here to see if anyone has. It works fine with alpaca.cpp.

I tried running "npx dalai alpaca install 13B", and it ran and downloaded a model, but the .bin file it downloaded was the 7B model, not the 13B one.

If I put the 13B model file from the torrent in a Dalai folder (.\dalai\alpaca\models\13B), it isn't detected at all on Dalai startup. If I rename the .bin model file to match the default name of the 7B model that Dalai downloads, then it is recognized and selectable in the GUI dropdown box, but it doesn't work: give it a prompt and nothing happens.

jwooldridge234 commented 1 year ago

When I run debug mode on the file (using the webui), I get this output (on a 16 GB macOS M1 Pro):

bash-3.2$ /Users/jackwooldridge/dalai/alpaca/main --seed -1 --threads 4 --n_predict 200 --model models/13B/ggml-model-q4_0.bin --top_k 40 --top_p 0.9 --temp 0.1 --repeat_last_n 64 --repeat_penalty 1.3 -p "Test"
main: seed = 1679274097
llama_model_load: loading model from 'models/13B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
llama_model_load: memory_size = 3200.00 MB, n_mem = 81920
llama_model_load: loading model part 1/2 from 'models/13B/ggml-model-q4_0.bin'
llama_model_load: llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
main: failed to load model from 'models/13B/ggml-model-q4_0.bin'

I found this closed issue in the ggerganov repo: https://github.com/ggerganov/llama.cpp/issues/24. I think the issue may be that the llama.cpp code bundled with this project is out of date with master and the quantization format has changed, but I don't know enough to be sure.
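
For what it's worth, here is a minimal sketch of the check that seems to be failing, reconstructed from early llama.cpp-style loaders (the names and arithmetic here are assumptions, not Dalai's exact code). The loader maps n_embd to a part count and expects each file to hold only a 1/n_parts slice of every tensor, so a merged single-file 13B model (full-size tensors, but n_parts = 2) fails the size comparison:

```cpp
#include <cstdio>
#include <map>
#include <string>

// Assumed mapping from embedding width to part count, as in early llama.cpp:
// 7B ships as 1 file, 13B as 2, 30B as 4, 65B as 8.
static const std::map<int, int> LLAMA_N_PARTS = {
    {4096, 1}, {5120, 2}, {6656, 4}, {8192, 8},
};

// Hypothetical distillation of the tensor-size check. With n_embd = 5120 the
// loader expects n_parts = 2 and therefore half-size slices per file, but a
// merged single-file 13B model stores full tensors, so this check fails with
// "tensor '...' has wrong size in model file".
bool check_tensor_size(const std::string &name,
                       long nelements_in_file,   // size recorded in the .bin
                       long nelements_expected,  // full size from hparams
                       int n_embd) {
    const int n_parts = LLAMA_N_PARTS.at(n_embd);
    if (nelements_in_file != nelements_expected / n_parts) {
        std::fprintf(stderr, "tensor '%s' has wrong size in model file\n",
                     name.c_str());
        return false;  // llama_model_load aborts here
    }
    return true;
}

int main() {
    // 13B tok_embeddings.weight: n_vocab (32000) x n_embd (5120), one file.
    check_tensor_size("tok_embeddings.weight", 32000L * 5120, 32000L * 5120,
                      5120);  // fails: the loader wants a half-size slice
    return 0;
}
```

If that reading is right, forcing n_parts to 1 would make the expected slice equal to the full tensor, which is why the workaround in the next comment works.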

jwooldridge234 commented 1 year ago

Anyone who's having this issue: I figured out a (hacky) solution thanks to this issue report: https://github.com/antimatter15/alpaca.cpp/issues/45

Go into dalai/alpaca/main.cpp, and on line 130 you should see "n_parts = LLAMA_N_PARTS.at(hparams.n_embd);". Comment that out and add "n_parts = 1;" instead, as shown below. Then navigate to dalai/alpaca/ in the console and run "make" to rebuild with the change.
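
Concretely, the edit looks like this (around line 130 per the above; the exact location may drift between versions):

```cpp
// n_parts = LLAMA_N_PARTS.at(hparams.n_embd);  // original: 13B's n_embd maps to 2 parts
n_parts = 1;  // treat the merged single-file 13B .bin as one part
```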

It then runs perfectly in the webui for me; hopefully other people find this helpful. It looks like the loader expects the larger models to be split across multiple files.