cocktailpeanut / dalai

The simplest way to run LLaMA on your local machine
https://cocktailpeanut.github.io/dalai
13.1k stars 1.42k forks

Low CPU, Low Memory, Low GPU usage via Docker #452

Open ugmqu opened 1 year ago

ugmqu commented 1 year ago

Hi,

I installed the project via docker.

A simple prompt does not produce a result even after an hour. I tried Alpaca 7B and 13B on both Windows and Ubuntu via Docker. My assumption is that the project is running with very low resources for some reason.

"docker stats" gives me the following output with the container running and an active prompt beeing processed:

NAME            CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O   PIDS
dalai-dalai-1   0.16%     70.48MiB / 7.701GiB   0.89%     25.4kB / 18.3kB   0B / 0B     23
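For reference, a container that is actively running inference on a 7B q4_0 model should show the main process pinned near 100% of one or more cores with several gigabytes of resident memory, so 0.16% CPU and ~70 MiB suggests the model process is idle or has exited rather than being throttled. If you want to rule out resource limits anyway, here is a minimal sketch using standard Docker CLI flags (the container name dalai-dalai-1 is copied from the stats output above; the limit values are only examples):

# Check the limits currently applied to the running container
docker inspect --format '{{.HostConfig.NanoCpus}} {{.HostConfig.Memory}}' dalai-dalai-1

# Raise CPU and memory limits in place (adjust values to your host)
docker update --cpus 4 --memory 8g --memory-swap 8g dalai-dalai-1

# Note: on Windows and macOS the Docker Desktop VM caps total memory;
# if the ~7.7GiB limit shown by "docker stats" is too small, raise it
# under Docker Desktop > Settings > Resources.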
nationallokmatparty commented 1 year ago

Can you confirm that the project is working via Docker? If yes, what steps did you take to solve the issue?

nyck33 commented 1 year ago

[screenshot]

Same for me, just tried for the first time and it seems super slow.

NickHatBoecker commented 1 year ago

Same here on my MacBook Pro (Apple M1 Pro) with macOS Monterey (12.6.3).

I selected the "ai-dialog" template in the web UI and clicked "Go". It has now been running for hours.

With debug on:

main: seed = 1687792550
llama_model_load: loading model from 'models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: invalid model file 'models/7B/ggml-model-q4_0.bin' (bad magic)
main: failed to load model from 'models/7B/ggml-model-q4_0.bin'
root@506e82aeadad:~/dalai/alpaca# exit
exit
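The "bad magic" line means llama.cpp rejected models/7B/ggml-model-q4_0.bin at load time because its header does not match any ggml format it understands, typically the result of an interrupted download or an incompatible file. Since the model never loads, the process does almost no work, which would also explain the near-zero CPU and memory usage reported above. A minimal way to check and recover, sketched under the assumption that the container uses dalai's standard layout under ~/dalai/alpaca (the install command below is the one from dalai's README; re-run it only if the file size looks wrong):

# A healthy 7B q4_0 model file is roughly 4 GB; a file of only a few
# KB or MB usually means the download failed or was truncated.
ls -lh models/7B/ggml-model-q4_0.bin

# Remove the broken file and re-download the model
rm models/7B/ggml-model-q4_0.bin
npx dalai alpaca install 7B

Depending on how the Docker setup was built, the re-download may need to run inside the container (as in the shell shown above) or on the host volume that is mounted into it.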