rhohndorf / Auto-Llama-cpp

Uses Auto-GPT with Llama.cpp
MIT License
384 stars 68 forks source link

Memory Error -- shapes (0,8192) and (5120,) not aligned: 8192 (dim 1) != 5120 (dim 0) #11

Closed jchen-sinai closed 1 year ago

jchen-sinai commented 1 year ago

After the "Thinking..." step, I got the following error (on an Ubuntu 22.04 VM):

```
Using memory of type: LocalCache | Thinking...
llama_print_timings:        load time =   629.49 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =   629.34 ms /     2 tokens (  314.67 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =   629.68 ms
Traceback (most recent call last):
  File "/data/Auto-Llama-cpp/scripts/main.py", line 331, in <module>
    assistant_reply = chat.chat_with_ai(
  File "/data/Auto-Llama-cpp/scripts/chat.py", line 77, in chat_with_ai
    relevant_memory = permanent_memory.get_relevant(str(full_message_history[-5:]), 10)
  File "/data/Auto-Llama-cpp/scripts/memory/local.py", line 105, in get_relevant
    scores = np.dot(self.data.embeddings, embedding)
  File "<__array_function__ internals>", line 5, in dot
ValueError: shapes (0,8192) and (5120,) not aligned: 8192 (dim 1) != 5120 (dim 0)
```

InfernalDread commented 1 year ago

You have to edit line 34 of the .env file, changing `EMBED_DIM=8192` to the embedding dimension your model actually uses, which in this case is 5120.
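To see why the mismatch produces exactly this error, here is a minimal NumPy reproduction. The memory cache is an empty `(0, EMBED_DIM)` matrix, and `get_relevant` computes a dot product against the model's embedding vector; the values below (8192 vs. 5120) come from the traceback, everything else is illustrative:

```python
import numpy as np

# Embeddings cache sized by EMBED_DIM=8192 (the default in .env), while the
# model actually emits 5120-dimensional embeddings.
cache = np.zeros((0, 8192), dtype=np.float32)
query = np.zeros(5120, dtype=np.float32)

try:
    np.dot(cache, query)
except ValueError as e:
    # "shapes (0,8192) and (5120,) not aligned: 8192 (dim 1) != 5120 (dim 0)"
    print(e)

# With EMBED_DIM matched to the model, the dot product works even on an
# empty cache, returning an empty score array of shape (0,).
cache = np.zeros((0, 5120), dtype=np.float32)
scores = np.dot(cache, query)
print(scores.shape)
```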

jchen-sinai commented 1 year ago

Thank you! Yes, it works. I didn't read the .env carefully :-(

InfernalDread commented 1 year ago

> Thank you! Yes it works. I did not read the .env carefully :-(

No problemo. Glad to be of assistance!

thomasfifer commented 1 year ago

Is there any reason this value has to be in the .env file? Don't we already "know" the value before we need to use it? llama.cpp prints it to the terminal when loading the model (`n_embd`), so it's at least stored somewhere. Do we have access to this value from llama.cpp, so we could grab it after loading the model but before initializing any arrays of size `EMBED_DIM`? Doing so might be a better way to solve this issue, so others don't run into this problem in the future.
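The suggestion above can be sketched as follows. Recent versions of llama-cpp-python expose the model's embedding width via `Llama.n_embd()` after loading (shown in the comment below as an assumed API); `init_memory` and `detect_embed_dim` are hypothetical helper names introduced here for illustration:

```python
import numpy as np

def init_memory(embed_dim: int) -> np.ndarray:
    """Create the empty embeddings cache sized to the model, not to .env."""
    return np.zeros((0, embed_dim), dtype=np.float32)

# Assumed wiring with llama-cpp-python, reading n_embd after the model loads:
#
#   llm = Llama(model_path=MODEL_PATH, embedding=True)
#   cache = init_memory(llm.n_embd())
#
# Alternatively, probe an embedding callable and measure its output length:
def detect_embed_dim(embed_fn) -> int:
    return len(embed_fn("probe"))

fake_embed = lambda text: [0.0] * 5120  # stand-in for llm.embed()
cache = init_memory(detect_embed_dim(fake_embed))
print(cache.shape)  # (0, 5120)
```

Either way, the cache dimension is derived from the loaded model, so a wrong `EMBED_DIM` in .env can no longer cause the shape mismatch.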

jchen-sinai commented 1 year ago

> is there any reason this value has to be in the .env file? do we not already "know" the value before we need to use it? llama cpp prints it to the terminal when loading the model (n_embd) so it's at least stored somewhere. do we have access to this value from llama cpp where we could grab it after loading the model but before initializing any arrays of size EMBED_DIM? doing so might be a better way to solve this issue so others don't run into this problem in the future

Agreed. This may also open the door to using models with different `n_embd` values, such as vicuna-13b (5120) and vicuna-7b (4096), one for SMART and the other for FAST. Thanks.