Closed · jchen-sinai closed this issue 1 year ago
You have to edit line 34 of the .env file, `EMBED_DIM = 8192`, changing it to the embedding dimension your model actually uses, which in this case is 5120.
Thank you! Yes it works. I did not read the .env carefully :-(
No problemo. Glad to be of assistance!
is there any reason this value has to be in the .env file? do we not already "know" the value before we need to use it? llama cpp prints it to the terminal when loading the model (n_embd) so it's at least stored somewhere. do we have access to this value from llama cpp where we could grab it after loading the model but before initializing any arrays of size EMBED_DIM? doing so might be a better way to solve this issue so others don't run into this problem in the future
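A minimal sketch of that idea (the class and function names here are hypothetical, not the repo's actual API; llama-cpp-python also exposes the loaded model's embedding width directly via `Llama.n_embd()`): instead of trusting an `EMBED_DIM` constant from .env, probe the embedding function once and size the cache from the result.

```python
import numpy as np

def infer_embed_dim(embed_fn) -> int:
    """Probe the embedding function once to discover the model's
    embedding width, instead of trusting a hard-coded EMBED_DIM."""
    probe = np.asarray(embed_fn("dimension probe"))
    return probe.shape[-1]

class LocalCache:
    """Simplified stand-in for scripts/memory/local.py: the embeddings
    matrix is sized from the model, so they can never disagree."""
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.dim = infer_embed_dim(embed_fn)
        self.embeddings = np.empty((0, self.dim))

    def add(self, text: str):
        vec = np.asarray(self.embed_fn(text)).reshape(1, -1)
        self.embeddings = np.concatenate([self.embeddings, vec])

# Stub embedder standing in for llama.cpp's embed call (vicuna-13b -> 5120).
fake_embed = lambda text: np.ones(5120)
cache = LocalCache(fake_embed)
print(cache.dim)  # 5120
```

With this shape, changing models never requires touching .env; the cache simply follows whatever the model reports.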
Agree. This may also open a door to use different n_embd models, such as vicuna-13b (5120) and vicuna-7b (4096), one for SMART, and the other one for FAST. Thanks.
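A sketch of that mixed-model setup (the model names and dimensions are the ones mentioned in this thread; the registry class itself is hypothetical): keep one embedding matrix per model, so a 5120-dim SMART model and a 4096-dim FAST model can never collide.

```python
import numpy as np

# Embedding widths reported by llama.cpp (n_embd) for each model tier.
MODEL_DIMS = {
    "vicuna-13b": 5120,  # SMART
    "vicuna-7b": 4096,   # FAST
}

class PerModelMemory:
    """One embedding matrix per model, sized from MODEL_DIMS,
    so mixed-dimension models each get a matching store."""
    def __init__(self):
        self.stores = {m: np.empty((0, d)) for m, d in MODEL_DIMS.items()}

    def add(self, model: str, vec):
        vec = np.asarray(vec).reshape(1, -1)
        expected = MODEL_DIMS[model]
        if vec.shape[1] != expected:
            raise ValueError(f"{model} expects dim {expected}, got {vec.shape[1]}")
        self.stores[model] = np.concatenate([self.stores[model], vec])

mem = PerModelMemory()
mem.add("vicuna-13b", np.zeros(5120))
print(mem.stores["vicuna-13b"].shape)  # (1, 5120)
```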
After the "Thinking..." step, I got the following error (on an Ubuntu 22.04 VM):
```
Using memory of type: LocalCache
| Thinking...
llama_print_timings:        load time =   629.49 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =   629.34 ms /     2 tokens (  314.67 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =   629.68 ms
Traceback (most recent call last):
  File "/data/Auto-Llama-cpp/scripts/main.py", line 331, in <module>
    assistant_reply = chat.chat_with_ai(
  File "/data/Auto-Llama-cpp/scripts/chat.py", line 77, in chat_with_ai
    relevant_memory = permanent_memory.get_relevant(str(full_message_history[-5:]), 10)
  File "/data/Auto-Llama-cpp/scripts/memory/local.py", line 105, in get_relevant
    scores = np.dot(self.data.embeddings, embedding)
  File "<__array_function__ internals>", line 5, in dot
ValueError: shapes (0,8192) and (5120,) not aligned: 8192 (dim 1) != 5120 (dim 0)
```
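For anyone hitting this: the traceback is exactly the mismatch discussed earlier in the thread. The cache matrix was created with the .env value (8192 columns) while the model emits 5120-dim vectors, so `np.dot` refuses the product. A minimal reproduction, with the dimensions taken from the traceback:

```python
import numpy as np

cache = np.empty((0, 8192))   # matrix shaped by EMBED_DIM from .env
embedding = np.ones(5120)     # vector actually produced by vicuna-13b

try:
    np.dot(cache, embedding)  # 8192 (dim 1) != 5120 (dim 0)
except ValueError as e:
    print(e)

# With the matrix sized to the model's real n_embd, the product works:
cache = np.empty((0, 5120))
scores = np.dot(cache, embedding)
print(scores.shape)  # (0,)
```

The fix for this particular traceback is the one given at the top of the thread: set EMBED_DIM in .env to 5120 (or derive it from the model at load time).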