obito opened this issue 1 year ago
@obito Did you just post your openai key here?
It was deleted a long time ago.
Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!
_but..._ I can also be funny or helpful :smile_cat: and I can generally provide good tips or point out places to look in the documentation or the code, based on what you wrote in the issue.
Don't engage in conversation with me, I don't support replying (yet)!
The error you are getting is due to a lack of memory in the context. This can happen if the model is too big or if there is not enough memory available on your machine. To fix this, you can try reducing the batch size or increasing the amount of memory available on your machine.
To reduce the batch size, you can set the `max_input_size` parameter of the prompt helper to a smaller value. For example, with `max_input_size=200` it will only process 200 tokens at a time instead of the default 400.

To increase the amount of memory available on your machine, you can try raising `CONTEXT_SIZE` in the Llama configuration file or running with more threads (`THREADS`). However, this may also require upgrading your hardware to handle the increased memory usage.
Sources:
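To make the `max_input_size` suggestion above concrete, here is a minimal sketch of how it could be plugged into llama_index. It assumes an older llama_index release that still exposes `PromptHelper`/`ServiceContext`, a LocalAI server on the default `http://localhost:8080/v1` endpoint, and placeholder names (the `docs` folder, the index class); check these against your installed version rather than treating them as the exact API.

```python
# Minimal sketch, not the reporter's actual code. Assumptions: an older
# llama_index release that still exposes PromptHelper / ServiceContext,
# a LocalAI server on http://localhost:8080, and a "docs" folder holding the PDF.
import os

from llama_index import (
    GPTVectorStoreIndex,   # index class name differs between llama_index versions
    PromptHelper,
    ServiceContext,
    SimpleDirectoryReader,
)

# Point the OpenAI-compatible client at the local LocalAI server.
os.environ["OPENAI_API_KEY"] = "not-needed"            # LocalAI ignores the value
os.environ["OPENAI_API_BASE"] = "http://localhost:8080/v1"

# A smaller max_input_size keeps each request well inside the model's context,
# which reduces pressure on ggml's memory pool.
prompt_helper = PromptHelper(
    max_input_size=200,    # value suggested above; signature varies by version
    num_output=256,
    max_chunk_overlap=20,
)
service_context = ServiceContext.from_defaults(prompt_helper=prompt_helper)

# Load the document(s), build the vector index, then query it.
documents = SimpleDirectoryReader("docs").load_data()
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))
```

The `CONTEXT_SIZE` and `THREADS` settings mentioned above are configured on the LocalAI server side (in its environment or model configuration), not in the Python client.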
**LocalAI version:** Latest

**Environment, CPU architecture, OS, and Version:** Darwin macbook-2.local 22.4.0 Darwin Kernel Version 22.4.0: Mon Mar 6 21:00:41 PST 2023; root:xnu-8796.101.5~3/RELEASE_ARM64_T8103 arm64 (M1 MacBook Air)

**Describe the bug**
I'm trying to use Llama_Index to construct a vector index of a document (a PDF in my case) and use it as a query engine. I'm getting this error from LocalAI:
`ggml_bert_new_tensor_impl: not enough space in the context's memory pool (needed 271388624, available 260703040)`
**To Reproduce**

**Expected behavior**

**Logs**
Too long to include in the issue; here is a link to the txt: https://hastebin.com/share/ukuvaranaw.rust

**Additional context**
I have 16 GB of RAM. Is there a way to avoid this error without having to wait 30 minutes for the embeddings to finish?
This is my Python code: