Closed · mudler closed this issue 1 year ago
Just curious: what is the use/purpose of the embeddings above?
For the following retrieval-augmented QA use case: https://blog.langchain.dev/tutorial-chatgpt-over-your-data/
Can't we use the following embedding models? I plan to use gpt4all-j
with one of the following embedding models.
Please advise. Thank you.
Embeddings support has been merged to master. It is experimental and currently available only for llama.cpp-based
models, so any feedback is more than welcome!
To enable it, set `embeddings: true`
in the model's YAML config file.
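A minimal model config might look like the sketch below. The file name, model name, and GGML file name are illustrative; `embeddings: true` is the flag mentioned above:

```yaml
# models/my-llama.yaml -- illustrative name and paths
name: my-llama            # the model name you pass in API requests
backend: llama            # embeddings are currently llama.cpp-only
parameters:
  model: ggml-model.bin   # illustrative GGML weights file in the models directory
embeddings: true          # enable the experimental embeddings support
```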
I've published a sample using embeddings over here: https://github.com/go-skynet/LocalAI/tree/master/examples/query_data
Further optimizations landed in https://github.com/go-skynet/LocalAI/pull/222 - embeddings can now be generated with bert alongside any model, and there is a big performance improvement as well!
Hello! I am trying to run a gpt4all-j model to build a local chatbot. How can I use BERT embeddings and wire them into the chat completions endpoint?
Currently I am running it on a Mac Mini (i7, 32 GB RAM). I plan to upgrade to a cloud server with more resources (vRAM) in the future. Is it possible to build a fast chatbot API using my own document embeddings?
https://github.com/go-skynet/LocalAI/tree/master/examples/query_data
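The retrieval step behind such a chatbot can be sketched independently of any backend: embed the documents once, embed the user question, and rank documents by cosine similarity. The helper names below are illustrative, not part of LocalAI:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    # Return the indices of the k documents most similar to the query.
    scored = sorted(enumerate(doc_vecs),
                    key=lambda pair: cosine(query_vec, pair[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]
```

The selected documents would then be pasted into the chat completion prompt as context, which is essentially what the query_data example automates.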
Thank you for the example! But can't it be done through the API? Currently I think you run those commands inside the container, right? Is there already a scenario where calling a certain path executes the query over the documents?
Add embeddings support to the API and the llama backend: https://github.com/ggerganov/llama.cpp/blob/e4422e299c10c7e84c8e987770ef40d31905a76b/llama.cpp#L2160
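Since LocalAI exposes an OpenAI-compatible API, the embeddings endpoint can be called over HTTP rather than from inside the container. A minimal sketch, assuming LocalAI listens on `localhost:8080` and the model name `bert-embeddings` exists in your config (both are assumptions here):

```python
import json
import urllib.request

def build_embedding_request(base_url, model, texts):
    # Build an OpenAI-style POST /v1/embeddings request (LocalAI mirrors this API).
    body = json.dumps({"model": model, "input": texts}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
    )

# Usage (assumes a running LocalAI instance):
# req = build_embedding_request("http://localhost:8080", "bert-embeddings", ["hello world"])
# with urllib.request.urlopen(req) as resp:
#     vectors = [d["embedding"] for d in json.load(resp)["data"]]
```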