Closed JoshuaFurman closed 2 months ago
Looks as though you're boxed in to using either the ADA embeddings from OpenAI, MiniLM, or Cohere as well...
Yes, right now there is no support for Mixtral models, but great point! We'll look into that for the next update.
I got this working end to end, but I had to make some changes to use my custom embedding model server. I submitted a PR with the changes needed to use an OpenAI-compatible API server for both embeddings and the LLM: https://github.com/weaviate/Verba/pull/148
I plan to publish an end-to-end tutorial that installs Verba, Weaviate, an LLM, and an embedding model server all within the same K8s cluster. Stay tuned!
I finished writing my guide for an end-to-end private Verba RAG setup using Weaviate, Lingo, vLLM + Mistral 7b v2, and Sentence Transformers: https://www.substratus.ai/blog/lingo-weaviate-private-rag
Looking forward to hearing feedback. The guide should also help you figure out how to use vanilla vLLM with Verba.
In my lab environment I am serving Mixtral with vLLM using their OpenAI-compatible API server, and I'm hosting a Weaviate instance as well.
I just spun up Verba, pointing it to both my Weaviate instance and my vLLM instance via the .env file, and the connection to Weaviate seems fine: I can see my schema and object count in the status tab. However, any queries I make seem to break... I'm unsure whether this is a limitation of only supporting GPT-3.5 or GPT-4 served from OpenAI.
Has anyone been able to configure a setup like this?
Thanks!
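For anyone debugging a similar setup: vLLM's OpenAI-compatible server exposes the standard `/v1/chat/completions` route, and the `model` field in the request must match the model name vLLM was launched with, which is one common reason queries break while the Weaviate connection looks healthy. Below is a minimal sketch of the kind of request a client like Verba would send; the base URL and model identifier are assumptions for illustration, not Verba's actual internals:

```python
import json

# Assumed address of a local vLLM OpenAI-compatible server (illustrative only).
VLLM_BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, question: str, context: str) -> dict:
    """Build an OpenAI-style chat completion payload for a RAG query."""
    return {
        # Must match the --model name vLLM was started with,
        # e.g. not "gpt-3.5-turbo" if the client hardcodes OpenAI model names.
        "model": model,
        "messages": [
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        "temperature": 0.0,
    }

payload = build_chat_request(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed model identifier
    "What does Verba do?",
    "Verba is a RAG application built on Weaviate.",
)
# This payload would be POSTed to f"{VLLM_BASE_URL}/chat/completions".
print(json.dumps(payload, indent=2))
```

Comparing a payload like this against what the client actually sends (e.g. by watching the vLLM server logs) can reveal whether a hardcoded OpenAI model name is the culprit.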