QuivrHQ / quivr

Open-source RAG framework for building GenAI Second Brains 🧠 Build a productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 Turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq, that you can share with users! An efficient retrieval augmented generation framework.
https://quivr.com

feat: introducing Ollama embeddings properties. #2690

Open mkhludnev opened 2 weeks ago

mkhludnev commented 2 weeks ago

Description

Ollama embeddings should be properly configurable via these props. Currently, only base_url is passed to OllamaEmbeddings, which causes the following issues:

This change lets users configure embedding models hosted in Ollama.
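
For illustration, a rough sketch of how these props could be wired into OllamaEmbeddings (the OLLAMA_API_BASE_URL name and the helper below are only assumptions; the OLLAMA_EMBEDDINGS_* names match the env example further down in this thread):

import os
from langchain_community.embeddings import OllamaEmbeddings

def build_ollama_embeddings() -> OllamaEmbeddings:
    # Besides base_url, forward the model and instruction prefixes from the environment.
    # OLLAMA_API_BASE_URL is an assumed variable name; the fallbacks below are
    # langchain's own defaults for OllamaEmbeddings.
    return OllamaEmbeddings(
        base_url=os.getenv("OLLAMA_API_BASE_URL", "http://localhost:11434"),
        model=os.getenv("OLLAMA_EMBEDDINGS_MODEL", "llama2"),
        embed_instruction=os.getenv("OLLAMA_EMBEDDINGS_DOC_INSTRUCT", "passage: "),
        query_instruction=os.getenv("OLLAMA_EMBEDDINGS_QUERY_INSTRUCT", "query: "),
    )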

Checklist before requesting a review

Please delete options that are not relevant.

Screenshots (if appropriate):

vercel[bot] commented 2 weeks ago

Someone is attempting to deploy a commit to the Quivr-app Team on Vercel.

A member of the Team first needs to authorize it.

StanGirard commented 2 weeks ago

Thanks a lot for this PR! I'm on holiday (though still looking at PRs), so @AmineDiro will review this one ;)

filipe-omnix commented 1 week ago

Hi, could you please provide an example of the changes to the env? (the OLLAMA_EMBEDDINGS_* part, please). Thanks in advance!

mkhludnev commented 1 week ago

an example of the changes to the env? (the OLLAMA_EMBEDDINGS_* part, please)

Fair! Here are my props:

OLLAMA_EMBEDDINGS_MODEL=chatfire/bge-m3:q8_0 # just because we deployed this embeddings model, choose yours
OLLAMA_EMBEDDINGS_DOC_INSTRUCT=              # just because there are certain values by default https://github.com/langchain-ai/langchain/blob/c314222796798545f168f6ff7e750eb24e8edd51/libs/community/langchain_community/embeddings/ollama.py#L40
OLLAMA_EMBEDDINGS_QUERY_INSTRUCT=            # but instructions are not necessary for bge-m3 see faq#2 https://huggingface.co/BAAI/bge-m3#faq
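
If it helps, a quick sanity check of these values outside Quivr (a rough sketch; the OLLAMA_API_BASE_URL name is an assumption, and the model must already be pulled on that Ollama server):

import os
from langchain_community.embeddings import OllamaEmbeddings

emb = OllamaEmbeddings(
    base_url=os.environ["OLLAMA_API_BASE_URL"],  # assumed name of the Ollama endpoint variable
    model=os.environ["OLLAMA_EMBEDDINGS_MODEL"],
    embed_instruction=os.getenv("OLLAMA_EMBEDDINGS_DOC_INSTRUCT", ""),
    query_instruction=os.getenv("OLLAMA_EMBEDDINGS_QUERY_INSTRUCT", ""),
)
print(len(emb.embed_query("hello")))  # bge-m3 should report a 1024-dimensional vector
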
mkhludnev commented 2 days ago

@filipe-omnix can you confirm if this patch is useful for you?

andyzhangwp commented 2 days ago

@mkhludnev

I applied the above update, but still encountered an error during local testing: {"error": "model 'llama2' not found, try pulling it first"}

The following are debugging logs:

1. get_embeddings of models/setting.py, model is llama3:

DEBUG:httpcore.http11:response_closed.complete
backend-core | ======get_embeddings=====embeddings=[base_url='http://33a45d4e.r11.cpolar.top' model='llamafamily/llama3-chinese-8b-instruct' embed_instruction='passage:' query_instruction='query:' mirostat=None mirostat_eta=None mirostat_tau=None num_ctx=None num_gpu=None num_thread=None repeat_last_n=None repeat_penalty=None temperature=None stop=None tfs_z=None top_k=None top_p=None show_progress=False headers=None model_kwargs=None]

Here you can see that the model is llama3, indicating that the configuration is valid.

2. similarity_search of vectorstore/supabase.py:

====111======similarity_search=====self._embedding=[base_url='http://33a45d4e.r11.cpolar.top' model='llama2' embed_instruction='passage: ' query_instruction='query: ' mirostat=None mirostat_eta=None mirostat_tau=None num_ctx=None num_gpu=None num_thread=None repeat_last_n=None repeat_penalty=None temperature=None stop=None tfs_z=None top_k=None top_p=None show_progress=False headers=None model_kwargs=None]

The model here has reverted to llama2, so the previously configured embeddings are not being used.

3. Error log:

backend-core |   File "/code/vectorstore/supabase.py", line 76, in similarity_search
backend-core |     vectors = self._embedding.embed_documents([query])
backend-core |   File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 211, in embed_documents
backend-core |     embeddings = self._embed(instruction_pairs)
backend-core |   File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 199, in _embed
backend-core |     return [self._process_emb_response(prompt) for prompt in iter_]
backend-core |   File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 199, in <listcomp>
backend-core |     return [self._process_emb_response(prompt) for prompt in iter_]
backend-core |   File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 173, in _process_emb_response
backend-core |     raise ValueError(
backend-core | ValueError: Error raised by inference API HTTP code: 404, {"error":"model 'llama2' not found, try pulling it first"}
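
Judging from these logs, the embeddings instance returned by get_embeddings is apparently not the one the vector store holds: similarity_search uses self._embedding, which seems to have been built with langchain's default model 'llama2'. A hypothetical illustration of the fix, not Quivr's actual wiring (supabase_client, the table name, and the query name below are assumptions):

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import SupabaseVectorStore

embeddings = OllamaEmbeddings(
    base_url="http://33a45d4e.r11.cpolar.top",
    model="llamafamily/llama3-chinese-8b-instruct",  # the model you actually configured
)

# The store keeps whatever embeddings instance it was constructed with, so the
# configured instance has to be passed in here; otherwise similarity_search
# falls back to a default OllamaEmbeddings(model="llama2").
vector_store = SupabaseVectorStore(
    client=supabase_client,      # assumed to exist
    embedding=embeddings,
    table_name="vectors",        # assumed table name
    query_name="match_vectors",  # assumed RPC name
)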