shuther opened 4 months ago
Hello, the encoder model is not the GenerativeAI model. The encoder model takes text and generates a fixed-size vector of numbers as output, and we only support HuggingFace encoders at the moment.
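To illustrate the distinction (a toy sketch, not Danswer's actual code): an encoder maps text to a fixed-size vector, while a generative model maps text to more text. The hash-based "embedding" below is fake and only shows the input/output shapes:

```python
import hashlib

# Toy stand-in for an encoder: text in, fixed-size vector of floats out.
# Real encoders (e.g. sentence-transformers models) learn these vectors;
# the hash trick here is purely illustrative.
def toy_encode(text: str, dim: int = 8) -> list[float]:
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

vec = toy_encode("What is Danswer?")
print(len(vec))  # prints 8: fixed dimensionality, regardless of input length
```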
For using Ollama Mistral you'll want to follow this guide: https://docs.danswer.dev/gen_ai_configs/ollama
Will also just mention that your experience will be significantly better with GPT-4 or GPT-4-Turbo, so do try those out if you get a chance!
This is what I did, so I'm not sure which step I missed. I also understood that this refers to the encoder, not the GenerativeAI model (and that the experience could be worse), but I expected Ollama to be able to run the embedding (see: https://python.langchain.com/docs/integrations/text_embedding/ollama). With curl I was at least able to generate the vector; see my ticket on Ollama. Are you saying that embedding today only works through OpenAI?
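For reference, this is the shape of the call I was making (a sketch assuming Ollama's `/api/embeddings` endpoint on the default port 11434; the request is only built here, not sent):

```python
import json
import urllib.request

# Build the POST that Ollama's embeddings endpoint expects
# (assumed: http://localhost:11434/api/embeddings, body {"model", "prompt"}).
def build_embeddings_request(model: str, prompt: str) -> urllib.request.Request:
    body = json.dumps({"model": model, "prompt": prompt}).encode()
    return urllib.request.Request(
        "http://localhost:11434/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_embeddings_request("mistral", "hello world")
# With a running Ollama server, urllib.request.urlopen(req) returns a JSON
# body containing an "embedding" field with a list of floats.
```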
Hello! I see, we don't currently support Ollama embeddings. We also don't use OpenAI embeddings. We use locally running models using the sentence transformers library: https://huggingface.co/sentence-transformers.
Would love to know more about your use case and if Ollama embedding is critical for your deployment, please DM me/Chris in our Slack: https://join.slack.com/t/danswer/shared_invite/zt-2afut44lv-Rw3kSWu6_OmdAXRpCv80DQ
For scalability reasons, running the embeddings within the main platform is a problem: by itself, Danswer doesn't need a GPU, and running embeddings is a low priority compared to responsiveness. Maybe using Infinity as an API would solve the problem, since we could run it on the same machine or on a different one?
While I set DOCUMENT_ENCODER_MODEL to mistral (should it be ollama/mistral?), Danswer still thinks it should load the model from HF. Is there a way to force it to connect to an external endpoint?
Error in the indexing screen:
Full trace: