[BUG]: Embedding ollama+llama2-chinese , chinese docx failed

Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.

https://anythingllm.com

MIT License


[BUG]: Embedding ollama+llama2-chinese , chinese docx failed #1002

Closed wilsonlv closed 6 months ago

wilsonlv commented 7 months ago

How are you running AnythingLLM?

Docker (local)

What happened?

[Two screenshots attached: 微信图片_20240401162520, 微信图片_20240401162525]

Are there known steps to reproduce?

No response

timothycarambat commented 7 months ago

This seems to indicate that the Ollama embedder returned a zero-length vector. Can you first confirm manually that this embedder is processing text chunks properly?
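One way to run that manual check is to call Ollama's embeddings endpoint directly and look at the length of the returned vector. The following is a minimal sketch, not AnythingLLM's own code. It assumes Ollama is reachable at its default address `http://localhost:11434` (from inside a Docker container the host is likely different) and that the model named is already pulled. The helper names `build_embedding_request`, `fetch_embedding`, `is_usable`, and `check_samples` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust when running inside Docker.
OLLAMA_URL = "http://localhost:11434"


def build_embedding_request(model, text, base_url=OLLAMA_URL):
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )


def fetch_embedding(model, text, base_url=OLLAMA_URL):
    """Send the request and return the embedding list (empty if absent)."""
    request = build_embedding_request(model, text, base_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("embedding", [])


def is_usable(vector):
    """A zero-length vector cannot be stored in a vector database."""
    return len(vector) > 0


def check_samples(model, samples, fetch=fetch_embedding):
    """Return a mapping of sample text to the length of its embedding."""
    return {text: len(fetch(model, text)) for text in samples}
```

Calling `check_samples("llama2-chinese", ["Hello, world.", "你好，世界。"])` against a running Ollama instance would show whether the Chinese sample comes back with length 0 while the English one does not.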

wilsonlv commented 7 months ago

Yes, if I add an English docx, it succeeds.

timothycarambat commented 6 months ago

Closing as wontfix. I did not notice earlier, but you have an LLM chat model, llama2-chinese, set as your embedding model. LLMs cannot embed text. Please use an embedding model such as nomic-embed-text or mxbai-embed-large for Chinese-language embedding support.