Closed: joshlong closed this issue 4 months ago
@joshlong unfortunately, Ollama uses a fixed embedding size of 4096 and it's not currently possible to customise the value. There is a feature request to do that: https://github.com/ollama/ollama/issues/651. I hope Ollama addresses this issue ASAP!
I have opened a PR to improve the Spring AI PGvector documentation and mention the limitation of the HNSW indexing strategy, which can't support dimensionality above 2000: https://github.com/spring-projects/spring-ai/pull/825/files
There is also an ongoing discussion in the PGvector project to raise the limit to at least 4096: https://github.com/pgvector/pgvector/issues/461.
My current workaround to get applications running fully locally is to use Ollama for the chat model and one of the ONNX Transformers models for the embedding model: https://docs.spring.io/spring-ai/reference/api/embeddings/onnx.html
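As a rough sketch of that split setup (property names and the default dimensionality are assumptions based on the Spring AI docs linked above, so verify them against your Spring AI version), you would keep Ollama for chat, disable its embedding model, and let the ONNX Transformers starter provide embeddings:

```yaml
# Hedged sketch: Ollama for chat, ONNX Transformers for embeddings.
# Property names here are assumptions; check your Spring AI version's reference docs.
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        model: llama3.1:8b
      embedding:
        enabled: false   # let the ONNX Transformers model handle embeddings instead
    vectorstore:
      pgvector:
        dimensions: 384  # assumed output size of the starter's default all-MiniLM model; must match your embedding model
```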
Thanks @ThomasVitale Closing the issue
In case it's useful to people landing on this issue and having the same problem, I'll share another tip to use Ollama for embeddings paired with PGVector.
The multi-purpose models in the Ollama library (like `mistral` and `llama3.1`) seem to have been configured with 4096 dimensionality, which cannot be changed as of now and doesn't work with PGVector.
However, dedicated embedding models like `nomic-embed-text` (full list here: https://ollama.com/search?q=&c=embedding) are configured with different dimensionality. In many cases it's lower than 2000, so they can be used with PGVector (and Spring AI lets you configure the dimensions via `spring.ai.vectorstore.pgvector.dimensions`).
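For example, pairing `nomic-embed-text` with PGVector might look like the fragment below (the 768 dimensionality is an assumption about that model's output size, so verify it for the exact model you pull; the vector store dimension must match the embedding model's output):

```yaml
# Hedged sketch: a dedicated Ollama embedding model paired with PGVector.
spring:
  ai:
    ollama:
      embedding:
        model: nomic-embed-text   # dedicated embedding model, below the 2000-dim HNSW limit
    vectorstore:
      pgvector:
        dimensions: 768           # assumed output size of nomic-embed-text; verify for your model
```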
@joshlong - I did `ollama pull all-minilm`, then used this in my properties file:
```yaml
ai:
  ollama:
    base-url: http://${OLLAMA_HOST}:11434
    chat:
      model: llama3.1:8b
    embedding:
      enabled: true
      model: all-minilm
```
You can find the code here: https://github.com/ByteworksHomeLab/spring-ai-lab
Thanks all
@smitchell, like below?
```yaml
ai:
  ollama:
    base-url: http://${OLLAMA_HOST}:11434
    chat:
      model: llama3.1:8b
    embedding:
      enabled: true
      model: nomic-embed-text
```
Hi, I am trying to use the Ollama `EmbeddingModel` with `PgVectorStore` and it's failing; the same application works with OpenAI.
here's a sample https://github.com/joshlong/bootiful-spring-boot-2024/tree/main/service
and there's a PgVector Docker Compose file here https://github.com/joshlong/bootiful-spring-boot-2024/blob/main/compose.yaml
help, please