spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/index.html
Apache License 2.0

odd issue creating embeddings with Ollama #840

Closed joshlong closed 4 months ago

joshlong commented 5 months ago

Hi, I am trying to use the Ollama EmbeddingModel with PgVectorStore and it's failing with

Caused by: org.springframework.jdbc.UncategorizedSQLException: StatementCallback; uncategorized SQLException for SQL [CREATE INDEX IF NOT EXISTS spring_ai_vector_index ON vector_store USING HNSW (embedding vector_cosine_ops)
]; SQL state [XX000]; error code [0]; ERROR: column cannot have more than 2000 dimensions for hnsw index
    at org.springframework.jdbc.core.JdbcTemplate.translateException(JdbcTemplate.java:1549) ~[spring-jdbc-6.1.8.jar:6

the same application works with OpenAI

here's a sample https://github.com/joshlong/bootiful-spring-boot-2024/tree/main/service

and there's a PgVector Docker Compose file here https://github.com/joshlong/bootiful-spring-boot-2024/blob/main/compose.yaml

help, please

ThomasVitale commented 5 months ago

@joshlong unfortunately, Ollama has a fixed 4096 size for the embeddings and it's not currently possible to customise the value. There is a feature request to do that: https://github.com/ollama/ollama/issues/651 I hope Ollama addresses this issue asap!

I have opened a PR to improve the Spring AI PGvector documentation and mention the limitation of the HNSW indexing strategy, which can't support dimensionality above 2000: https://github.com/spring-projects/spring-ai/pull/825/files

There is also an ongoing discussion in the PGvector project to raise the limit to at least 4096: https://github.com/pgvector/pgvector/issues/461.

My current workaround to get applications running fully locally is to use Ollama for the chat model and one of the ONNX Transformers for the embedding model: https://docs.spring.io/spring-ai/reference/api/embeddings/onnx.html

markpollack commented 4 months ago

Thanks, @ThomasVitale. Closing the issue.

ThomasVitale commented 3 months ago

In case it's useful to people landing on this issue and having the same problem, I'll share another tip to use Ollama for embeddings paired with PGVector.

The multi-purpose models in the Ollama library (like mistral and llama3.1) seem to have been configured with 4096 dimensionality, which cannot be changed as of now, and doesn't work with PGVector.

However, dedicated embedding models like nomic-embed-text (full list here: https://ollama.com/search?q=&c=embedding) are configured with different dimensionality. In many cases it's lower than 2000, and those models can be used with PGVector and its HNSW index (Spring AI lets you configure the dimensions via spring.ai.vectorstore.pgvector.dimensions).
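
For anyone who wants to see it spelled out, here is a minimal sketch of what that configuration could look like in application.yaml. The value 768 is an assumption about the output size of nomic-embed-text, so check the dimensionality of whichever model you pull. The index-type and distance-type entries just match the HNSW index and cosine operator class that appear in the error message at the top of this issue.

```yaml
spring:
  ai:
    ollama:
      embedding:
        # dedicated embedding model instead of a multi-purpose chat model
        model: nomic-embed-text
    vectorstore:
      pgvector:
        # assumption: nomic-embed-text produces 768-dimensional vectors;
        # the value must match the model and stay at or below 2000 for HNSW
        dimensions: 768
        index-type: HNSW
        distance-type: COSINE_DISTANCE
```

If the vector_store table was already created with a different dimension, changing the property alone probably won't be enough, and the table would likely need to be recreated.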

smitchell commented 3 months ago

@joshlong - I did "ollama pull all-minilm" then used this in my properties file:

spring:
  ai:
    ollama:
      base-url: http://${OLLAMA_HOST}:11434
      chat:
        model: llama3.1:8b
      embedding:
        enabled: true
        model: all-minilm

You can find the code here: https://github.com/ByteworksHomeLab/spring-ai-lab

joshlong commented 3 months ago

Thanks all

Hyun-June-Choi commented 1 month ago

@smitchell

like below?

ai:
  ollama:
    base-url: http://${OLLAMA_HOST}:11434
    chat:
      model: llama3.1:8b
    embedding:
      enabled: true
      model: nomic-embed-text