Closed kevintanhongann closed 5 days ago
Hi @kevintanhongann, by default the dimension value is set to 1536. You can provide a value according to your needs, but you must provide it in the constructor. Here is an example: new PgVectorStore(jdbcTemplate, embeddingClient, 4096);
@ricken07 thanks for the info.
@ricken07 it doesn't seem like the constructor registers the dimensions. It still outputs the same error. I tried both 1536 and 4096.
@kevintanhongann you need to rebuild your database for the new dimensions value to take effect, or manually change the vector dimensions of the embedding column in the database.
Note: By default, the dimensions only take effect the first time the vector_store table is created.
@kevintanhongann
If you are using the PGVectorStore Boot Starter, then you can set the spring.ai.vectorstore.pgvector.dimensions=4096 property.
Also, the PgVectorStore implementation will retrieve the dimensions from the EmbeddingClient if they are not explicitly provided.
But as mentioned above, once you have created your vector_store table, the embedding column has a fixed dimension size and you will likely have to re-create your table to change it.
It is advised to do this manually, but you can also try the spring.ai.vectorstore.pgvector.remove-existing-vector-store-table=true property. Just don't forget to remove the property afterwards!
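For anyone re-creating the table by hand, here is a minimal sketch of the statements involved. It is an illustration only: VectorStoreDdlSketch and recreateStatements are hypothetical names, not part of Spring AI. The column layout is copied from the \d vector_store output and the CREATE INDEX statement quoted elsewhere in this thread, and it assumes the pgvector and uuid-ossp extensions are already installed. Check it against the schema your Spring AI version actually creates before running anything.

```java
import java.util.List;

// Hypothetical helper (not part of Spring AI): builds the SQL needed to
// re-create the vector_store table with a different embedding dimension.
final class VectorStoreDdlSketch {

    // Returns the statements in the order they should be executed.
    static List<String> recreateStatements(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive: " + dimensions);
        }
        return List.of(
            // WARNING: dropping the table discards every stored embedding.
            "DROP TABLE IF EXISTS vector_store",
            // Column layout assumed from the table description in this thread.
            "CREATE TABLE vector_store ("
                + "id uuid DEFAULT uuid_generate_v4() PRIMARY KEY, "
                + "content text, "
                + "metadata json, "
                + "embedding vector(" + dimensions + "))",
            // Index statement as quoted in this thread.
            "CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops)");
    }

    public static void main(String[] args) {
        // Print the statements so they can be reviewed before being run.
        recreateStatements(1024).forEach(sql -> System.out.println(sql + ";"));
    }
}
```

The statements could then be run through psql or JdbcTemplate.execute(String). All documents have to be embedded and added again afterwards, because the old vectors are gone.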
I see that the PgVectorStore documentation is incomplete and I'm working on updating it very soon. Let me know if this helps.
@kevintanhongann, fyi I've updated the pgvector store docs: https://docs.spring.io/spring-ai/reference/0.8-SNAPSHOT/api/vectordbs/pgvector.html
@tzolov apparently the HNSW index cannot handle that many dimensions. I tried setting the value to 4096 and this is what I got:
CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops);
Caused by: liquibase.exception.DatabaseException: ERROR: column cannot have more than 2000 dimensions for hnsw index [Failed SQL: (0) CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops)]
at liquibase.executor.jvm.JdbcExecutor$ExecuteStatementCallback.doInStatement(JdbcExecutor.java:468)
at liquibase.executor.jvm.JdbcExecutor.execute(JdbcExecutor.java:77)
at liquibase.executor.jvm.JdbcExecutor.execute(JdbcExecutor.java:177)
at liquibase.database.AbstractJdbcDatabase.execute(AbstractJdbcDatabase.java:1291)
at liquibase.database.AbstractJdbcDatabase.executeStatements(AbstractJdbcDatabase.java:1273)
at liquibase.changelog.ChangeSet.execute(ChangeSet.java:744)
... 47 common frames omitted
Caused by: org.postgresql.util.PSQLException: ERROR: column cannot have more than 2000 dimensions for hnsw index
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:371)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:502)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:419)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:341)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:326)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:302)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:297)
at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:94)
at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java)
at liquibase.executor.jvm.JdbcExecutor$ExecuteStatementCallback.doInStatement(JdbcExecutor.java:462)
... 52 common frames omitted
Can I use another kind of vector index?
@tzolov
Also, somebody made a mistake in PGVectorStore class for uuid_generate_v4(). Someone must have spaced or tabbed by accident.
Created a feature request https://github.com/spring-projects/spring-ai/issues/385
@kevintanhongann as you can see, this is a PGVector issue, not a Spring AI one.
There is a PGVector open issue on the subject: Increase max vectors dimension limit for index #461
So the feature request should go there I guess.
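Until that limit changes, an application could fail fast before the migration runs instead of hitting the PSQLException above. A minimal sketch, assuming the limit of 2000 quoted in the error message; HnswDimensionGuard and its methods are hypothetical names, not part of Spring AI or PGVector:

```java
// Hypothetical guard (not part of Spring AI or PGVector): checks an embedding
// dimension against the HNSW index limit reported in the error message.
final class HnswDimensionGuard {

    // Taken from "column cannot have more than 2000 dimensions for hnsw index".
    static final int MAX_HNSW_DIMENSIONS = 2000;

    // True if a vector column of this size can carry an HNSW index.
    static boolean fitsHnswIndex(int dimensions) {
        return dimensions > 0 && dimensions <= MAX_HNSW_DIMENSIONS;
    }

    // Throws early with a readable message instead of failing during index creation.
    static void requireHnswCompatible(int dimensions) {
        if (!fitsHnswIndex(dimensions)) {
            throw new IllegalArgumentException(
                "Embedding dimension " + dimensions
                    + " is outside the range 1.." + MAX_HNSW_DIMENSIONS
                    + " supported by the HNSW index");
        }
    }

    public static void main(String[] args) {
        for (int dimensions : new int[] {1536, 2000, 4096}) {
            System.out.println(dimensions + " fits HNSW index: " + fitsHnswIndex(dimensions));
        }
    }
}
```

Such a check could be called with the configured dimensions value, or with whatever the EmbeddingClient reports, before the table and index are created.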
Regarding
Also, somebody made a mistake in PGVectorStore class for uuid_generate_v4(). Someone must have spaced or tabbed by accident.
I see that `uuid_generate_v4 ()` has an extra space, but this doesn't seem to matter, as our integration tests work as expected:
postgres=# \dt
            List of relations
 Schema |     Name     | Type  |  Owner
--------+--------------+-------+----------
 public | vector_store | table | postgres
(1 row)

postgres=# \d vector_store
                     Table "public.vector_store"
  Column   |     Type     | Collation | Nullable |      Default
-----------+--------------+-----------+----------+--------------------
 id        | uuid         |           | not null | uuid_generate_v4()
 content   | text         |           |          |
 metadata  | json         |           |          |
 embedding | vector(1536) |           |          |
Indexes:
    "vector_store_pkey" PRIMARY KEY, btree (id)
    "vector_store_embedding_idx" hnsw (embedding vector_cosine_ops)
I will remove the extra space, though apparently it is not an issue.
@kevintanhongann, if you don't want to change the Vector Store, one possible alternative is to disable the OllamaEmbeddingClient (while still using OllamaChatClient) and opt for one of the other EmbeddingClients whose dimensions are compatible with PGVector's limitations.
For example, you can disable the Ollama embedding client with the following property and add the Transformers starter as a dependency:
spring.ai.ollama.embedding.enabled=false
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-transformers-spring-boot-starter</artifactId>
</dependency>
Mind that if you add one of the boot starters that configure both a chat client and an embedding client (like OpenAI, Azure OpenAI, ...), you have to disable the chat client counterpart using the corresponding properties.
For example, for OpenAI you have to set spring.ai.openai.chat.enabled=false, which will leave only the OpenAIEmbeddingClient enabled.
Check the ref docs for the enable/disable property of other Chat Clients.
Hope this is a viable alternative?
@tzolov I will try that out ASAP. Thanks for the pointers.
The documentation for Spring AI PGVector has been updated to include the limitation of the HNSW indexes: https://docs.spring.io/spring-ai/reference/api/vectordbs/pgvector.html#_prerequisites
@kevintanhongann Can this issue be closed?
Yeah sure.
Bug description
I notice that the Neo4j vector store lets you set the embedding dimensions, but pgvector doesn't have that application properties/yml config yet.
Environment: Spring AI version 0.8.1-SNAPSHOT, Java version 21, PgVectorStore, OllamaEmbeddingClient, OllamaChatClient
Steps to reproduce: create a list of Spring AI Documents, then call vectorStore.add(springAiDocuments);