spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/index.html
Apache License 2.0
3.16k stars 786 forks

The distance field in the Document returned by vector stores should be unified #1568

Open yyqqing opened 6 days ago

yyqqing commented 6 days ago

Bug description We switched our vector store from chromadb to redis-stack. In the Document returned by vectorStore.similaritySearch, we can no longer obtain the distance value from Metadata.

Environment Spring Boot 3.3.2, Spring AI 1.0.0.SNAPSHOT, Java 17, RedisStack 6.2.6

Steps to reproduce

  1. With chromadb as the vector store, the key for the distance in the Metadata of the Document returned by vectorStore.similaritySearch is distance
  2. With redis as the vector store, the key for the distance in the Metadata of the Document returned by vectorStore.similaritySearch is vector_score
  3. Stepping into the org.springframework.ai.vectorstore.RedisVectorStore and org.springframework.ai.vectorstore.ChromaVectorStore classes shows that the values declared for DISTANCE_FIELD_NAME differ

Expected behavior The Metadata key that represents the distance should be unified and fixed, so that code does not break when switching vector stores

Minimal Complete Reproducible example See the Steps to reproduce section

Aside: I have not used Milvus, but I guess #1256 may be slightly related to this

tzolov commented 2 days ago

Hey @yyqqing , could you please update your description in English.

hYuang commented 1 day ago

When we use different vector stores, the Metadata key that holds the distance in the results of vectorStore.similaritySearch is not the same. For example, in RedisVectorStore DISTANCE_FIELD_NAME is vector_score, while in ChromaVectorStore it is distance. It is a little odd.

yyqqing commented 7 hours ago

My English is not good, so here is the bug report converted with Google Translate. I have checked it, and it describes what I meant more or less accurately:

Bug description We switched the vector store from chromadb to redis-stack. In the Document returned by vectorStore.similaritySearch, we could no longer obtain the distance value from Metadata.

Environment Spring Boot 3.3.2, Spring AI 1.0.0.SNAPSHOT, Java 17, RedisStack 6.2.6

Steps to reproduce

  1. Using chromadb as the vector store, in the Metadata of the Document returned by vectorStore.similaritySearch, distance is the key name representing the distance
  2. Using redis as the vector store, in the Metadata of the Document returned by vectorStore.similaritySearch, vector_score is the key name representing the distance
  3. Stepping into the org.springframework.ai.vectorstore.RedisVectorStore and org.springframework.ai.vectorstore.ChromaVectorStore classes respectively shows that the values of the DISTANCE_FIELD_NAME declaration are different

Expected behavior We hope that the key representing the distance in Metadata is unified and fixed to avoid code incompatibility when switching vector libraries.
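
Until the key is unified, a caller-side workaround might look like the sketch below. It is not part of Spring AI: the class name DistanceKeys and the method distanceOf are hypothetical, and the only keys it knows are the two reported in this issue (distance for Chroma, vector_score for Redis). It only papers over the key name and makes no claim that the two stores report values on the same scale.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical helper, not a Spring AI class.
public final class DistanceKeys {

    // Keys reported in this issue: "distance" (Chroma) and "vector_score" (Redis).
    private static final List<String> CANDIDATE_KEYS = List.of("distance", "vector_score");

    private DistanceKeys() {
    }

    // Returns the first candidate key's value as a double, or empty if none is usable.
    public static OptionalDouble distanceOf(Map<String, ?> metadata) {
        for (String key : CANDIDATE_KEYS) {
            Object value = metadata.get(key);
            if (value instanceof Number n) {
                return OptionalDouble.of(n.doubleValue());
            }
            // Assumption: some stores may hand the score back as text.
            if (value instanceof String s) {
                try {
                    return OptionalDouble.of(Double.parseDouble(s.trim()));
                } catch (NumberFormatException ignored) {
                    // Not numeric; fall through to the next candidate key.
                }
            }
        }
        return OptionalDouble.empty();
    }
}
```

With this in place, application code could call DistanceKeys.distanceOf(document.getMetadata()) on each Document returned by vectorStore.similaritySearch, regardless of which of the two stores is configured.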

Minimal Complete Reproducible example See the Steps to Reproduce section.

Aside: I haven’t used Milvus, but I guess #1256 may be a little bit related to this.

In other words, @hYuang said what I meant.
