spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/index.html
Apache License 2.0

The SDK of the vector database is very inflexible, unable to modify some database fields. #513

Open Warrior0x1 opened 3 months ago

Warrior0x1 commented 3 months ago

In the current version, taking the Milvus vector store as an example, I cannot specify the field names of the database collection. If I have an existing collection whose fields do not match the current hard-coded field names, I cannot query it. I strongly suggest allowing custom field names! These are the fixed names today:

    public static final String DOC_ID_FIELD_NAME = "doc_id";
    public static final String CONTENT_FIELD_NAME = "content";
    public static final String METADATA_FIELD_NAME = "metadata";
    public static final String EMBEDDING_FIELD_NAME = "embedding";

Warrior0x1 commented 3 months ago

The cost of rebuilding a vector database that has already been put into production in order to use Spring AI is enormous.

iAMSagar44 commented 3 months ago

I faced a similar issue with Azure AI Search. I had imported some data into the vector store using Azure's built-in solution for importing and chunking data into Azure AI Search. The import process produced field names that differ from the ones used by the vector store implementations in Spring AI, so a vector search through Spring AI failed. The only way I could think of to resolve the issue was to create my own implementation of the Vector Store and model the fields in code with the same names as in Azure AI Search.
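To make the workaround above more concrete, here is a minimal sketch of the renaming step a custom store would need. `FieldNameTranslator` is a hypothetical helper, not part of Spring AI or the Azure SDK, and the field names in the usage note are placeholders rather than the names Azure's import actually generates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of Spring AI or the Azure SDK. */
class FieldNameTranslator {

    // Maps a field name in the existing index to the name the application expects.
    private final Map<String, String> externalToExpected;

    FieldNameTranslator(Map<String, String> externalToExpected) {
        this.externalToExpected = Map.copyOf(externalToExpected);
    }

    /** Returns a copy of the row with mapped keys renamed; unmapped keys are kept as-is. */
    Map<String, Object> translate(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            String key = externalToExpected.getOrDefault(entry.getKey(), entry.getKey());
            out.put(key, entry.getValue());
        }
        return out;
    }
}
```

For example, `new FieldNameTranslator(Map.of("chunk", "content")).translate(row)` would expose a row's `chunk` value under `content` (both names are assumed for illustration). This only covers reading results; a full custom store would also have to use the existing names when building queries.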

markpollack commented 2 months ago

Thanks for reporting this, and sorry for the delay. @iAMSagar44, regarding the comment "The import process resulted in different field names compared to the vector store implementations in Spring AI": were only the field names different, or were the fields themselves different? I agree, we need to open up the names that are used. A simple way would be to create setters for all the current field names. What do you think?
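As a rough illustration of the setter idea, something along these lines could work. `MilvusFieldNames` is a hypothetical class name, not an existing Spring AI type; the defaults are the four constants quoted at the top of this issue, and the blank-name check is an added assumption.

```java
/** Hypothetical configuration holder; not an existing Spring AI class. */
class MilvusFieldNames {

    // Defaults mirror the constants quoted earlier in this thread.
    private String docIdFieldName = "doc_id";
    private String contentFieldName = "content";
    private String metadataFieldName = "metadata";
    private String embeddingFieldName = "embedding";

    String getDocIdFieldName() { return docIdFieldName; }
    String getContentFieldName() { return contentFieldName; }
    String getMetadataFieldName() { return metadataFieldName; }
    String getEmbeddingFieldName() { return embeddingFieldName; }

    void setDocIdFieldName(String name) { this.docIdFieldName = requireNonBlank(name); }
    void setContentFieldName(String name) { this.contentFieldName = requireNonBlank(name); }
    void setMetadataFieldName(String name) { this.metadataFieldName = requireNonBlank(name); }
    void setEmbeddingFieldName(String name) { this.embeddingFieldName = requireNonBlank(name); }

    // Rejects null or whitespace-only names so a misconfiguration fails early.
    private static String requireNonBlank(String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("field name must not be blank");
        }
        return name;
    }
}
```

Existing users would keep the current behavior through the defaults, while someone with a pre-built collection could call, say, `setContentFieldName("text")` before the store is created. How the store would receive this object is left open here.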