spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/index.html
Apache License 2.0

MilvusVectorStore embedding dimensions defaults to OpenAI size not allowing other larger models #608

Closed cc-bb-aa closed 6 months ago

cc-bb-aa commented 7 months ago

Bug description
Milvus supports embedding sizes up to 4096, and the Mistral embedding dimension is 4096 (see https://huggingface.co/spaces/mteb/leaderboard). However, MilvusVectorStore is hardcoded for the OpenAI model size and fails with: "Dimension has to be withing the boundaries 1 and 2048 (inclusively)"

Environment
Spring AI version: 0.8.1
Java version: OpenJDK 21
Vector store: Milvus 2.3.13
Model: Mistral

Steps to reproduce
Spring Boot configuration:

spring.ai.vectorstore.milvus.embedding-dimension=4096
spring.ai.vectorstore.milvus.index-type=IVF_FLAT
spring.ai.vectorstore.milvus.metric-type=cosine
spring.ai.vectorstore.milvus.database-name=default
spring.ai.vectorstore.milvus.collection-name=vector_store

Expected behavior
Setting embedding-dimension=4096 should override the default dimension. The assertion should be changed from:

Assert.isTrue(newEmbeddingDimension >= 1 && newEmbeddingDimension <= 2048, "Dimension has to be withing the boundaries 1 and 2048 (inclusively)");

to

Assert.isTrue(newEmbeddingDimension >= 1 && newEmbeddingDimension <= 4096, "Dimension has to be withing the boundaries 1 and 4096 (inclusively)");
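For illustration, the proposed change can be sketched as a standalone check in which the upper bound is a parameter rather than a literal. This is only a sketch: the class name DimensionCheck, the method validateDimension, and the exception type are hypothetical and are not taken from the Spring AI source, and the error message wording is illustrative.

```java
// Minimal sketch of a range check with a configurable upper bound.
// All names here are hypothetical; this is not the Spring AI implementation.
final class DimensionCheck {

    // Returns the dimension unchanged if it lies in [1, maxDimension],
    // otherwise throws IllegalArgumentException.
    static int validateDimension(int dimension, int maxDimension) {
        if (dimension < 1 || dimension > maxDimension) {
            throw new IllegalArgumentException(
                    "Dimension has to be within the boundaries 1 and "
                            + maxDimension + " (inclusively)");
        }
        return dimension;
    }

    public static void main(String[] args) {
        // With the bound proposed in this issue, 4096 is accepted.
        System.out.println(validateDimension(4096, 4096));

        // With the old bound of 2048, the same value is rejected.
        try {
            validateDimension(4096, 2048);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Passing the bound in keeps the check independent of any one embedding model, which is the concern raised above.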

Additionally, MilvusVectorStore should not have hardcoded OpenAI-specific properties and values, e.g. "public static final int OPENAI_EMBEDDING_DIMENSION_SIZE = 1536;"

Minimal Complete Reproducible example
ConfigServletWebServerApplicationContext : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'ragOllamaAIClient' defined in file [/app/classes/llm/RagOllamaAIClient.class]: Unsatisfied dependency expressed through constructor parameter 0: Error creating bean with name 'vectorStore' defined in class path resource [org/springframework/ai/autoconfigure/vectorstore/milvus/MilvusVectorStoreAutoConfiguration.class]: Failed to instantiate [org.springframework.ai.vectorstore.MilvusVectorStore]: Factory method 'vectorStore' threw exception with message: Dimension has to be withing the boundaries 1 and 2048 (inclusively)

tzolov commented 7 months ago

Thanks @cc-bb-aa, indeed the upper dimension boundary is incorrect. We will remove it.

Regarding this point:

Additionally MilvusVectorStore should not have hardcoded OpenAI specific properties and values i.e. "public static final int OPENAI_EMBEDDING_DIMENSION_SIZE = 1536;"

What is your suggestion? Not to have any default dimension at all?

cc-bb-aa commented 7 months ago

Hi,

Thank you for the quick response. I was more thinking that MilvusVectorStore was generic, not catering specifically to OpenAI, but can be used for other LLMs. Am I mistaken by any chance?

If I am mistaken, could you maybe suggest what can be used with Milvus and any other LLMs? Is it the case that this support is yet to be developed or have I missed something obvious? Apologies, if I have.

Many thanks!

tzolov commented 6 months ago

@cc-bb-aa the OPENAI_EMBEDDING_DIMENSION_SIZE is just a common default value; you can always override it with whatever dimension you need.
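The "default that can be overridden" behaviour described here can be sketched in plain Java as follows. The class EmbeddingDimensionDefaults and the method resolveDimension are hypothetical names used only for illustration; only the value 1536 comes from this thread, and the real store resolves its configuration through Spring Boot properties rather than a helper like this.

```java
// Minimal sketch: fall back to a common default unless a dimension is configured.
// Names are hypothetical; only the 1536 default value is taken from the thread.
final class EmbeddingDimensionDefaults {

    // Common default mentioned in the thread (OpenAI embedding size).
    static final int DEFAULT_EMBEDDING_DIMENSION = 1536;

    // A null argument means "not configured", so the default applies.
    static int resolveDimension(Integer configuredDimension) {
        return configuredDimension != null
                ? configuredDimension
                : DEFAULT_EMBEDDING_DIMENSION;
    }

    public static void main(String[] args) {
        // Nothing configured: the default is used.
        System.out.println(resolveDimension(null));
        // Explicit override, e.g. for a 4096-dimensional model.
        System.out.println(resolveDimension(4096));
    }
}
```

In the configuration from the original report, the override corresponds to the spring.ai.vectorstore.milvus.embedding-dimension property.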