spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/index.html
Apache License 2.0

Auto truncate for vertex embedding is broken by TokenCountBatchingStrategy #1831

Open GregoireW opened 1 day ago

GregoireW commented 1 day ago

Bug description

I am trying to use Vertex AI embedding with large documents, and I set auto-truncate to true. This corresponds to this option: https://github.com/spring-projects/spring-ai/blob/be0f9fbb676627ec6f64fb99ad9c9cf407ba8941/models/spring-ai-vertex-ai-embedding/src/main/java/org/springframework/ai/vertexai/embedding/text/VertexAiTextEmbeddingOptions.java#L67

But I get an exception from TokenCountBatchingStrategy (https://github.com/spring-projects/spring-ai/blob/be0f9fbb676627ec6f64fb99ad9c9cf407ba8941/spring-ai-core/src/main/java/org/springframework/ai/embedding/TokenCountBatchingStrategy.java#L147).

What should I do in this situation?

Environment

Spring AI 1.0.0-M4 / jdk21

Steps to reproduce

Use Vertex AI embedding with the "auto truncate" option enabled, and test with a large payload.

Expected behavior

Success, or at least documentation describing how to make it work.

Minimal Complete Reproducible example

var document = new Document("go ".repeat(50000));
vectorStore.add(List.of(document));
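For reference, the auto-truncate flag in this report is enabled through the embedding options. A sketch of that configuration is below; the builder method name `withAutoTruncate` is an assumption inferred from the linked VertexAiTextEmbeddingOptions source, so verify it against your Spring AI version:

```java
// Sketch only: builder method names are assumed, not verified against 1.0.0-M4.
var options = VertexAiTextEmbeddingOptions.builder()
    .withAutoTruncate(true) // ask the Vertex AI service to truncate oversized inputs itself
    .build();
```

Note that this flag only governs truncation on the Vertex AI service side; it has no effect on the client-side batching that raises the exception above.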

sobychacko commented 22 hours ago

@GregoireW, which vector store are you using? To fix this issue, you need to provide a custom BatchingStrategy bean in your application. The default TokenCountBatchingStrategy implementation uses the default context-window size of the OpenAI embedding models, 8191 tokens. You need to adjust the maximum token count when using a different embedding model. Here is an example of overriding this bean:

@Bean
@ConditionalOnMissingBean(BatchingStrategy.class)
BatchingStrategy chromaBatchingStrategy() {
    return new TokenCountBatchingStrategy(EncodingType.CL100K_BASE, maxInputTokenCount, 0.1);
}

See the Javadoc on TokenCountBatchingStrategy for more details: https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/embedding/TokenCountBatchingStrategy.java
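To make the failure mode concrete, here is a self-contained sketch of the idea behind token-count batching: group documents into batches whose combined estimated token count stays under a configurable limit, and fail fast when a single document alone exceeds that limit, which is what the reported exception corresponds to. The class name, the whitespace-based token estimate, and the method names are all illustrative, not Spring AI's actual implementation (which uses a jtokkit encoding such as CL100K_BASE):

```java
import java.util.ArrayList;
import java.util.List;

public class TokenBatchSketch {

    // Crude token estimate: whitespace-separated words. A real strategy
    // would use a proper tokenizer encoding instead.
    static int estimateTokens(String text) {
        return text.isBlank() ? 0 : text.trim().split("\\s+").length;
    }

    // Split documents into batches under the token limit. A single document
    // that exceeds the limit on its own cannot be batched, so we throw --
    // mirroring the exception reported in this issue.
    static List<List<String>> batch(List<String> docs, int maxTokensPerBatch) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentTokens = 0;
        for (String doc : docs) {
            int tokens = estimateTokens(doc);
            if (tokens > maxTokensPerBatch) {
                throw new IllegalArgumentException(
                    "Document token count " + tokens + " exceeds limit " + maxTokensPerBatch);
            }
            if (currentTokens + tokens > maxTokensPerBatch) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(doc);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // 3 + 2 = 5 tokens fit in the first batch; the 4-token doc overflows
        // into a second batch.
        List<List<String>> batches = batch(List.of("a b c", "d e", "f g h i"), 5);
        System.out.println(batches.size()); // prints 2
    }
}
```

The key point for this issue: the batching happens on the client before the request ever reaches Vertex AI, so the service-side auto-truncate option never gets a chance to apply. Raising the client-side limit (or truncating documents yourself) is what avoids the exception.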

GregoireW commented 20 hours ago

I use pgVector storage.

If the fix is to create a custom BatchingStrategy, I think the documentation on the auto-truncate feature should say so explicitly.

Or making it work with other embedding models is still a work in progress, in which case this would be corrected later.

I'll let you close this issue or tag it as a todo, as you prefer.