Bug description
Currently, we are using the Bedrock Cohere embedding model via Spring AI.
The docs state that the default option for "truncate" is NONE. In practice this means that when a chunk we want to embed is longer than the 2048 characters allowed by the underlying Cohere API, the call results in an error.
One option to circumvent that is to configure the embedding model client so that a specific truncation strategy becomes the default. In our case, we need to be "behind a VPC", so we used a "custom" client that exposes a URL to configure that, but left everything else as-is:
@Bean(name = "cohereEmbeddingModel")
@ConditionalOnProperty("spring.ai.bedrock.cohere.embedding.enabled")
public EmbeddingModel cohereEmbeddings() {
    log.info("Configured Cohere embedding model with VPC connection");
    return new CustomBedrockCohereEmbeddingModel(
            new CustomCohereEmbeddingBedrockApi(
                    embeddingModel,
                    DefaultCredentialsProvider.create(),
                    awsRegion,
                    new ObjectMapper(),
                    bedrockRuntimeClient(),
                    bedrockRuntimeAsyncClient()),
            CustomBedrockCohereEmbeddingOptions.builder()
                    .withInputType(SEARCH_DOCUMENT)
                    .withTruncate(END)
                    .build());
}
You can see that the options we pass to the model include a default truncation strategy that removes the end of a chunk if it's longer than the 2048-character limit.
Now, what I believe should happen when we call the actual model with a chunk, such as:
embeddingModel.embed("SomeString".repeat(2048))
is that the call succeeds, with the string being embedded truncated down to the 2048-character limit by removing characters from the END, as specified in the client configuration above.
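If the configured options were applied, the serialized request body sent to Bedrock should, going by the Bedrock Cohere Embed request schema, carry the truncate field, roughly like this (the payload below is our illustration, not captured from the wire):

{
  "texts": ["SomeStringSomeString..."],
  "input_type": "search_document",
  "truncate": "END"
}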
However, what happens is that this results in an exception:
EmbeddingGenerationService : There was an error invoking embedding model: Malformed input request: #/texts/0: expected maxLength: 2048, actual: 2363, please reformat your input and try again. (Service: BedrockRuntime, Status Code: 400)
What we ended up doing was manually truncating our input chunks when they are too large, as well as tweaking the chunking parameters a bit, although the expectation was that the inputs would follow the truncation strategy defined in the client above, without us needing to truncate them manually.
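For reference, the manual workaround is essentially the following (a minimal sketch; the constant and helper name are ours, not part of Spring AI):

// Bedrock Cohere Embed input limit, per the error message above
private static final int MAX_COHERE_CHARS = 2048;

// Mirrors the END truncation strategy we expected the client to apply for us
private String truncateEnd(String chunk) {
    return chunk.length() <= MAX_COHERE_CHARS
            ? chunk
            : chunk.substring(0, MAX_COHERE_CHARS);
}

// Usage: embeddingModel.embed(truncateEnd(someLongChunk));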
Environment
Java 21, Spring Boot 3.3.0, Spring AI 1.0.0-M3
Steps to reproduce
Sending an embedding request with an input string longer than 2048 characters triggers the error, as truncation doesn't happen.
Expected behavior
The chunk truncation should happen under the hood.
Minimal Complete Reproducible example
Configuring the client as above, then sending a request to embed a string with length > 2048 is enough to trigger the error.
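Concretely, something like this (a sketch assuming the bean above is injected as embeddingModel):

// 3000 characters, over the 2048 limit
String oversized = "SomeString".repeat(300);
// fails with the 400 "Malformed input request" error shown above,
// instead of being truncated to 2048 characters via the END strategy
embeddingModel.embed(oversized);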