spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/index.html
Apache License 2.0

Bedrock API read times out after 2 minutes even though I set the timeout to 10 #1061

Open ShaunDeng opened 4 months ago

ShaunDeng commented 4 months ago

Bug description: The timeout setting does not seem to take effect for the class Anthropic3ChatBedrockApi.

Environment: Spring AI 1.0.0-M1, Java 17

Steps to reproduce:

1. Set the read timeout to 10 minutes:

    IAWSService.AWSBedrockLoginInfo awsLoginInfo = convertLoginInfo(loginInfo);
    Anthropic3ChatBedrockApi anthropicApi = new Anthropic3ChatBedrockApi(
            aiModelEnum.getRawCode(),
            StaticCredentialsProvider.create(
                    AwsBasicCredentials.create(awsLoginInfo.getAccessKey(), awsLoginInfo.getSecretKey())),
            awsLoginInfo.getRegion(),
            JsonUtils.defaultObjectMapper(),
            Duration.ofMinutes(10L));

2. Call the model with a large context:

    BedrockAnthropic3ChatModel model = new BedrockAnthropic3ChatModel(anthropicApi);
    model.call(new Prompt(messages));

3. A read timeout error occurs after about 2 minutes:

    Caused by: java.net.SocketTimeoutException: Read timed out
        at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:288)
        at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:314)
        at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:355)
        at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:808)
        at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
        at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:484)
        at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:478)
        at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
        at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1465)
        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1069)
        at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
        at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
        at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at software.amazon.awssdk.http.apache.internal.impl.ApacheSdkHttpClient.execute(ApacheSdkHttpClient.java:72)
        at software.amazon.awssdk.http.apache.ApacheHttpClient.execute(ApacheHttpClient.java:254)
        at software.amazon.awssdk.http.apache.ApacheHttpClient.access$500(ApacheHttpClient.java:104)
        at software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:231)
        at software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:228)
        at software.amazon.awssdk.core.internal.util.MetricUtils.measureDurationUnsafe(MetricUtils.java:99)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.executeHttpRequest(MakeHttpRequestStage.java:79)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:57)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:40)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.executeRequest(RetryableStage2.java:93)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:56)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:36)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
        at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
        at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
        at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
        at software.amazon.awssdk.services.bedrockruntime.DefaultBedrockRuntimeClient.invokeModel(DefaultBedrockRuntimeClient.java:150)
        at org.springframework.ai.bedrock.api.AbstractBedrockApi.internalInvocation(AbstractBedrockApi.java:253)
        at org.springframework.ai.bedrock.anthropic3.api.Anthropic3ChatBedrockApi.chatCompletion(Anthropic3ChatBedrockApi.java:490)
        at org.springframework.ai.bedrock.anthropic3.BedrockAnthropic3ChatModel.call(BedrockAnthropic3ChatModel.java:79)
        at com.lenovo.tec.ai.codeg.gateway.plugins.impl.text.aws.AWSAnthropicChatPluginsServiceImpl.call(AWSAnthropicChatPluginsServiceImpl.java:110)
        at com.lenovo.tec.ai.codeg.gateway.plugins.impl.text.aws.AWSAnthropicChatPluginsServiceImpl.lambda$doChat$0(AWSAnthropicChatPluginsServiceImpl.java:96)
        at reactor.core.publisher.MonoCallable.call(MonoCallable.java:72)
        at reactor.core.publisher.FluxSubscribeOnCallable$CallableSubscribeOnSubscription.run(FluxSubscribeOnCallable.java:228)
        at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
        at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
        at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)

Expected behavior: The read should not time out until 10 minutes have passed, since I set the timeout to 10 minutes.

My thinking: The AWS SDK for Java 2.x best practices recommend setting both apiCallAttemptTimeout and apiCallTimeout (see the first picture below), but Spring AI sets only apiCallTimeout (see the second picture below). I'm not sure whether this has an impact.

Picture 1: [image]
Picture 2: [image]

codespearhead commented 4 months ago

Is this issue still reproducible when you use Spring Boot's autoconfiguration [3] (spring.ai.bedrock.aws.timeout=10m) instead of creating the chat client yourself?

From my understanding, that constructor [1], which was introduced by https://github.com/spring-projects/spring-ai/pull/520 via 5c3ed1152a7a9318d431183b42f5ebf37a99ee90 , should've worked as expected [2].


Possibly related: https://github.com/aws/aws-sdk-java/issues/3072 and https://github.com/aws/aws-sdk-ruby/issues/2967


[1] https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/best-practices.html
[2] https://github.com/spring-projects/spring-ai/blob/17c44237a51f85a30d128c21a335a7aee269f4c2/models/spring-ai-bedrock/src/main/java/org/springframework/ai/bedrock/api/AbstractBedrockApi.java#L151-L161
[3] https://docs.spring.io/spring-ai/reference/api/chat/bedrock/bedrock-anthropic3.html#_sample_controller

impactCn commented 4 months ago

I tested it when I submitted it and it was ok. Have you ever tested using bean injection? @ShaunDeng

ShaunDeng commented 4 months ago

@impactCn, thanks for your comments. When I use auto bean injection (spring-ai-bedrock-ai-spring-boot-starter), the timeout issue is not reproducible, even with the default 5m timeout. But for my use case I need to create those client/model instances manually (spring-ai-bedrock).

ShaunDeng commented 4 months ago

I found another thing. When I use the api/client/model instance directly after creating it manually, it works fine. But when I create the api/client/model object (for example, Anthropic3ChatBedrockApi), register it in the Spring application context as in [1], and then retrieve it from the application context to use it as in [2], the issue is reproduced.

[1]

    String beanName = IPluginsService.genDynamicAIChannelBeanName(channelId, modelCode);
    AWSBedrockLoginInfo awsLoginInfo = convertLoginInfo(loginInfo);
    Anthropic3ChatBedrockApi anthropicApi = new Anthropic3ChatBedrockApi(
            aiModelEnum.getRawCode(),
            StaticCredentialsProvider.create(
                    AwsBasicCredentials.create(awsLoginInfo.getAccessKey(), awsLoginInfo.getSecretKey())),
            awsLoginInfo.getRegion(),
            JsonUtils.defaultObjectMapper(),
            timeout);

    bean = applicationContext.getAutowireCapableBeanFactory().initializeBean(bean, beanName);
    DefaultListableBeanFactory beanFactory =
            (DefaultListableBeanFactory) applicationContext.getAutowireCapableBeanFactory();
    beanFactory.registerSingleton(beanName, bean);

[2]

    Anthropic3ChatBedrockApi anthropicApi = beanRegistrar.getBean(
            IPluginsService.genDynamicAIChannelBeanName(channelId, modelCode),
            Anthropic3ChatBedrockApi.class);
    ... ...
    class beanRegistrar {
        public <T> T getBean(String beanName, Class<T> type) {
            Object bean = context.getBean(beanName);
            if (type.isInstance(bean)) {
                return type.cast(bean);
            }
            return null;
        }
    }

impactCn commented 4 months ago

Let me test it.

ShaunDeng commented 4 months ago

@impactCn Thank you. More info: I pass in an ObjectMapper, and I'm not sure whether it has an impact. Here is my ObjectMapper:

    JavaTimeModule javaTimeModule = new JavaTimeModule();

    DateTimeFormatter dateTimeFormatter = DateTimeFormatter.ofPattern(TimeUtils.DATETIME_FORMAT_PATTERN);
    javaTimeModule.addSerializer(LocalDateTime.class, new LocalDateTimeSerializer(dateTimeFormatter));
    javaTimeModule.addDeserializer(LocalDateTime.class, new LocalDateTimeDeserializer(dateTimeFormatter));

    DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern(TimeUtils.DATE_FORMAT_PATTERN);
    javaTimeModule.addSerializer(LocalDate.class, new LocalDateSerializer(dateFormatter));
    javaTimeModule.addDeserializer(LocalDate.class, new LocalDateDeserializer(dateFormatter));

    javaTimeModule.addSerializer(OffsetDateTime.class, OffsetDateTimeSerializer.INSTANCE);
    objectMapper = new ObjectMapper()
            .disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
            .setSerializationInclusion(com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL)
            .registerModule(javaTimeModule);

kalcifield commented 2 months ago

I had the same issue. I fixed it by overriding the SdkHttpClient with an instance that has a raised socket timeout:

BedrockRuntimeClient.builder()
                .region(Region.of(properties.getRegion()))
                .credentialsProvider(credentialsProvider)
                .httpClient(ApacheHttpClient.builder()
                        .socketTimeout(properties.getTimeout())
                        .build())
                .overrideConfiguration(c -> c.apiCallTimeout(properties.getTimeout()))
                .build();

    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>apache-client</artifactId>
    </dependency>

bronwyn-damm commented 1 month ago

I am seeing the same thing, where it is timing out after 30 seconds and retrying 4 times, leading to a read timeout after 2 minutes. @kalcifield the constructor for Anthropic3ChatBedrockApi doesn't allow for you to configure your own BedrockRuntimeClient, where did you place this code to fix the issue?
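
The arithmetic behind that observation can be sketched as a small model. This is a simplification, not the AWS SDK's actual retry logic: the 30-second read timeout and the four attempts are the figures reported in this comment, backoff between attempts is ignored, and the class name TimeoutBudget is hypothetical.

```java
import java.time.Duration;

/** Simplified model of how a per-attempt read timeout and retries bound a blocking call. */
class TimeoutBudget {

    /**
     * Worst-case time before a call whose response never arrives in time is abandoned.
     * Ignores retry backoff and connection setup, so real figures would be slightly larger.
     */
    static Duration timeToFailure(Duration socketReadTimeout, int maxAttempts, Duration apiCallTimeout) {
        Duration retriesExhausted = socketReadTimeout.multipliedBy(maxAttempts);
        // The overall call timeout only matters if it is shorter than the retry budget.
        return retriesExhausted.compareTo(apiCallTimeout) < 0 ? retriesExhausted : apiCallTimeout;
    }

    public static void main(String[] args) {
        // Figures from the thread: 30 s read timeout, 4 attempts, 10 min overall timeout.
        Duration failure = timeToFailure(Duration.ofSeconds(30), 4, Duration.ofMinutes(10));
        System.out.println("Fails after: " + failure); // prints "Fails after: PT2M"
    }
}
```

Under these assumptions, raising only the overall call timeout to 10 minutes leaves the failure at 2 minutes, because the per-attempt read timeout is exhausted first.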

kalcifield commented 1 month ago

@bronwyn-damm Technically, I have a custom ChatModel implementation (BedrockAnthropic3CustomChatModel), mainly based on this PR, because I needed ConverseApi for my use case. In this case BedrockConverseApi.java accepts a BedrockRuntimeClient bean.

dpresonate commented 3 hours ago

Any updates on this issue? I am running into the same 2-minute timeout no matter what value I put into the spring.ai.bedrock.aws.timeout property. It is not ideal to have to create your own bean for the timeout to actually be applied.

bronwyn-damm commented 1 hour ago

We solved it in our project by using Spring AI's streaming option, since the timeout is only triggered if the first token takes longer than 30 seconds or so. This was sufficient for our use case, but we look forward to being able to use the blocking API and removing that complexity from our code.
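
Why streaming sidesteps the problem can be illustrated with plain sockets: a read timeout limits the silence between reads, not the total duration of the response. The sketch below uses only the JDK and a local loopback server; it does not involve Bedrock or Spring AI, and the class name ReadTimeoutDemo and the millisecond values are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Shows that a socket read timeout limits the gap between reads, not the total transfer time. */
class ReadTimeoutDemo {

    /**
     * Starts a local server that waits initialDelayMs, then sends `chunks` bytes spaced gapMs apart.
     * Returns true if the client reads everything, false if a read times out.
     */
    static boolean readAll(int readTimeoutMs, int initialDelayMs, int chunks, int gapMs) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread writer = new Thread(() -> {
                try (Socket peer = server.accept(); OutputStream out = peer.getOutputStream()) {
                    Thread.sleep(initialDelayMs);
                    for (int i = 0; i < chunks; i++) {
                        out.write('x');
                        out.flush();
                        Thread.sleep(gapMs);
                    }
                } catch (IOException | InterruptedException ignored) {
                    // The client may already have given up; nothing to do in this demo.
                }
            });
            writer.setDaemon(true);
            writer.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.setSoTimeout(readTimeoutMs); // per-read timeout, like a socket timeout
                InputStream in = client.getInputStream();
                int received = 0;
                while (in.read() != -1) {
                    received++;
                }
                return received == chunks;
            } catch (SocketTimeoutException e) {
                return false;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Streaming-like: 8 chunks, 200 ms apart (1.6 s in total) with a 1 s read timeout.
        System.out.println("streamed response completed: " + readAll(1000, 0, 8, 200));
        // Blocking-like: nothing arrives for 2 s, so the 1 s read timeout is exceeded.
        System.out.println("delayed response completed: " + readAll(1000, 2000, 1, 0));
    }
}
```

On a typical machine the first call should complete even though the whole transfer outlasts the timeout, while the second should fail because the first byte arrives too late. That matches the behavior described above, where only the wait for the first token counts against the timeout.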