Provide the ability to configure client timeouts

thesurlydev commented 9 months ago

Expected Behavior

Spring properties are exposed to control timeouts used by the clients.

Current Behavior

There's no documentation or properties to control things like connect, read, and write timeouts.

Context

The default timeout configuration results in some requests timing out. Here's an example:

org.springframework.web.client.ResourceAccessException: I/O error on POST request for "https://api.openai.com/v1/chat/completions": timeout
    at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.createResourceAccessException(DefaultRestClient.java:549) ~[spring-web-6.1.3.jar:6.1.3]
    at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:474) ~[spring-web-6.1.3.jar:6.1.3]
    at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.retrieve(DefaultRestClient.java:439) ~[spring-web-6.1.3.jar:6.1.3]
    at org.springframework.ai.openai.api.OpenAiApi.chatCompletionEntity(OpenAiApi.java:655) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.springframework.ai.openai.OpenAiChatClient.chatCompletionWithTools(OpenAiChatClient.java:337) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.springframework.ai.openai.OpenAiChatClient.lambda$call$1(OpenAiChatClient.java:151) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:335) ~[spring-retry-2.0.5.jar:na]
    at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:211) ~[spring-retry-2.0.5.jar:na]
    at org.springframework.ai.openai.OpenAiChatClient.call(OpenAiChatClient.java:147) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
...

Caused by: java.net.SocketTimeoutException: timeout
    at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.kt:675) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.kt:684) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http2.Http2Stream.takeHeaders(Http2Stream.kt:143) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http2.Http2ExchangeCodec.readResponseHeaders(Http2ExchangeCodec.kt:97) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:110) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201) ~[okhttp-4.12.0.jar:na]
    at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154) ~[okhttp-4.12.0.jar:na]
    at org.springframework.http.client.OkHttp3ClientHttpRequest.executeInternal(OkHttp3ClientHttpRequest.java:95) ~[spring-web-6.1.3.jar:6.1.3]
    at org.springframework.http.client.AbstractStreamingClientHttpRequest.executeInternal(AbstractStreamingClientHttpRequest.java:70) ~[spring-web-6.1.3.jar:6.1.3]
    at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:66) ~[spring-web-6.1.3.jar:6.1.3]
    at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:468) ~[spring-web-6.1.3.jar:6.1.3]

thingersoft commented 9 months ago

Quick temporary workaround:

        ClientHttpRequestFactorySettings requestFactorySettings = new ClientHttpRequestFactorySettings(
                connectTimeout , readTimeout , SslBundle.of(null));
        ClientHttpRequestFactory requestFactory = ClientHttpRequestFactories.get(requestFactorySettings);
        Builder clientBuilder = RestClient.builder().requestFactory(requestFactory);
        OpenAiApi openAiApi = new OpenAiApi("https://api.openai.com", openAiToken, clientBuilder);
        openAiChatClient = new OpenAiChatClient(openAiApi);

making commented 9 months ago

you can configure timeout settings easier with RestClientCustomizer.

    @Bean
    public RestClientCustomizer restClientCustomizer() {
        return restClientBuilder -> restClientBuilder
                .requestFactory(ClientHttpRequestFactories.get(ClientHttpRequestFactorySettings.DEFAULTS
                        .withConnectTimeout(Duration.ofSeconds(1))
                        .withReadTimeout(Duration.ofSeconds(2))));
    }

This works not only for OpenAI but also for clients that use RestClient under the hood.

thingersoft commented 9 months ago

@making Acting on client at configuration level would affect the context-provided RestClientBuilder. So customization would apply wherever you injected a RestClientBuilder in your application and for every spring boot starter using it. That's why I proposed a programmatic approach as a workaround.

With bigger models like GPT-4 response to non-streamed requests often takes more than 1 minute to come. So I think that read timeout deserves a dedicated configuration property, avoiding the need for workarounds.

Anyway any comments or criticism is appreciated.

making commented 9 months ago

@thingersoft Kind of agree. In the app side, timeouts can also be overridden via the builder. Timeouts are a concern not only for OpenAI but also across clients, so if the dedicated property is created at Spring AI level, I would expect that the design be common to all clients.

thingersoft commented 9 months ago

@making Indeeed my first thought was to introduce the new property at an higher level.

Unfortunately the current codebase lacks infrastructure to define properties commons to all ModelClients. A few architectural changes was needed but being my first contribution to the project I didn't want to be too intrusive, so I falled back to an OpenAI dedicated property.

Anyway if a reviewer tells me it's ok I could open a new pull request with a more generic approach.

markpollack commented 8 months ago

We are looking into a more generic approach, perhaps based on the generic model, perhaps based on aspects.

0.9 is when I thought would be a good point to tackle this problem. In the meantime, we can expose a few retry options for what we have now as an initial step. spring.ai.openai.chat.retry.XYZ property, where the @ConfigurationProperty for retry would be reusable across multiple implementation eventually.

Also, in the meantime we can provide an additional constructor that takes a RetryTemplate and in the autoconfiguraiton we would pass in a default implementation if one wasn't provided in the application context already.

thingersoft commented 8 months ago

@markpollack Hello Mark, I may be missing something here since I didn't use Spring Retry before but I can't see how retry options and RetryTemplate in particular relates to client read timeout configuration.

thingersoft commented 8 months ago

@markpollack Hello Mark, I may be missing something here since I didn't use Spring Retry before but I can't see how retry options and RetryTemplate in particular relates to client read timeout configuration.

@markpollack I would be happy to contribute if you can tell me something about my quoted remark.

markpollack commented 4 months ago

Getting back to these issues now. Just saying that retry is in the same family of concerns around connectivty.

markpollack commented 4 months ago

Closing in favor of issue #512

I'd like to think that there is a general solution for this not related to AI. For example, an app is supposed to be able to talk to say 10 microservices, each with different timeouts.

asoftake commented 1 day ago

you can configure timeout settings easier with RestClientCustomizer.

  @Bean
  public RestClientCustomizer restClientCustomizer() {
      return restClientBuilder -> restClientBuilder
              .requestFactory(ClientHttpRequestFactories.get(ClientHttpRequestFactorySettings.DEFAULTS
                      .withConnectTimeout(Duration.ofSeconds(1))
                      .withReadTimeout(Duration.ofSeconds(2))));
  }

This works not only for OpenAI but also for clients that use RestClient under the hood.

update Springboot3.4 remove okhttp3

spring-projects / spring-ai

Provide the ability to configure client timeouts #354