micronaut-projects / micronaut-core

Micronaut Application Framework
http://micronaut.io
Apache License 2.0
6.09k stars 1.07k forks source link

Streaming Declarative HTTP Client Throws ReadTimeoutException #4856

Open aero-don opened 3 years ago

aero-don commented 3 years ago

Task List

Steps to Reproduce

  1. git clone https://github.com/aero-don/FlowableClientExample.git
  2. git clone https://github.com/aero-don/FlowableExample.git
  3. cd FlowableClientExample
  4. ./gradlew run
  5. Verify first log info statement, DefaultHttpClientConfiguration.DEFAULT_READ_IDLE_TIMEOUT_MINUTES of 5 does not match documented default value of 60 seconds.
  6. cd FlowableExample (in another terminal)
  7. ./gradlew run
  8. Wait 5 minutes after FlowableClientExample starts receiving streamed data from server.
  9. Verify ReadTimeoutException thrown in FlowableClientExample.

Application Notes:

  1. In FlowableClientExample, uncomment micronaut.http.client.read-idle-timeout to set the value to 1 minute or some other value.
  2. In FlowableClientExample, the subscriber's onComplete method is called after the ReadTimeoutException is thrown. Logic in the FlowableClientExample will reconnect to the streaming server and the ReadTimeoutException will continue to be thrown at an interval consistent with the read-idle-timeout setting.
  3. In FlowableExample, millisecond delay between items streamed by the server are configured in application.yml with `sensor.measurement.rate.
  4. curl -N http://localhost:8080/sensors/measurements can be used to verify the server is streaming data at the configured rate.

Full Description

This issue has two parts regarding a streaming declarative HTTP client and the ReadTimeoutException: Part 1: Documentation Part 2: Behavior

In section "7.3.2 Streaming with @ Client" under subsection "Streaming Clients and Read Timeout":

  1. "Instead the read-idle-timeout setting (which defaults to 60 seconds) is used"; however, the value returned from DefaultHttpClientConfiguration.DEFAULT_READ_IDLE_TIMEOUT_MINUTES is 5 indicating the default value is 5 minutes. 5 minutes is consistent with the behavior of when the ReadTimeoutException is thrown.

  2. If the delay between items streamed from the server is longer than the default read-idle-timeout, then adjust readIdleTimeout in application.yml so the read-idle-timeout is longer that the delay between items streamed from the server. Regardless of the read-idle-timeout value, 60 seconds or 5 minutes, and the delay between items streamed from the server, 1 second or 1 millisecond, a ReadTimeoutException is thrown at a duration that corresponds with the read-idle-timeout setting even when items are being received by the client at a much faster rate.

Expected Behaviour

DefaultHttpClientConfiguration.DEFAULT_READ_IDLE_TIMEOUT_MINUTES should be 1 according to the documentation.

If the default read-idle-timeout setting or application configured read-idle-timeout is longer than the delay between items being streamed from the server, then the declarative HTTP client should not throw a ReadTimeoutException.

Actual Behaviour

DefaultHttpClientConfiguration.DEFAULT_READ_IDLE_TIMEOUT_MINUTES returns 5.

Regardless of the default read-idle-timeout setting or application configured read-idle-timeout setting, when the delay between items being streamed from the server is less the read-idle-timeout value and the declarative HTTP client stays connected to the server longer than read-idle-timeout setting the client throws a ReadTimeoutException.

Environment Information

Example Application

Stacktrace

io.reactivex.exceptions.UndeliverableException: The exception could not be delivered to the consumer because it has already canceled/disposed the flow or the exception has nowhere to go to begin with. Further reading: https://github.com/ReactiveX/RxJava/wiki/What's-different-in-2.0#error-handling | io.micronaut.http.client.exceptions.ReadTimeoutException: Read Timeout at io.reactivex.plugins.RxJavaPlugins.onError(RxJavaPlugins.java:367) at io.reactivex.internal.operators.flowable.FlowableCreate$BaseEmitter.onError(FlowableCreate.java:275) at io.micronaut.http.client.netty.DefaultHttpClient$10.userEventTriggered(DefaultHttpClient.java:1865) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) at io.netty.channel.ChannelInboundHandlerAdapter.userEventTriggered(ChannelInboundHandlerAdapter.java:117) at io.micronaut.http.client.netty.DefaultHttpClient$HttpClientInitializer$4.userEventTriggered(DefaultHttpClient.java:2911) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireUserEventTriggered(CombinedChannelDuplexHandler.java:430) at io.netty.channel.ChannelInboundHandlerAdapter.userEventTriggered(ChannelInboundHandlerAdapter.java:117) at io.netty.handler.codec.ByteToMessageDecoder.userEventTriggered(ByteToMessageDecoder.java:365) at io.netty.handler.codec.http.HttpObjectDecoder.userEventTriggered(HttpObjectDecoder.java:491) at io.netty.channel.CombinedChannelDuplexHandler.userEventTriggered(CombinedChannelDuplexHandler.java:241) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) at io.netty.handler.timeout.IdleStateHandler.channelIdle(IdleStateHandler.java:371) at io.netty.handler.timeout.IdleStateHandler$WriterIdleTimeoutTask.run(IdleStateHandler.java:539) at io.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:476) at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:832) Caused by: io.micronaut.http.client.exceptions.ReadTimeoutException: Read Timeout at io.micronaut.http.client.exceptions.ReadTimeoutException.(ReadTimeoutException.java:26) ... 29 more Exception in thread "default-nioEventLoopGroup-1-13" io.reactivex.exceptions.UndeliverableException: The exception could not be delivered to the consumer because it has already canceled/disposed the flow or the exception has nowhere to go to begin with. Further reading: https://github.com/ReactiveX/RxJava/wiki/What's-different-in-2.0#error-handling | io.micronaut.http.client.exceptions.ReadTimeoutException: Read Timeout at io.reactivex.plugins.RxJavaPlugins.onError(RxJavaPlugins.java:367) at io.reactivex.internal.operators.flowable.FlowableCreate$BaseEmitter.onError(FlowableCreate.java:275) at io.micronaut.http.client.netty.DefaultHttpClient$10.userEventTriggered(DefaultHttpClient.java:1865) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) at io.netty.channel.ChannelInboundHandlerAdapter.userEventTriggered(ChannelInboundHandlerAdapter.java:117) at io.micronaut.http.client.netty.DefaultHttpClient$HttpClientInitializer$4.userEventTriggered(DefaultHttpClient.java:2911) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireUserEventTriggered(CombinedChannelDuplexHandler.java:430) at io.netty.channel.ChannelInboundHandlerAdapter.userEventTriggered(ChannelInboundHandlerAdapter.java:117) at io.netty.handler.codec.ByteToMessageDecoder.userEventTriggered(ByteToMessageDecoder.java:365) at io.netty.handler.codec.http.HttpObjectDecoder.userEventTriggered(HttpObjectDecoder.java:491) at io.netty.channel.CombinedChannelDuplexHandler.userEventTriggered(CombinedChannelDuplexHandler.java:241) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) at io.netty.handler.timeout.IdleStateHandler.channelIdle(IdleStateHandler.java:371) at io.netty.handler.timeout.IdleStateHandler$WriterIdleTimeoutTask.run(IdleStateHandler.java:539) at io.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:476) at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:832) Caused by: io.micronaut.http.client.exceptions.ReadTimeoutException: Read Timeout at io.micronaut.http.client.exceptions.ReadTimeoutException.(ReadTimeoutException.java:26) ... 29 more

tbieniek commented 3 years ago

I have experienced the same issue, @aero-don have You found any workaround for this issue ?

aero-don commented 3 years ago

@tbieniek, @jameskleeh advised me to set micronaut.http.client.read-idle-timeout to -1 in application.yml. I verified that this works as a workaround.

aero-don commented 2 years ago

Environment Information Operating System: Ubuntu 20.04 Micronaut Version: 3.2.3 JDK Version: openjdk version "17.0.1" 2021-10-19 OpenJDK Runtime Environment GraalVM CE 21.3.0 (build 17.0.1+12-jvmci-21.3-b05) OpenJDK 64-Bit Server VM GraalVM CE 21.3.0 (build 17.0.1+12-jvmci-21.3-b05, mixed mode, sharing)

Example Application https://github.com/aero-don/reactive-controller.git

The behavior in Micronaut 3.2.x is slightly different than previously noted for 2.3.x. In Micronaut 3.2.x the client receives an onComplete event when the timeout occurs. The timeout occurs when the provider is continually publishing at a rate faster than the value for micronaut.http.client.read-idle-timeout.

I added some logging to io.netty.handler.timeout.IdleStateHandler to get some insight to the issue. The key log statements are:

10:41:43.465 [default-nioEventLoopGroup-1-3] WARN  i.n.handler.timeout.IdleStateHandler - WriterIdleTimeoutTask.run(): first=true, event=IdleStateEvent(WRITER_IDLE, first), nextDelay=-1429677
10:41:43.467 [default-nioEventLoopGroup-1-5] WARN  i.n.handler.timeout.IdleStateHandler - WriterIdleTimeoutTask.run(): first=true, event=IdleStateEvent(WRITER_IDLE, first), nextDelay=-2701235

Additional logging around those statements:

10:41:42.887 [default-nioEventLoopGroup-1-3] INFO  com.example.ReactiveControllerSpec - Received StringDTO[string=String Number: 3]
10:41:42.887 [default-nioEventLoopGroup-1-3] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelRead after ctx.fireChannelRead(msg): msg=PooledUnsafeDirectByteBuf(freed)
10:41:42.887 [default-nioEventLoopGroup-1-3] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete enter: readerIdleTimeNanos=5000000000, allIdleTimeNanos=5000000000, reading=true
10:41:42.887 [default-nioEventLoopGroup-1-3] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete before ctx.fireChannelReadComplete(): lastReadTime=2439837632385017, reading=false
10:41:42.887 [default-nioEventLoopGroup-1-3] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete after ctx.fireChannelReadComplete()
10:41:43.462 [default-nioEventLoopGroup-1-3] DEBUG i.n.handler.timeout.IdleStateHandler - ReaderIdleTimeoutTask.run(): reading=false, lastReadTime=2439837632385017, nextDelay=4424607585
10:41:43.463 [default-nioEventLoopGroup-1-3] DEBUG i.n.handler.timeout.IdleStateHandler - WriterIdleTimeoutTask.run() lastWriteTime=2439833206744632
10:41:43.464 [default-nioEventLoopGroup-1-7] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete enter: readerIdleTimeNanos=300000000000, allIdleTimeNanos=300000000000, reading=false
10:41:43.465 [default-nioEventLoopGroup-1-7] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete before ctx.fireChannelReadComplete(): lastReadTime=2439833262775947, reading=false
10:41:43.465 [default-nioEventLoopGroup-1-7] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete after ctx.fireChannelReadComplete()
10:41:43.465 [default-nioEventLoopGroup-1-3] WARN  i.n.handler.timeout.IdleStateHandler - WriterIdleTimeoutTask.run(): first=true, event=IdleStateEvent(WRITER_IDLE, first), nextDelay=-1429677
10:41:43.467 [default-nioEventLoopGroup-1-5] DEBUG i.n.handler.timeout.IdleStateHandler - ReaderIdleTimeoutTask.run(): reading=false, lastReadTime=2439837631839337, nextDelay=4421047199
10:41:43.467 [default-nioEventLoopGroup-1-5] DEBUG i.n.handler.timeout.IdleStateHandler - WriterIdleTimeoutTask.run() lastWriteTime=2439833209630726
10:41:43.467 [default-nioEventLoopGroup-1-5] WARN  i.n.handler.timeout.IdleStateHandler - WriterIdleTimeoutTask.run(): first=true, event=IdleStateEvent(WRITER_IDLE, first), nextDelay=-2701235
10:41:43.467 [default-nioEventLoopGroup-1-6] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete enter: readerIdleTimeNanos=300000000000, allIdleTimeNanos=300000000000, reading=false
10:41:43.467 [default-nioEventLoopGroup-1-6] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete before ctx.fireChannelReadComplete(): lastReadTime=2439833262704124, reading=false
10:41:43.468 [default-nioEventLoopGroup-1-3] WARN  com.example.ReactiveControllerSpec - Received onComplete
10:41:43.468 [default-nioEventLoopGroup-1-6] DEBUG i.n.handler.timeout.IdleStateHandler - IdleStateHandler.channelReadComplete after ctx.fireChannelReadComplete()

The issue is not the ReaderIdleTimeoutTask timing out. The issue is the WriterIdleTimeoutTask timing out.

In io.micronaut.http.client.netty.DefaultHttpHandler, at or around line 3184 the readIdleTimeout is retrieved as duration. The duration is passed to the construction of IdleStateHandler for the readIdleTime, writeIdleTime and allIdleTime. NOTE: This pattern of using the readIdleTimeout as writeIdleTime and allIdleTime also occurs at or around line 908 in io.micronaut.http.client.netty.DefaultHttpHandler, but does not affect this use case.

In the case of a controller method returning a Flux, the data flow is in one direction after initial setup and the WriterIdleTimeoutTask will eventually time out in the time set for micronaut.http.client.read-idle-timeout.

Possible solutions:

  1. Pass -1 to for the writeIdleTime. I have tested this with an application running for several days with no timeout occurring. When the writeIdleTime is negative or zero, Netty does not schedule the WriterIdleTimeoutTask.
  2. Add micronaut.http.client.write-idle-timeout configuration parameter and allow the user control of the time out value for write idle timeout.