aws / aws-sdk-java-v2

The official AWS SDK for Java - Version 2
Apache License 2.0

Kinesis Async Client hangs after java.io.IOException: Response had content-length of 28 bytes, but only received 0 bytes before the connection was closed #4354

Open MichalZalewskiRASP opened 1 year ago

MichalZalewskiRASP commented 1 year ago

Describe the bug

I see the same defect as described in https://github.com/aws/aws-sdk-java-v2/issues/3335 when I use the SubscribeToShard API. The client hangs, and no further visitor.visit() call is executed after the Kinesis Async Client throws java.io.IOException: Response had content-length of 28 bytes, but only received 0 bytes before the connection was closed. Stream consumption stays halted from that point on, even though I use the onError handler in the request.

Expected Behavior

The SubscribeToShard API should be able to resume stream consumption after encountering IOException.

Current Behavior

The thread hangs and the client does not continue to consume the stream.
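
A possible caller-side guard, not a fix for the SDK behavior itself, is to bound the wait on the subscription future and resubscribe when it fails or stalls. The sketch below uses only JDK classes. `subscribeOnce` is a hypothetical stand-in for whatever starts one subscription (for example a lambda wrapping `kinesisClient.subscribeToShard(...)`), and `consumeWithResubscribe`, `maxAttempts`, and `timeoutMillis` are illustrative names that do not come from the SDK. In `main`, the fake supplier simulates one early connection close, one future that never completes, and one success.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class ResubscribeSketch {

    /**
     * Starts a subscription via subscribeOnce and waits for it with a bounded timeout.
     * On failure or timeout it starts a new subscription, up to maxAttempts times.
     * Returns the number of attempts that were needed.
     */
    static int consumeWithResubscribe(Supplier<CompletableFuture<Void>> subscribeOnce,
                                      int maxAttempts,
                                      long timeoutMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            CompletableFuture<Void> subscription = subscribeOnce.get();
            try {
                // Bounded wait: a future that never completes cannot block the caller forever.
                subscription.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return attempt; // this subscription completed normally
            } catch (TimeoutException e) {
                // The future stalled; abandon it and try a fresh subscription.
                subscription.cancel(true);
            } catch (ExecutionException e) {
                // e.getCause() holds the original failure (for example an IOException).
                // Fall through to the next attempt.
            }
        }
        throw new IllegalStateException(
                "subscription did not complete after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Fake subscription source: fails once, stalls once, then succeeds.
        Supplier<CompletableFuture<Void>> fake = () -> {
            int n = calls.incrementAndGet();
            if (n == 1) {
                return CompletableFuture.failedFuture(new IOException("connection closed early"));
            }
            if (n == 2) {
                return new CompletableFuture<>(); // never completes
            }
            return CompletableFuture.completedFuture(null);
        };
        System.out.println(consumeWithResubscribe(fake, 5, 200)); // prints 3
    }
}
```

In a real consumer the timeout would need to exceed the expected lifetime of a healthy subscription, and a normally completed subscription would usually also be followed by a new one, starting from the last processed sequence number.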

Reproduction Steps

It is difficult to reproduce: it happens randomly, about once a week, when the message length does not pass validation. My dependencies:

<dependency>
    <groupId>software.amazon.kinesis</groupId>
    <artifactId>amazon-kinesis-client</artifactId>
    <version>2.5.2</version>
</dependency>
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>url-connection-client</artifactId>
    <version>2.20.123</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>sts</artifactId>
    <version>2.20.123</version>
</dependency>

Here is my setup.

I call (in Kotlin):

kinesisClient.subscribeToShard(requestToSubscribe, responseHandler).get()

whereby:

Possible Solution

No response

Additional Information/Context

Stack Trace: java.io.IOException: Response had content-length of 28 bytes, but only received 0 bytes before the connection was closed.

software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.validateResponseContentLength(ResponseHandler.java:163)
software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.access$700(ResponseHandler.java:75)
software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$PublisherAdapter$1.onComplete(ResponseHandler.java:369)
software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.complete(HandlerPublisher.java:447)
software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.channelInactive(HandlerPublisher.java:430)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:303)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:281)
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:274)
io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)
io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:303)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:281)
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:274)
io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
io.netty.channel.AbstractChannelHandlerContext

AWS Java SDK version used

2.20.43 / kcl 2.5.2

JDK version used

11.0.9

Operating System and version

linux (different versions) x86_64

debora-ito commented 1 year ago

@MichalZalewskiRASP acknowledged.

We will investigate but, as you mentioned, we also find this hard to reproduce. If you or anyone else is experiencing this issue and can reproduce it reliably, please send us repro code.

debora-ito commented 1 year ago

@MichalZalewskiRASP

A change (#4402) was released in SDK version 2.20.144, please try it out and let us know if you still see the exception.

MichalZalewskiRASP commented 1 year ago

Hi @debora-ito, I have just deployed the application with the new dependency. I need a few days to check, as this bug happens once every few days. I will report back. Thank you for taking care of my request.

antovespoli commented 1 year ago

Hi @debora-ito, I validated the fix and it has been holding correctly since applying it a few weeks back. This is to confirm that the fix is effective.