bcelenk closed this issue 5 years ago.
In this issue, netty debug logs are shared for this particular problem. They clearly show that the channel between gateway and origin is closed before the request (the 3rd one in my case) is delivered to the origin.
My guess is that it is a reactor-netty issue. @smaldini @violetagg
Agreed, I don't think it's a netty issue.
We are experiencing the same issue. After setting the http client pool type from the default elastic to fixed, it seems we no longer see any premature connection close exceptions. Any comments? @bcelenk @spencergibb
@yesilalperen @spencergibb @smaldini @violetagg @simonbasle According to my tests in a lab environment, I can confirm that changing the gateway's client pool setting from ELASTIC to FIXED solved the "Connection Closed Prematurely" error. Also, as in https://github.com/spring-cloud/spring-cloud-gateway/issues/480, I don't see the "SSLEngine already closed" exception anymore. I think there is a severe bug in reactor-netty's https://projectreactor.io/docs/netty/release/api/reactor/ipc/netty/resources/PoolResources.html#elastic-java.lang.String-
We are also having this issue. We use Spring Boot and WebFlux 5.0.8. We always get the IOException: Connection closed prematurely after 10 minutes without any traffic. Since we don't really want to use a fixed connection pool, the workaround we are using is simply to retry after hitting the issue, and we are waiting for the fix.
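For what it's worth, a minimal sketch of such a retry workaround on a plain WebClient call; the client, URI and retry count here are placeholders, not taken from this thread:

import java.io.IOException;

import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

public class RetryingCall {

    private final WebClient backendClient = WebClient.create("http://localhost:8080");

    // Retry up to 3 times, but only when the pooled connection was closed underneath us.
    public Mono<String> fetchOrders() {
        return backendClient.get()
                .uri("/orders")
                .retrieve()
                .bodyToMono(String.class)
                .retry(3, t -> t instanceof IOException);
    }
}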
Do the connections in the invalid state ever disappear? If they don't, would the retry approach imply that the pool eventually becomes littered with bad state connections, forcing retries on N% of your requests?
Wouldn't a very large fixed pool be the same as elastic?
Hi,
We think we have a fix for this issue. It is available for the moment in Reactor Netty 0.7.10.BUILD-SNAPSHOT/0.8.1.BUILD-SNAPSHOT. It would be great if you could test it and provide feedback.
Thanks, Violeta
We think we've been seeing some unfortunate interplay between this and cloudfoundry/gorouter#212.
It looks like failed requests to spring-cloud-gateway are triggering an event in the gorouter that is causing it to drop all connections, and causing our upstream AWS ALB to mark the gorouter instance as down, effectively bringing down our entire PCF instance.
@cah-andrewfitzgerald were you able to test with the snapshots @violetagg mentioned?
Not yet, need to dig in this week and set up a harness to replicate it (only seeing it intermittently).
@spencergibb another thing to note with this error:
We were watching metrics closely for the gateway during our rollout, and it looked like everything was working fine according to the gateway.requests metrics.
I can file a more detailed issue later.
We are experiencing the same issue. After setting the http client pool type from the default elastic to fixed, it seems we no longer see any premature connection close exceptions. Any comments? @bcelenk @spencergibb
How do you change the HTTP client pool type for the Netty client started as part of Spring Boot?
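One possible way, assuming Spring Cloud Gateway 2.0.x on the reactor-netty 0.7.x line and assuming the auto-configuration backs off to a user-provided HttpClient bean (newer gateway versions also expose spring.cloud.gateway.httpclient.pool.* properties for this); the pool name and size below are arbitrary:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import reactor.ipc.netty.http.client.HttpClient;
import reactor.ipc.netty.resources.PoolResources;

@Configuration
public class GatewayHttpClientConfig {

    // Replace the default ELASTIC pool with a FIXED pool of 500 connections.
    @Bean
    public HttpClient gatewayHttpClient() {
        return HttpClient.create(options ->
                options.poolResources(PoolResources.fixed("gateway-pool", 500)));
    }
}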
Hello,
We have an additional fix available in Reactor Netty 0.7.11.RELEASE. Can you try it and provide feedback?
Thanks, Violeta
When my file-upload-service throws a MaxUploadSizeExceededException, Spring Cloud Gateway throws java.io.IOException: Connection closed prematurely too!
@zhangam Can you provide some simple example? Thanks a lot.
@zhangam Can you provide some simple example? Thanks a lot.
To reproduce this problem, I first turned Spring Cloud Gateway's Hystrix off. Then the zsy-ufile-service microservice is configured with:
spring:
  servlet:
    multipart:
      max-file-size: 20MB
      max-request-size: 20MB
Then I uploaded a file that exceeded the limit. This is the exception from my zsy-ufile-service:
org.springframework.web.multipart.MaxUploadSizeExceededException: Maximum upload size exceeded; nested exception is java.lang.IllegalStateException: org.apache.tomcat.util.http.fileupload.FileUploadBase$SizeLimitExceededException: the request was rejected because its size (53687355) exceeds the configured maximum (20971520) at......
And here's a picture of the Spring Cloud Gateway exception:
@violetagg Could you take a look at this, please?
@zhangam Thanks for the described scenario.
In this scenario the Connection Closed Prematurely is expected: the origin (the Tomcat server) closes the connection on purpose because the file exceeds the limit, and that happens while you are still sending the request body.
However, I extended the exception message so that it is clear that the remote party closed the connection while the request body was being sent. (https://github.com/reactor/reactor-netty/commit/b576b8761e61ebaaf0f885ff3088c9179ea2ac6e)
The latest fix is available with 0.7.13.BUILD-SNAPSHOT/0.8.4.BUILD-SNAPSHOT.
All: please try these versions, and if the issue is gone I think we can close the ticket.
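On the upload scenario above, one way to avoid streaming a request body the origin is guaranteed to reject is to check the declared Content-Length at the gateway before forwarding. A rough sketch; the 20 MB limit, class name and ordering are assumptions, not from this thread:

import org.springframework.cloud.gateway.filter.GatewayFilterChain;
import org.springframework.cloud.gateway.filter.GlobalFilter;
import org.springframework.core.Ordered;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

@Component
public class UploadSizeGuardFilter implements GlobalFilter, Ordered {

    private static final long MAX_UPLOAD_BYTES = 20 * 1024 * 1024; // mirror the origin's 20MB limit

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        long declaredLength = exchange.getRequest().getHeaders().getContentLength();
        if (declaredLength > MAX_UPLOAD_BYTES) {
            // Reject up front instead of streaming a body the origin will refuse mid-transfer.
            exchange.getResponse().setStatusCode(HttpStatus.PAYLOAD_TOO_LARGE);
            return exchange.getResponse().setComplete();
        }
        return chain.filter(exchange);
    }

    @Override
    public int getOrder() {
        return Ordered.HIGHEST_PRECEDENCE;
    }
}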
@yesilalperen @spencergibb @smaldini @violetagg @simonbasle According to my tests in a lab environment, I can confirm that changing the gateway's client pool setting from ELASTIC to FIXED solved the "Connection Closed Prematurely" error. Also, as in #480, I don't see the "SSLEngine already closed" exception anymore. I think there is a severe bug in reactor-netty's https://projectreactor.io/docs/netty/release/api/reactor/ipc/netty/resources/PoolResources.html#elastic-java.lang.String-
I set the gateway's client pool setting from ELASTIC to FIXED, but that did not solve the "Connection Closed Prematurely" error. Why is that?
The latest fix is available with 0.7.13.BUILD-SNAPSHOT/0.8.4.BUILD-SNAPSHOT.
All: please try these versions, and if the issue is gone I think we can close the ticket.
Can you tell me which version of Spring Cloud Gateway that corresponds to? Thanks
@gxl191911 You can update the Reactor Netty version in your application via pom.xml like this:
Spring Cloud Gateway 2.1.0.x (currently 2.1.0.RC3):
<properties>
    <reactor-bom.version>Californium-SR4</reactor-bom.version>
</properties>
Spring Cloud Gateway 2.0.x (currently 2.0.2.RELEASE):
<properties>
    <reactor-bom.version>Bismuth-SR15</reactor-bom.version>
</properties>
With the updated versions of reactor-netty is this still an issue?
@spencergibb @violetagg Yes, we are still seeing this with Bismuth-SR15.
java.io.IOException: Connection closed prematurely
at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:278)
at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420)
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282)
at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1050)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947)
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:315)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
at java.lang.Thread.run(Thread.java:748)
@tony-clarke-amdocs can you open an issue in reactor netty?
[2019-03-26 18:14:59.216] [reactor-http-epoll-8] [ERROR] [reactor.core.publisher.Operators] [Operator called default onErrorDropped]
reactor.core.Exceptions$BubblingException: reactor.netty.http.client.PrematureCloseException: Connection prematurely closed DURING response
at reactor.core.Exceptions.bubble(Exceptions.java:154)
at reactor.core.publisher.Operators.onErrorDropped(Operators.java:512)
at reactor.netty.channel.FluxReceive.onInboundError(FluxReceive.java:343)
at reactor.netty.channel.ChannelOperations.onInboundError(ChannelOperations.java:398)
at reactor.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:258)
at reactor.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:121)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420)
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282)
at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1403)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912)
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:826)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:333)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
at java.lang.Thread.run(Thread.java:748)
Caused by: reactor.netty.http.client.PrematureCloseException: Connection prematurely closed DURING response
[2019-03-26 18:14:59.217] [reactor-http-epoll-8] [ERROR] [reactor.core.publisher.Operators] [Operator called default onErrorDropped]
reactor.core.Exceptions$BubblingException: reactor.netty.http.client.PrematureCloseException: Connection prematurely closed DURING response
(identical stack trace repeated)
Caused by: reactor.netty.http.client.PrematureCloseException: Connection prematurely closed DURING response
We are also having this issue. We use Spring Boot and WebFlux 5.0.8. We always get the IOException: Connection closed prematurely after 10 minutes without any traffic. Since we don't really want to use a fixed connection pool, the workaround we are using is simply to retry after hitting the issue, and we are waiting for the fix.
We are still seeing the same issue after being idle for about 20 minutes: for POST requests the error is "Connection Closed Prematurely", and for GET requests there is no error but an empty response is returned. If we then send a lot of requests, the responses return to normal within about 5s. It seems the connection was somehow broken while idle but Spring Cloud Gateway doesn't know; when new requests come in, it drops the broken connections because of the error and then establishes new connections, which makes things work again. The reactor-netty version is 0.8.9.RELEASE, the gateway version is 2.1.1.RELEASE. @violetagg we saw you were working on this issue, is there any input on this one? Thank you very much.
@zhanglin12345 Did you try to investigate why the other party closes the connection? Any tcp dump?
@zhanglin12345 Did you try to investigate why the other party closes the connection? Any tcp dump?
@violetagg thanks for asking! We have a Spring Cloud Gateway service and a WebFlux downstream service called "order service". When we ran the jar of the gateway service on one host and the jar of the order service on another host, it worked fine, no matter how long we waited or how many requests we sent.
Then we deployed them in Docker containers (base image "openjdk:11-jdk-slim"): first we sent a lot of requests, then stayed idle for about 20 minutes, and then we got the error. We changed the TCP keep-alive interval to 6s and compared the tcp dumps between "JAR only" and the Docker container. We found that the TCP keep-alive packets are missing in the container, because the container doesn't inherit the network kernel parameters from the host. We also didn't see a FIN within those 20 minutes at all, but got a lot of RSTs afterwards. Then we changed the keep-alive interval inside Docker (which is not recommended), but the responses were so slow (5s) that we were not able to reproduce the bug. Still investigating the problem; do you have any clue on this? Thanks very much!
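For reference, a minimal sketch of enabling TCP keep-alive on a reactor-netty 0.8.x HttpClient; how such a client gets wired into the gateway depends on your gateway version, and the probe timing still comes from the kernel (or, as you found, from the container's settings):

import io.netty.channel.ChannelOption;
import reactor.netty.http.client.HttpClient;

public class KeepAliveClientFactory {

    // SO_KEEPALIVE lets the OS probe idle pooled connections so half-dead
    // sockets can be detected instead of failing on the next request.
    static HttpClient keepAliveHttpClient() {
        return HttpClient.create()
                .tcpConfiguration(tcp -> tcp.option(ChannelOption.SO_KEEPALIVE, true));
    }
}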
I found this error when using Spring Cloud Greenwich.SR3. The scenario is a global filter I built to filter out some illegal characters, like this:
@Override
public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
    ServerHttpRequest request = exchange.getRequest();
    ServerHttpResponse response = exchange.getResponse();
    ServerHttpRequestDecorator serverHttpRequestDecorator = new ServerHttpRequestDecorator(request) {
        @Override
        public Flux<DataBuffer> getBody() {
            Flux<DataBuffer> body = super.getBody();
            // Read each buffer, re-serialize the JSON payload and forward a new buffer.
            return body.map(dataBuffer -> {
                byte[] content = new byte[dataBuffer.readableByteCount()];
                dataBuffer.read(content);
                DataBufferUtils.release(dataBuffer);
                String bodyJson = new String(content, Charset.forName("UTF-8"));
                if (bodyJson.equals("")) {
                    errorResult(response); // helper that writes an error response (not shown)
                }
                JSONObject jsonObject = JSON.parseObject(bodyJson);
                String result = jsonObject.toJSONString();
                byte[] bytes = result.getBytes();
                NettyDataBufferFactory nettyDataBufferFactory = new NettyDataBufferFactory(ByteBufAllocator.DEFAULT);
                DataBuffer buffer = nettyDataBufferFactory.allocateBuffer(bytes.length);
                buffer.write(bytes);
                return buffer;
            });
        }

        @Override
        public HttpHeaders getHeaders() {
            HttpHeaders httpHeaders = new HttpHeaders();
            httpHeaders.putAll(super.getHeaders());
            return httpHeaders;
        }
    };
    return chain.filter(exchange.mutate().request(serverHttpRequestDecorator).build());
}
Of course, the code above is just a demo I wrote for testing, and I built only one service to test against. During testing, the first request succeeded, the second produced the error, the third succeeded again, and the fourth failed again. How can I solve this error?
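One thing worth checking with a body-rewriting decorator like the one above (an assumption on my part, not a confirmed diagnosis for this thread): the rewritten body can differ in size from the original, so the forwarded Content-Length no longer matches what is actually sent, and the origin may then drop the connection mid-request. A commonly suggested adjustment is to strip the original Content-Length in the getHeaders() override and let the client send the body chunked, roughly:

@Override
public HttpHeaders getHeaders() {
    HttpHeaders httpHeaders = new HttpHeaders();
    httpHeaders.putAll(super.getHeaders());
    // The body is rewritten in getBody(), so the original Content-Length may be wrong;
    // remove it and fall back to chunked transfer encoding.
    httpHeaders.remove(HttpHeaders.CONTENT_LENGTH);
    httpHeaders.set(HttpHeaders.TRANSFER_ENCODING, "chunked");
    return httpHeaders;
}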
Hi,
I've come across the Connection closed prematurely problem; reproducing it is trivial. Below is the stack trace.
I'm using a raw Spring Cloud Gateway and a WebFlux app, both produced by Spring Initializr with no modifications, except that I added a UUID to represent each user's session.
Load is generated using Gatling at 2000 rps. Below is the Gatling scala file.
After ~11k requests, the exception occurred. This number varies between tests.
Gateway/reactor-netty's context when the exception occurred:
The origin is unaware of the exception.
Wireshark HTTP packets: the user's first two requests were delegated to the origin, the 3rd one wasn't forwarded.
Details of the last HTTP packet:
TCP stream of the last HTTP packet (produced using right-click > Follow TCP Stream):
All apps were run on my personal MacBook; the same exception also occurs on EC2/Linux, with one instance each for gateway and origin and genuine clients.
MacBook specs: macOS High Sierra 10.13.6 (17G65), 2.2 GHz Intel Core i7, 16 GB 1600 MHz DDR3. Tried with 3 different JDKs: jdk1.8.0_131, jdk1.8.0_171, jdk1.8.0_181.