reactor / reactor-netty

TCP/HTTP/UDP/QUIC client/server with Reactor over Netty
https://projectreactor.io
Apache License 2.0

Connection Closed Prematurely #413

Closed bcelenk closed 5 years ago

bcelenk commented 6 years ago

Hi,

I've come across the Connection closed prematurely problem; reproducing it is trivial. Below is the stack trace.

java.io.IOException: Connection closed prematurely
    at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:269) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
    at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:113) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_171]

I'm using vanilla Spring Cloud Gateway and WebFlux apps, both generated by Spring Initializr with no modifications, except that I added a UUID to represent each user's session.

Load is generated using Gatling at 2000 rps. Below is the Gatling Scala simulation.


import io.gatling.core.Predef._
import io.gatling.http.Predef._
import io.gatling.http.request.builder.HttpRequestBuilder.toActionBuilder
import scala.concurrent.duration._
import java.util.UUID

class GatewayLoadTest extends Simulation {

  object UuidFeeder {
    val feeder = Iterator.continually(Map("uuid" -> java.util.UUID.randomUUID.toString()))
  }

  val theHttpProtocolBuilder = http
    .baseURL("http://localhost:8080")

  val theScenarioBuilder = scenario("sample")
    .feed(UuidFeeder.feeder)
    .forever() {
      exec(http("gateway")
        .get("/premature/${uuid}")
        .header("gatling-user-id", "${uuid}")
        .check(status.find.in(200)))
    }

  setUp(
    theScenarioBuilder.inject(constantUsersPerSec(100) during (120 seconds))
  )
    .throttle(
      reachRps(2000) in (1 seconds),
      holdFor(120 seconds)
    )
    .protocols(theHttpProtocolBuilder)
}

After ~11k requests, the exception occurred. This number varies between tests.

Gateway/reactor-netty's context when the exception occurred: gateway-premature-exception

The origin is unaware of the exception.

Wireshark HTTP packets: the user's first two requests were delegated to the origin, but the 3rd one wasn't forwarded. premature-wireshark

Last HTTP packet's details

Frame 275732: 210 bytes on wire (1680 bits), 210 bytes captured (1680 bits) on interface 0
    Interface id: 0 (lo0)
        Interface name: lo0
    Encapsulation type: NULL/Loopback (15)
    Arrival Time: Aug  2, 2018 16:43:41.578082000 +03
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1533217421.578082000 seconds
    [Time delta from previous captured frame: 0.000009000 seconds]
    [Time delta from previous displayed frame: 0.345598000 seconds]
    [Time since reference or first frame: 233.942944000 seconds]
    Frame Number: 275732
    Frame Length: 210 bytes (1680 bits)
    Capture Length: 210 bytes (1680 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: null:ip:tcp:http]
    [Coloring Rule Name: HTTP]
    [Coloring Rule String: http || tcp.port == 80 || http2]
Null/Loopback
    Family: IP (2)
Internet Protocol Version 4, Src: 127.0.0.1, Dst: 127.0.0.1
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        0000 00.. = Differentiated Services Codepoint: Default (0)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 206
    Identification: 0x0000 (0)
    Flags: 0x02 (Don't Fragment)
        0... .... = Reserved bit: Not set
        .1.. .... = Don't fragment: Set
        ..0. .... = More fragments: Not set
    Fragment offset: 0
    Time to live: 64
    Protocol: TCP (6)
    Header checksum: 0x0000 [validation disabled]
    [Header checksum status: Unverified]
    Source: 127.0.0.1
    Destination: 127.0.0.1
    [Source GeoIP: Unknown]
    [Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 57374, Dst Port: 8080, Seq: 309, Ack: 299, Len: 154
    Source Port: 57374
    Destination Port: 8080
    [Stream index: 6157]
    [TCP Segment Len: 154]
    Sequence number: 309    (relative sequence number)
    [Next sequence number: 463    (relative sequence number)]
    Acknowledgment number: 299    (relative ack number)
    1000 .... = Header Length: 32 bytes (8)
    Flags: 0x018 (PSH, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 1... = Push: Set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
        [TCP Flags: ·······AP···]
    Window size value: 12750
    [Calculated window size: 408000]
    [Window size scaling factor: 32]
    Checksum: 0xfec2 [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - Timestamps: TSval 458370685, TSecr 458370364
            Kind: Time Stamp Option (8)
            Length: 10
            Timestamp value: 458370685
            Timestamp echo reply: 458370364
    [SEQ/ACK analysis]
        [iRTT: 0.000137000 seconds]
        [Bytes in flight: 154]
        [Bytes sent since last PSH flag: 154]
    TCP payload (154 bytes)
Hypertext Transfer Protocol
    GET /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1 HTTP/1.1\r\n
        [Expert Info (Chat/Sequence): GET /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1 HTTP/1.1\r\n]
            [GET /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1 HTTP/1.1\r\n]
            [Severity level: Chat]
            [Group: Sequence]
        Request Method: GET
        Request URI: /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1
        Request Version: HTTP/1.1
    gatling-user-id: 43d91f31-5fc7-4c65-81eb-84d9865c2ff1\r\n
    Host: localhost:8080\r\n
    Accept: */*\r\n
    \r\n
    [Full request URI: http://localhost:8080/premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1]
    [HTTP request 3/3]
    [Prev request in frame: 270045]

TCP stream of the last HTTP packet (produced using right-click > Follow TCP Stream): wireshark-as-tcp-stream

All apps run on my personal MacBook; the same exception also occurs on EC2/Linux (one instance per gateway/origin) with genuine clients.

MacBook specs: macOS High Sierra 10.13.6 (17G65), 2.2 GHz Intel Core i7, 16 GB 1600 MHz DDR3. Tried with 3 different JDKs: jdk1.8.0_131, jdk1.8.0_171, jdk1.8.0_181.

This issue is reported directly against reactor-netty.

violetagg commented 6 years ago

@bcelenk Can you try different settings for SO_BACKLOG? The default is 1000, so try to increase/decrease this and tell us the result.

Customise it with the code below

import io.netty.channel.ChannelOption;
import org.springframework.boot.web.embedded.netty.NettyReactiveWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.stereotype.Component;

@Component
public class MyNettyReactiveWebServerCustomizer
                implements WebServerFactoryCustomizer<NettyReactiveWebServerFactory> {
    @Override
    public void customize(NettyReactiveWebServerFactory factory) {
        factory.addServerCustomizers(
                        builder -> builder.option(ChannelOption.SO_BACKLOG, 2000));
    }
}
violetagg commented 6 years ago

@bcelenk Were you able to experiment with the TCP backlog configuration? I was wondering whether this is something like https://github.com/spring-cloud/spring-cloud-gateway/issues/228#issuecomment-399970521. I tried to reproduce it on my local machine, but the results are a bit different from yours:

================================================================================
---- Global Information --------------------------------------------------------
> request count                                     238668 (OK=235894 KO=2774  )
> min response time                                      0 (OK=0      KO=0     )
> max response time                                   1023 (OK=1023   KO=0     )
> mean response time                                     8 (OK=8      KO=0     )
> std deviation                                         23 (OK=23     KO=0     )
> response time 50th percentile                          4 (OK=4      KO=0     )
> response time 75th percentile                          5 (OK=5      KO=0     )
> response time 95th percentile                         14 (OK=14     KO=0     )
> response time 99th percentile                        106 (OK=107    KO=0     )
> mean requests/sec                                1956.295 (OK=1933.557 KO=22.738)
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                        235884 ( 99%)
> 800 ms < t < 1200 ms                                  10 (  0%)
> t > 1200 ms                                            0 (  0%)
> failed                                              2774 (  1%)
---- Errors --------------------------------------------------------------------
> j.n.ConnectException: Failed to open a socket.                   2624 (94.59%)
> j.n.ConnectException: connection timed out: localhost/127.0.0.    150 ( 5.41%)
1:8080
================================================================================
bcelenk commented 6 years ago

Hi @violetagg

Sorry about the delay, I'm on vacation. While investigating the related issues, I had already seen that particular issue, switched the origin to Tomcat, and set maxKeepAliveRequests to -1, which worked until file descriptors were saturated, just like in your case. That is the expected behaviour, since nothing goes wrong until the limit is hit. I also increased the origin's backlog and gave it a shot (by the way, there are lots of warning logs Unknown channel option 'SO_BACKLOG' for channel ...; probably WebServerFactoryCustomizer doesn't consider the channel type, i.e. whether it's a child or parent channel). With that I couldn't see the premature-close error, but to be honest I think both modifying the backlog and the Tomcat options are red herrings (I'll share another reproducer at the end of the message which still causes exceptions with the modified backlog value).
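For reference, a minimal sketch (not part of the original report) of how the Tomcat maxKeepAliveRequests change mentioned above could be applied, assuming a servlet-based Tomcat origin on Spring Boot 2.x; the class name is made up:

import org.apache.coyote.http11.AbstractHttp11Protocol;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.stereotype.Component;

@Component
public class TomcatKeepAliveCustomizer
        implements WebServerFactoryCustomizer<TomcatServletWebServerFactory> {

    @Override
    public void customize(TomcatServletWebServerFactory factory) {
        factory.addConnectorCustomizers(connector -> {
            // -1 removes the per-connection keep-alive request limit (the change described above)
            if (connector.getProtocolHandler() instanceof AbstractHttp11Protocol) {
                ((AbstractHttp11Protocol<?>) connector.getProtocolHandler())
                        .setMaxKeepAliveRequests(-1);
            }
        });
    }
}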

There are various reasons that make me think they are red herrings.

First of all, we are using the cloud gateway/reactor-netty under only partial load, far below its capabilities: 85k requests per hour, precisely. Even at this low rate, we intermittently observe Connection closed prematurely and SSLEngine closed already (https://github.com/spring-cloud/spring-cloud-gateway/issues/480) exceptions (sometimes the SSLEngine exception occurs for ~150 or more consecutive requests).

The second reason is that we're using our cloud provider's load balancer in front of our origin, which is proprietary software, so we can't fine-tune these kinds of backlog/connection settings. At the end of the day, one might want to send requests to external systems using a pooled TcpClient, where these kinds of solutions are not applicable.

What I'm pretty sure about is that something is wrong with the pooled TcpClient. At first, I thought the problem was with the elastic pool. However, I could reproduce the same exceptions with a fixed pool with a high connection limit (which converges to the elastic pool). So I came up with a reproducer to pinpoint the problem.

Basically, this reproducer creates many channels in the pool, then restarts the origin, which makes all of these pooled channels invalid/useless. Then another wave of requests to the gateway results in the mentioned exceptions.

First, let's add a delay to the origin's handler method, so that the TcpClient's elastic pool creates a new channel for each request to deal with the load, since all existing channels remain acquired for that second. Example handler method:

    @RequestMapping("/premature/{userId}")
    @ResponseStatus(HttpStatus.OK)
    public Flux<Long> premature() {
        return Flux.just(1L).delayElements(Duration.ofSeconds(1L));
    }

Using lsof, monitor the count of open connections between gateway and origin (7777 is the origin's port in this case):

repeat 10000 lsof -i tcp:7777 | wc -l

Then create channels in the pool using the below wrk2 command:

wrk2 -H "gatling-request-id: test" -t8 -c1000 -d10s -R1 http://localhost:9009/premature/test

There shouldn't be any exceptions up to this point, and lsof shows plenty of connections:

pool-has-channels

Then restart the origin, which will close the connections by sending FIN packets.

pool-empty

Because of the nature of TCP, the reactor-netty client's pool still holds these invalidated channels. However, these invalid channels should be discarded by the connection pool's health check when a channel is acquired, so this should not be a problem. Yet the exception occurs. If I enable SSL, the mentioned SSL exception is thrown instead. So something is wrong here.

2018-08-21 16:17:47.396 ERROR 77211 --- [ctor-http-nio-8] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

java.io.IOException: Connection closed prematurely
    at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:269) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
    at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:113) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
violetagg commented 6 years ago

@bcelenk Your last example was extremely helpful. Thanks for that. The issue is that when we drain the last element from the queue that holds the request messages (sent by the client)/response messages (sent by the server) we should not schedule the flush operation but invoke it immediately.
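For illustration only, a hedged conceptual sketch of that difference (this is not the actual reactor-netty code; the class, method, and queue are made up): flush inline as soon as the last queued message is written, instead of scheduling the flush as a separate event-loop task that may run after the pooled channel has already been released.

import java.util.Queue;

import io.netty.channel.ChannelHandlerContext;

final class DrainSketch {

    // Write every pending message; once the queue becomes empty, flush immediately
    // on the event loop rather than scheduling the flush for later.
    void drain(ChannelHandlerContext ctx, Queue<Object> pending) {
        Object msg;
        while ((msg = pending.poll()) != null) {
            ctx.write(msg);
            if (pending.isEmpty()) {
                ctx.flush();
            }
        }
    }
}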

the corresponding logs

2018-08-22 17:06:03.978 DEBUG 39805 --- [ctor-http-nio-5] r.i.n.channel.ChannelOperationsHandler   : [id: 0xe05650db, L:/127.0.0.1:52103 - R:localhost/127.0.0.1:8060] Writing object DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
GET /premature/test HTTP/1.1
gatling-request-id: test
Forwarded: proto=http;host="localhost:8080";for="0:0:0:0:0:0:0:1:52020"
X-Forwarded-For: 0:0:0:0:0:0:0:1
X-Forwarded-Proto: http
X-Forwarded-Port: 8080
X-Forwarded-Host: localhost:8080
host: localhost:8060
...
2018-08-22 17:06:03.979 DEBUG 39805 --- [ctor-http-nio-5] r.i.n.resources.DefaultPoolResources     : [id: 0xe05650db, L:/127.0.0.1:52103 ! R:localhost/127.0.0.1:8060] Channel released, now 137 active connections
...
2018-08-22 17:06:03.983 DEBUG 39805 --- [ctor-http-nio-5] r.ipc.netty.channel.ChannelOperations    : [id: 0xe05650db, L:/127.0.0.1:52103 ! R:localhost/127.0.0.1:8060] An outbound error could not be processed

java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_141]
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_141]
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_141]
    at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[na:1.8.0_141]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_141]
    at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:403) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:360) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:901) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1396) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.logging.LoggingHandler.flush(LoggingHandler.java:265) [netty-handler-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.flush(CombinedChannelDuplexHandler.java:533) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.flush(CombinedChannelDuplexHandler.java:358) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at reactor.ipc.netty.channel.ChannelOperationsHandler.lambda$scheduleFlush$0(ChannelOperationsHandler.java:335) [reactor-netty-0.7.9.BUILD-SNAPSHOT.jar:0.7.9.BUILD-SNAPSHOT]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_141]

Do you think you can grab #419 and test the change?

bcelenk commented 6 years ago

@violetagg Thank you for spotting that. Unfortunately, I'm still experiencing the same exception when trying the last example. I've cherry-picked #419 onto the 0.7.x branch (also tried with 0.7.8.RELEASE as the base) and applied it to both gateway and origin. If the error is not reproducible in your case, my humble suggestion would be to enable SSL when connecting to the origin. In that case, both Connection closed prematurely and SSLEngine closed already occur, and the error ratio is ~100% in my case. I have a hunch that both exceptions are related. Below is an example log excerpt.

2018-08-22 23:41:39.793 ERROR 35017 --- [ctor-http-nio-4] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

java.io.IOException: Connection closed prematurely
    at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:271) ~[reactor-netty-0.7.9.BUILD-SNAPSHOT.jar:0.7.9.BUILD-SNAPSHOT]
    at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:115) ~[reactor-netty-0.7.9.BUILD-SNAPSHOT.jar:0.7.9.BUILD-SNAPSHOT]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1028) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]

2018-08-22 23:41:39.913 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
    ... 10 common frames omitted

2018-08-22 23:41:39.986 ERROR 35017 --- [ctor-http-nio-6] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
    ... 10 common frames omitted

2018-08-22 23:41:40.068 ERROR 35017 --- [ctor-http-nio-5] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.068 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.069 ERROR 35017 --- [ctor-http-nio-7] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.073 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.074 ERROR 35017 --- [ctor-http-nio-1] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.085 ERROR 35017 --- [ctor-http-nio-7] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.092 ERROR 35017 --- [ctor-http-nio-7] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.095 ERROR 35017 --- [ctor-http-nio-6] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
    ... 10 common frames omitted

2018-08-22 23:41:40.096 ERROR 35017 --- [ctor-http-nio-1] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.225 ERROR 35017 --- [ctor-http-nio-8] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
    ... 10 common frames omitted

2018-08-22 23:41:41.050 ERROR 35017 --- [ctor-http-nio-2] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.083 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.089 ERROR 35017 --- [ctor-http-nio-2] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.138 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.138 ERROR 35017 --- [ctor-http-nio-1] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.146 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.146 ERROR 35017 --- [ctor-http-nio-4] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]
dave-fl commented 6 years ago

Hi,

I've seen some mentions of problems related to pooling - does this fall into the same category?

violetagg commented 6 years ago

@dave-fl Yes, the problem that I described above is related to the pooling.

dave-fl commented 6 years ago

@violetagg Will the use of netty-tcnative help here?

mulderbaba commented 6 years ago

@dave-fl It won't. I think the reason is the pool management itself. I tried @bcelenk 's sample as well, and if the origin restarts, the gateway doesn't do a proper health check on the previously opened socket connections while trying to re-use them when handshaking with the new origin. @violetagg do you have any comments on this one? If you can point me at the relevant classes, I can also take a deep look.

dave-fl commented 6 years ago

@mulderbaba

At any point do the connections move out of the invalid state and become usable again, e.g. can this be worked around with a retry?

I’ve also read that a fixed pool can address this (not sure why).

violetagg commented 6 years ago

Hey,

I added additional debugging to the 0.7.9.BUILD-SNAPSHOT. Please execute the following:

Thanks a lot, Violeta

mulderbaba commented 6 years ago

Hey @violetagg,

For both origin implementations I experienced SSLEngine closed already exceptions, and the count of inactive connections wasn't 0. I'm attaching the log output of my gateway for both executions; X, Y.

Wireshark captures are from the origin machine.

violetagg commented 6 years ago

@mulderbaba There was one use case where we missed decrementing the inactive counter (this is when the connection is created but cannot be acquired because of some exception; in my case it was SocketException: Connection reset by peer). I fixed that, so can you try with the new snapshot again? Can you also enable one additional logger, logging.level.reactor.ipc.netty.http.client.HttpClientOperations=DEBUG, so that when SSLEngine closed already happens we will know the port number and can then find the records in the tcpdump? On my end I always see the server initiate the connection close (when SSLEngine closed already happens); when the origin is stopped, all the connections are closed, and when I start the origin again I can see that new connections are created.

2018-08-31 08:17:07.176 DEBUG 8785 --- [ctor-http-nio-3] r.i.n.resources.DefaultPoolResources     : [id: 0x3f7cffa5, L:/127.0.0.1:59578 ! R:localhost/127.0.0.1:8060] Channel closed, now 0 active connections and 0 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-1] r.i.n.resources.DefaultPoolResources     : [id: 0xe87e160d] Created new pooled channel, now 0 active connections and 5 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-5] r.i.n.resources.DefaultPoolResources     : [id: 0x275ebcb8] Created new pooled channel, now 0 active connections and 4 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-7] r.i.n.resources.DefaultPoolResources     : [id: 0x236088be] Created new pooled channel, now 0 active connections and 4 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-8] r.i.n.resources.DefaultPoolResources     : [id: 0x3d488f83] Created new pooled channel, now 0 active connections and 4 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-6] r.i.n.resources.DefaultPoolResources     : [id: 0x8ec97034] Created new pooled channel, now 0 active connections and 4 inactive connections
mulderbaba commented 6 years ago

@violetagg I made 2 runs with the reactor-netty based gateway and origin, using the latest 0.7.x snapshot.

The first run contains all logs in debug mode. Here are the gateway logs, origin logs, and origin tcpdump: gateway_log_01.log gateway_log_02.log origin_log.log origin_tcpdump.pcapng.zip

The second run only has HttpClientOperations=DEBUG set. Here are the gateway logs and origin tcpdump: gateway_simplified.log wireshark-short-last.pcapng.zip

JCASTANO commented 6 years ago

@violetagg Good afternoon, is there an estimated date for a solution to this problem? We are starting a project with Spring WebFlux and we are a little concerned about this issue because we have a large volume of users. Greetings and many thanks.

asafbruner commented 6 years ago

Hi @violetagg, I join the comment above; we are also suffering from this issue and would love to know if there is progress on it. Thanks.

fabianedl777 commented 6 years ago

Hi @violetagg, good afternoon:

I am working on a project with WebFlux and we are using WebClient to make HTTP requests to external services. At the moment we hit the problem described in this ticket when we make 100 concurrent requests.

I performed the following tests. When I create the WebClient with a PoolResources of type elastic, it generates the error:

this.webClient = webClientBuilder.clientConnector(new ReactorClientHttpConnector(options -> {
        options.poolResources(PoolResources.elastic("httpPool")).compression(true);
    })).build();

java.io.IOException: Connection closed prematurely

Additionally, when I create the WebClient with a fixed PoolResources, everything works fine for me:

this.webClient = webClientBuilder.clientConnector(new ReactorClientHttpConnector(options -> {
        options.poolResources(PoolResources.fixed("httpPool")).compression(true);
    })).build();

I would like to know whether it is correct to use the fixed PoolResources, and what the implications of using fixed versus elastic are.

Thank you very much.
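For reference, a hedged sketch of sizing a fixed pool explicitly (the configuration class, pool name, and numbers are illustrative; the three-argument PoolResources.fixed(name, maxConnections, acquireTimeout) overload is assumed from the 0.7.x API):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.ipc.netty.resources.PoolResources;

@Configuration
public class WebClientConfig {

    @Bean
    public WebClient webClient(WebClient.Builder builder) {
        // Fixed pool with an explicit connection limit and acquire timeout (ms):
        // requests beyond the limit wait up to the timeout for a free channel.
        return builder
                .clientConnector(new ReactorClientHttpConnector(options ->
                        options.poolResources(PoolResources.fixed("httpPool", 500, 45_000))
                               .compression(true)))
                .build();
    }
}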

daemonsoft commented 6 years ago

Hi @violetagg, I just want to join the comments above asking about a possible solution or workaround for this issue. Thanks in advance.

vpavlyuk commented 6 years ago

Hi @violetagg Is there any possible solution or a configuration workaround to try? We are seeing this issue every time we attempt to put spring cloud gateway in front of one of our higher traffic services (which has occasional 10ms long usage bursts of 100-500 requests within the 10ms, but otherwise serves about 1000-2000 req/min). The "Connection closed prematurely" exception shows up with a somewhat stable cadence of 1 event per 3-7 minutes, never closer than 3 minutes apart, without a direct timing correlation to the usage bursts. Switching to fixed with a huge acquireTimeout didn't help.

ssko1 commented 6 years ago

@vpavlyuk can you see if adding a retry

.retry(e -> e instanceof IOException) // or some other predicate

helps? It seems to be an acceptable workaround.
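A hedged sketch of that workaround attached to a WebClient call (the class name, base URL, and path are illustrative, not from this issue):

import java.io.IOException;

import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

public class RetryingBackendCall {

    private final WebClient client = WebClient.create("http://localhost:7777");

    // Resubscribes (retries) whenever the failure is an IOException, e.g. the
    // "Connection closed prematurely" surfaced by the pooled client.
    public Mono<String> fetch(String id) {
        return client.get()
                .uri("/premature/{id}", id)
                .retrieve()
                .bodyToMono(String.class)
                .retry(e -> e instanceof IOException);
    }
}

In practice the number of retries would likely be bounded rather than left unlimited.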

violetagg commented 6 years ago

Hi,

Are you able to try the fix in #446 and tell us the result?

Thanks, Violeta

dave-fl commented 6 years ago

@violetagg Is #446 only for 0.7.x, or for 0.8.x too?

violetagg commented 6 years ago

@dave-fl the commit is now in 0.7.10.BUILD-SNAPSHOT and 0.8.1.BUILD-SNAPSHOT

violetagg commented 6 years ago

Hi,

Can you provide feedback for the fix?

Thanks, Violeta

bcelenk commented 6 years ago

Hi @violetagg The fix that you've sent resolved that exception; now the error shows up as connection reset by peer. The problem I'm trying to solve is the resets themselves; I'm not sure whether they're related to reactor-netty or not.

violetagg commented 6 years ago

@bcelenk And the Connection reset by peer, is it an IOException or a SocketException? What does the tcp dump show?

bcelenk commented 6 years ago

The dump shows the server sends the RST; below is the exception:

java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_131]
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_131]
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_131]
    at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_131]
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_131]
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:288) ~[netty-buffer-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1108) ~[netty-buffer-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347) ~[netty-transport-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148) [netty-transport-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:628) [netty-transport-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:563) [netty-transport-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480) [netty-transport-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442) [netty-transport-4.1.29.Final.jar:4.1.29.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) [netty-common-4.1.29.Final.jar:4.1.29.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
smaldini commented 6 years ago

Can you please evaluate whether https://github.com/reactor/reactor-netty/pull/483 fixes the issue? We believe it will, so feel free to re-open this issue if you can still reproduce it on 0.7.11, coming out today.

sam701 commented 6 years ago

Unfortunately, I have exactly the same issue with 0.7.11.RELEASE

java.io.IOException: Connection closed prematurely
    at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:272) ~[reactor-netty-0.7.11.RELEASE.jar!/:0.7.11.RELEASE]
    at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:115) ~[reactor-netty-0.7.11.RELEASE.jar!/:0.7.11.RELEASE]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[netty-common-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:315) ~[netty-transport-native-epoll-4.1.29.Final-linux-x86_64.jar!/:4.1.29.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.29.Final.jar!/:4.1.29.Final]
    at java.base/java.lang.Thread.run(Thread.java:844) ~[na:na]

@smaldini @bcelenk please reopen.

violetagg commented 6 years ago

@sam701 Please provide more information about your use case. We do not see this issue with the reproducible scenario provided by @bcelenk.

sam701 commented 6 years ago

@violetagg Here is the setup:

import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.http.HttpStatus;
import org.springframework.web.reactive.function.client.ClientResponse;
import org.springframework.web.reactive.function.client.WebClient;
import org.springframework.web.reactive.function.server.RouterFunction;
import org.springframework.web.reactive.function.server.RouterFunctions;
import org.springframework.web.reactive.function.server.ServerRequest;
import org.springframework.web.reactive.function.server.ServerResponse;
import reactor.core.publisher.Mono;

import static org.springframework.web.reactive.function.server.RequestPredicates.path;

@SpringBootApplication
@Slf4j
public class Main {

    public static void main(String[] args) {
        SpringApplication.run(Main.class, args);
    }

    @Bean
    RouterFunction<ServerResponse> proxyRouter() {
        return RouterFunctions.route(
                path("/proxy"),
                this::callTarget
        );
    }

    private final WebClient client = WebClient.create();

    private Mono<ServerResponse> callTarget(ServerRequest request) {
        return client
                .method(request.method())
                .uri("http://backend1.alexei.cluster:10010/dummy-service")
                .body(request.bodyToFlux(DataBuffer.class), DataBuffer.class)
                .exchange()
                .flatMap(this::toServerResponse)
                .onErrorResume(error -> {
                    log.warn("Backend error", error);
                    return ServerResponse.status(HttpStatus.BAD_GATEWAY).syncBody("Backend is not available");
                });
    }

    private Mono<ServerResponse> toServerResponse(ClientResponse response) {
        return ServerResponse
                .status(response.statusCode())
                .headers(headers -> headers.addAll(response.headers().asHttpHeaders()))
                .body(response.bodyToFlux(DataBuffer.class), DataBuffer.class);
    }
}

* Here are the gradle dependencies:

dependencies {
    compile 'org.springframework.boot:spring-boot-starter-webflux'
    compile 'org.projectlombok:lombok:1.18.2'
    compile 'io.projectreactor.ipc:reactor-netty:0.7.11.RELEASE'
}

* I run a load test of 200 concurrent calls to the proxy like this:

hey -D data.json http://router1.alexei.cluster:8080/proxy

`data.json` is about 47 KB. `hey` is from https://github.com/rakyll/hey
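The flags other than `-D` are not shown above; a more explicit variant of the same invocation (the POST method and JSON content type are assumptions here, not part of the original command, and `-n 200`/`-c 50` are simply hey's defaults spelled out) would look roughly like:

hey -n 200 -c 50 -m POST -T application/json -D data.json http://router1.alexei.cluster:8080/proxy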

This is a snippet of the response from `hey`

Status code distribution:
  [200] 197 responses
  [502] 3 responses

and here is one of the three exceptions logged by the proxy:

2018-11-06 12:41:15.517 WARN 10177 --- [client-epoll-12] com.test.webflux.Main : Backend error

java.io.IOException: Connection closed prematurely
    at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:272) ~[reactor-netty-0.7.11.RELEASE.jar!/:0.7.11.RELEASE]
    at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:115) ~[reactor-netty-0.7.11.RELEASE.jar!/:0.7.11.RELEASE]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.handler.codec.http.HttpContentDecoder.channelInactive(HttpContentDecoder.java:205) ~[netty-codec-http-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[netty-common-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.29.Final.jar!/:4.1.29.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:315) ~[netty-transport-native-epoll-4.1.29.Final-linux-x86_64.jar!/:4.1.29.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.29.Final.jar!/:4.1.29.Final]
    at java.base/java.lang.Thread.run(Thread.java:844) ~[na:na]

violetagg commented 6 years ago

@sam701 I was able to reproduce the issue - still debugging - definitely not the same issue.

violetagg commented 6 years ago

@sam701 Can you try adding a Content-Length or Transfer-Encoding header when you send the request from the proxy to the backend, and then tell us the result? For example:

private Mono<ServerResponse> callTarget(ServerRequest request) {
    return client
            .method(request.method())
            .uri("http://backend1.alexei.cluster:10010/dummy-service")
---->>>     .header("Content-Length", "46899")
            .body(request.bodyToFlux(DataBuffer.class), DataBuffer.class)
            .exchange()
            .flatMap(this::toServerResponse)
            .onErrorResume(error -> {
                log.warn("Backend error", error);
                return ServerResponse.status(HttpStatus.BAD_GATEWAY).syncBody("Backend is not available");
            });
}
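The Transfer-Encoding variant is the same method with a different header; this is only a sketch, with the "chunked" value being the assumption to verify:

private Mono<ServerResponse> callTarget(ServerRequest request) {
    return client
            .method(request.method())
            .uri("http://backend1.alexei.cluster:10010/dummy-service")
            // assumption: declare the outgoing body as chunked instead of a fixed Content-Length
            .header("Transfer-Encoding", "chunked")
            .body(request.bodyToFlux(DataBuffer.class), DataBuffer.class)
            .exchange()
            .flatMap(this::toServerResponse)
            .onErrorResume(error -> {
                log.warn("Backend error", error);
                return ServerResponse.status(HttpStatus.BAD_GATEWAY).syncBody("Backend is not available");
            });
}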
bcelenk commented 6 years ago

https://github.com/reactor/reactor-netty/pull/483#issuecomment-437323531

marcguilera commented 6 years ago

We use Spring Boot 2.0.3.RELEASE with WebFlux and have tried reactor-netty:0.7.11.RELEASE on the classpath, but it doesn't seem to have resolved the issue. Is there any update on the fix, or a possible workaround for the time being?

violetagg commented 6 years ago

@marcguilera Which use case do you have: the one described by @bcelenk or the one described by @sam701? That exception might have different causes.

marcguilera commented 6 years ago

Our issue is like the one described by @bcelenk. Upgrading to reactor-netty:0.7.11.RELEASE might have helped, since we see fewer errors, but we are still seeing some.

violetagg commented 6 years ago

@marcguilera Are you able to extract a reproducible scenario? It would be very helpful.

lukaszpy commented 5 years ago

Hi,

I have faced the same issue after upgrading to reactor-netty:0.7.11.RELEASE. I have also tried to use NIO instead of Epoll with:

@Bean
WebClient.Builder loadBalancedWebClient() {
    return WebClient.builder()
            .clientConnector(new ReactorClientHttpConnector(options -> options.preferNative(false)));
}

but from the stack trace it still uses Epoll (I don't know if I'm doing this right). I also don't have a scenario that reproduces this error - I get it from time to time (like a unicorn ;) )

java.io.IOException: Connection closed prematurely
at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:269)
at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:113)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.handler.codec.http.HttpContentDecoder.channelInactive(HttpContentDecoder.java:205)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at i.n.c.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420)
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282)
at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947)
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:313)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
... 1 common frames omitted
Wrapped by: java.lang.RuntimeException: java.io.IOException: Connection closed
dejewskipiotr commented 5 years ago

We have exactly the same problem. We tried upgrading to Spring Boot 2.1, which upgraded the reactor-netty version to 0.8.2. The only thing that changed is the exception: instead of IOException we now get reactor.netty.http.client.HttpClientOperations$PrematureCloseException: Connection prematurely closed BEFORE response

It might be unrelated to this issue, but we also started experiencing the "Pooled connection observed an error" error.

sharathsrinivasa commented 5 years ago

I ran into a similar issue when I was trying to load test reactor-netty under different GET/POST scenarios. Here is my little effort to reproduce the issue. It is not 100% consistent, but I do see the above issue. It feels like a concurrency issue (I'm not sure, please correct me here).

Let me know your thoughts.

lukaszpy commented 5 years ago

What is interesting: because I cannot allow this issue to reach production, I have returned to my previous solution, which is not affected by it.

Previous solution:

Current - affected solution:

I don't know if the @LoadBalanced proxy is hiding this issue, but when I search the logs with this solution I cannot find any IOException (previously I had around one per hour).

violetagg commented 5 years ago

@lukaszpy Any reproducible scenario that you can provide will be very helpful. This exception means that the server closed the connection. So far we have identified two scenarios: the first reported by @bcelenk, the second reported by @sam701. Your scenario seems a bit different.

violetagg commented 5 years ago

@sam701 Try with 0.7.12.BUILD-SNAPSHOT, your use case should be OK now.
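
For Gradle users, a minimal sketch for pulling the snapshot build; the snapshot repository URL is an assumption here, so adjust it to wherever Reactor snapshots are published for your setup:

repositories {
    mavenCentral()
    // assumed snapshot repository for Reactor builds
    maven { url 'https://repo.spring.io/libs-snapshot' }
}

dependencies {
    compile 'io.projectreactor.ipc:reactor-netty:0.7.12.BUILD-SNAPSHOT'
}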

lukaszpy commented 5 years ago

@violetagg I have created sample projects (server and client) that work exactly like our services:
https://github.com/lukaszpy/db-client - simulates the clients
https://github.com/lukaszpy/db-netty - simulates access to the DB (server)
The error is hard to reproduce on our test environment; I need to deploy a few client instances and run load tests. I am also wondering about the following scenario: A, B, C, where C is the server and A, B are clients. What happens if A -> C and then B -> C (under load, taking all connections in the pool on server C), and then A tries to reuse a connection (still active, no time to change status)?

sharathsrinivasa commented 5 years ago

@violetagg Was my sample helpful in reproducing this issue?

violetagg commented 5 years ago

@sharathsrinivasa I'm not able to reproduce it using the provided scenario. Can you share your environment details - OS, Java version, etc.?

dejewskipiotr commented 5 years ago

In our case, upgrading to reactor-netty 0.8.3 seems to have fixed all our issues.

violetagg commented 5 years ago

We have a fix for the use case that was reported by @sam701. So, everyone, please try the new versions 0.7.12.RELEASE/0.8.3.RELEASE.

Thank you, Violeta

sam701 commented 5 years ago

@violetagg great news, thanks! I'm currently not working on this topic anymore, but I'll pass the message on.