reactor / reactor-netty

TCP/HTTP/UDP/QUIC client/server with Reactor over Netty
https://projectreactor.io
Apache License 2.0

Nginx report error "connect() failed (110: Connection timed out) while connecting to upstream" #1099

Closed spiritme1984 closed 2 years ago

spiritme1984 commented 4 years ago

Describe the bug: My project uses Spring Cloud Gateway as an API gateway, with an Nginx server in front of it. The request flow is "request -> nginx -> springcloud gateway -> Haproxy -> service". Under high concurrency (4000 QPS), Nginx reports the error "connect() failed (110: Connection timed out) while connecting to upstream". If we remove the gateway, the error disappears.

Environment: CPU: 8C; Memory: 8G; Spring Boot: 2.2.6.RELEASE; Spring Cloud: Hoxton.SR3; reactor-netty: 0.9.6; reactor-core: 3.3.4

Sample: I uploaded an example at https://github.com/spiritme1984/springCloudGateway

violetagg commented 4 years ago

@spiritme1984 Do you have Nginx and Spring Gateway on one and the same host? Check this https://github.com/reactor/reactor-netty/issues/1038

spiritme1984 commented 4 years ago

> @spiritme1984 Do you have Nginx and Spring Gateway on one and the same host? Check this #1038

I'm afraid not. They are on different hosts.

violetagg commented 4 years ago

@spiritme1984 that's good. What's the connection pool type that you use - elastic or fixed?

spiritme1984 commented 4 years ago

> @spiritme1984 that's good. What's the connection pool type that you use - elastic or fixed?

I've tried both, and neither seems to help with this error.
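
For readers following along: in Spring Cloud Gateway the outbound connection pool type is selected through configuration properties. The fragment below is a minimal sketch of a fixed pool, not the reporter's actual configuration. The size of 1000 mirrors the fixed pool mentioned later in this thread, and the acquire timeout is an illustrative value.

```yaml
spring:
  cloud:
    gateway:
      httpclient:
        pool:
          type: FIXED            # ELASTIC is the alternative discussed above
          max-connections: 1000  # only applies to the FIXED type
          acquire-timeout: 45000 # milliseconds to wait for a free connection (illustrative)
```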

violetagg commented 4 years ago

@spiritme1984 so without Nginx you are able to execute your load test successfully? Are the numbers as expected, or do you notice some peaks in the response times? Is it possible that you configured Nginx with some aggressive timeouts?

spiritme1984 commented 4 years ago

> @spiritme1984 so without Nginx you are able to execute your load test successfully? Are the numbers as expected or you notice some peaks in the response times? Is it possible that you configured Nginx with some aggressive timeouts?

@violetagg Thanks for the reply. I haven't tested without Nginx, since our Nginx serves all services but only part of them go through the gateway. I've checked the response times: they are tens of milliseconds normally but thousands of milliseconds under high concurrency, and a few requests even timed out (the gateway timeout is 30 s). No error log appeared other than the timeout exception. I also set the Nginx proxy_read_timeout to 60 s. Besides, Grafana showed that under high concurrency the CPU went up to 400% for about 2 minutes and then dropped to 18% (Nginx reported "Connection timed out" during this period), and after about 40 s the CPU went up again. Overall, the CPU graph looks like a sine wave, and Nginx reports the error when the wave is going down...

violetagg commented 4 years ago

> I haven't tested without Nginx

Are you able to test this?

spiritme1984 commented 4 years ago

> > I haven't tested without Nginx
>
> Are you able to test this?

Sure. After we removed the Nginx server, the JMeter test results were as follows:

A: JMeter: number of threads: 2000; connection pool: elastic; types of error:

  1. Non HTTP response code: org.apache.http.conn.HttpHostConnectException/Non HTTP response message: Connect to 172.30.44.231:8766 [/172.30.44.231] failed: Connection timed out (Connection timed out)
  2. Non HTTP response code: java.net.SocketException/Non HTTP response message: Connection reset

B: JMeter: number of threads: 2000; connection pool: fixed with 1000; types of error:

  1. Non HTTP response code: org.apache.http.conn.HttpHostConnectException/Non HTTP response message: Connect to 172.30.44.231:8766 [/172.30.44.231] failed: Connection timed out (Connection timed out)

In the gateway project we didn't get any error message.
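
As a rough aid for reading these numbers, Little's law (requests in flight = arrival rate x time in system) relates the reported load to the pool size. The sketch below is only back-of-the-envelope arithmetic, not a diagnosis: the 4000 QPS and the pool of 1000 come from this thread, while the class name, the method name, and the sample latencies of 50 ms and 2000 ms are assumptions standing in for "tens" and "thousands" of milliseconds.

```java
public final class InFlightEstimate {

    /**
     * Little's law: requests in flight = arrival rate x time in system.
     * Integer arithmetic, rounded up to a whole request.
     */
    public static long inFlight(long requestsPerSecond, long latencyMillis) {
        return (requestsPerSecond * latencyMillis + 999) / 1000;
    }

    public static void main(String[] args) {
        long qps = 4000;   // load reported in this thread
        long pool = 1000;  // fixed pool size tried in this thread

        // Assumed sample latencies for the normal and the degraded case.
        for (long latencyMs : new long[] {50, 2000}) {
            long n = inFlight(qps, latencyMs);
            System.out.println(latencyMs + " ms -> " + n + " requests in flight, pool "
                    + (n > pool ? "exceeded" : "sufficient"));
        }
    }
}
```

Under these assumptions, once latency climbs into the seconds range the number of concurrent requests is several times the fixed pool size, so requests start queueing for a connection. Whether that is what happens here cannot be concluded from the thread alone.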

violetagg commented 4 years ago

@spiritme1984 Do you have any custom filters on Spring Gateway, any blocking code etc.?

spiritme1984 commented 4 years ago

> @spiritme1984 Do you have any custom filters on Spring Gateway, any blocking code etc.?

I removed all custom filters except DynamicRouterFilter and didn't find any blocking code. You can check it in the sample project. I suspect that something in reactor-netty may be blocking the connection under high concurrency.

violetagg commented 3 years ago

@spiritme1984 Is it possible to test with the latest releases for Reactor Netty and Spring Gateway?

zkcarterlau commented 3 years ago

Did you solve it? I ran into the same problem when I set up Spring Cloud Gateway as an upstream behind Nginx. Nginx periodically reports "upstream time out" in its error log.

violetagg commented 3 years ago

@zkcarterlau What versions do you use? Do you have a reproducible example?

violetagg commented 2 years ago

Closing this issue as there is not enough information to proceed with the investigation.