reactor / reactor-netty

TCP/HTTP/UDP/QUIC client/server with Reactor over Netty
https://projectreactor.io
Apache License 2.0

Hot thread(reactor-http-epoll) appears when WebClient requests back-end services #1954

Closed davidyangss closed 2 years ago

davidyangss commented 2 years ago

🐞 Hot thread(reactor-http-epoll) appears when WebClient requests back-end services

🎁 Why hot threads emerge. In this stress test, the hot thread is reactor-http-epoll-6.

🎁 My advice

🎁 The above is my analysis. It may be accurate; please correct me if it is not. Thanks.

violetagg commented 2 years ago

@davidyangss Can you provide a complete minimal sample (something that we can unzip or git clone, build, and deploy) that reproduces the problem?

davidyangss commented 2 years ago

OK, but it will take some time. I will provide a demo.

davidyangss commented 2 years ago

I put the demo into the repo: https://github.com/davidyangss/demo.git @violetagg

davidyangss commented 2 years ago

This may be another factor. When I tested spring-cloud-gateway 3.1.0, I found the same problem. It is especially obvious when the number of connections exceeds 10,000.

pderop commented 2 years ago

Hi @davidyangss,

Thanks for sharing your sample app.

Are you using "ab" intentionally? This tool only supports HTTP/1.0, and I got tons of TIME_WAIT sockets while load-testing your demo project. Could you use another load-testing tool that supports HTTP/1.1 with keep-alive, such as wrk2, gatling, vegeta, or jmeter?

Let us know, thanks.

davidyangss commented 2 years ago

Hi @pderop,

  1. Intentionally? No! I also use wrk2, and the problem always appears. I added wrk.sh to the repo "https://github.com/davidyangss/demo.git". "Created a new pooled channel" is logged on only one or two threads. As I understand it, channel creation should be evenly distributed among the threads. When I tested spring-cloud-gateway 3.1.0, I found the same problem.
  2. My English is poor, but I still want to describe my analysis of this case. SimpleDequePool is very cleverly designed, but the problem appears when it is combined with ColocatedEventLoopGroup. While one thread A is executing the drainLoop of SimpleDequePool, many other threads call doAcquire, so thread A creates the resources on behalf of those other threads. In reactor-netty, thread A is a reactor-http-nio thread, so a single reactor-http-nio thread ends up creating most of the HTTP connections. Because ColocatedEventLoopGroup uses localLoop, almost all Netty events are then concentrated on that one reactor-http-nio thread.
  3. The above is my opinion. spring-cloud-gateway 3.1.0 (reactor-netty-core-1.0.13.RELEASE) behaves the same way. With reactor-netty-0.8.10.RELEASE, it works very well.
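
The mechanism described in point 2 can be sketched with plain JDK classes. This is only a simplified model of a work-in-progress drain loop, not the actual SimpleDequePool code: the class name `DrainLoopSkew`, the `loop-N` thread names, and the two latches are hypothetical, and the latches exist only to force the interleaving (one thread is draining while the others call acquire) so that the outcome is deterministic.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical model of a pool whose acquire() feeds a single-drainer loop.
class DrainLoopSkew {
    private final Queue<Integer> pending = new ConcurrentLinkedQueue<>();
    private final AtomicInteger wip = new AtomicInteger();
    private final Map<String, Integer> createdBy = new ConcurrentHashMap<>();
    // Test-only latches that force "others acquire while the first thread drains".
    private final CountDownLatch draining = new CountDownLatch(1);
    private final CountDownLatch release = new CountDownLatch(1);

    void acquire(int requestId) {
        pending.add(requestId);
        drain();
    }

    private void drain() {
        // Only the thread that moves wip from 0 to 1 drains; the others return at once.
        if (wip.getAndIncrement() != 0) {
            return;
        }
        int missed = 1;
        for (;;) {
            while (pending.poll() != null) {
                create();
            }
            // Account for acquire() calls that arrived while we were draining.
            missed = wip.addAndGet(-missed);
            if (missed == 0) {
                return;
            }
        }
    }

    private void create() {
        // Record which thread "creates the connection" for this request.
        createdBy.merge(Thread.currentThread().getName(), 1, Integer::sum);
        draining.countDown();
        try {
            release.await(); // blocks only until the other callers have enqueued
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static Map<String, Integer> simulate(int callers) {
        DrainLoopSkew pool = new DrainLoopSkew();
        try {
            Thread first = new Thread(() -> pool.acquire(0), "loop-0");
            first.start();
            pool.draining.await(); // loop-0 now owns the drain loop
            Thread[] others = new Thread[callers - 1];
            for (int i = 1; i < callers; i++) {
                int id = i;
                others[i - 1] = new Thread(() -> pool.acquire(id), "loop-" + i);
                others[i - 1].start();
            }
            for (Thread t : others) {
                t.join(); // each one enqueued its request and returned immediately
            }
            pool.release.countDown();
            first.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return pool.createdBy;
    }

    public static void main(String[] args) {
        System.out.println(simulate(8)); // prints {loop-0=8}
    }
}
```

In this model all eight creations are attributed to `loop-0`, even though eight different threads asked for a resource. If, as the analysis above suggests, each new connection is then bound to the event loop of the thread that created it, that one thread would also handle most of the subsequent I/O.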

pderop commented 2 years ago

Hi @davidyangss,

Thanks for persisting with this issue. I was finally able to reproduce it reliably using the vegeta tool. A fix is now available in 1.0.16-SNAPSHOT; it would be nice if you could give it a try.