helidon-io / helidon

Java libraries for writing microservices
https://helidon.io
Apache License 2.0

Helidon http server goes down in the scenario being proxy layer for high amount of concurrent requests #8482

Closed RassulYunussov closed 7 months ago

RassulYunussov commented 7 months ago

The Helidon http server goes down during load test in the scenario of being a proxy layer.

Environment Details


Problem Description

The purpose of the test was to compare the throughput of the Helidon HTTP server with the results I previously got for the same scenario on tomcat/netty/go: https://github.com/filipemunhoz/performance-api-webflux-vs-mvc-vs-golang/pull/2

For that, the quickstart project from the Helidon examples was used and adapted for the same scenario:

At 2000 RPS, the Helidon HTTP server becomes unresponsive and doesn't recover.

Source: https://github.com/RassulYunussov/Helidon-benchmark
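For readers unfamiliar with the setup, the "proxy layer" scenario (a server that answers each incoming request by calling a slower downstream endpoint) can be sketched with JDK classes only. This is an illustration and not the code from the repository linked above: it uses `com.sun.net.httpserver.HttpServer` and `java.net.http.HttpClient` rather than Helidon or the MicroProfile Rest Client, and the class name `ProxySketch`, the paths `/delay` and `/performance`, the fixed delay, and the pool sizes are all assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ProxySketch {

    // Hypothetical slow downstream: waits delayMillis, then answers "done".
    static HttpServer startBackend(long delayMillis, ExecutorService pool) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/delay", exchange -> {
            try {
                Thread.sleep(delayMillis);
                byte[] body = "done".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                exchange.close();
            }
        });
        server.setExecutor(pool);
        server.start();
        return server;
    }

    // Proxy layer: every incoming request triggers one blocking downstream call
    // made through a single shared HttpClient instance.
    static HttpServer startProxy(HttpClient client, URI backendUri, ExecutorService pool) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        HttpRequest downstream = HttpRequest.newBuilder(backendUri).GET().build();
        server.createContext("/performance", exchange -> {
            try {
                HttpResponse<byte[]> response =
                        client.send(downstream, HttpResponse.BodyHandlers.ofByteArray());
                byte[] body = response.body();
                // For this server API, -1 means "no response body".
                exchange.sendResponseHeaders(response.statusCode(), body.length == 0 ? -1 : body.length);
                if (body.length > 0) {
                    try (OutputStream out = exchange.getResponseBody()) {
                        out.write(body);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                exchange.sendResponseHeaders(502, -1);
            } finally {
                exchange.close();
            }
        });
        server.setExecutor(pool);
        server.start();
        return server;
    }

    // Sends `requests` sequential calls through the proxy and counts the successful ones.
    static int run(int requests, long delayMillis) throws IOException, InterruptedException {
        ExecutorService backendPool = Executors.newFixedThreadPool(8);
        ExecutorService proxyPool = Executors.newFixedThreadPool(8);
        HttpClient client = HttpClient.newHttpClient();
        HttpServer backend = null;
        HttpServer proxy = null;
        try {
            backend = startBackend(delayMillis, backendPool);
            URI backendUri = URI.create("http://127.0.0.1:" + backend.getAddress().getPort() + "/delay");
            proxy = startProxy(client, backendUri, proxyPool);
            URI proxyUri = URI.create("http://127.0.0.1:" + proxy.getAddress().getPort() + "/performance");
            HttpRequest request = HttpRequest.newBuilder(proxyUri).GET().build();
            int ok = 0;
            for (int i = 0; i < requests; i++) {
                HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() == 200 && "done".equals(response.body())) {
                    ok++;
                }
            }
            return ok;
        } finally {
            if (proxy != null) proxy.stop(0);
            if (backend != null) backend.stop(0);
            backendPool.shutdown();
            proxyPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(10, 100) + " of 10 requests succeeded");
    }
}
```

The sketch only sends requests one at a time, so it shows the shape of the scenario rather than reproducing the failure; a load generator such as vegeta pointed at the proxy port would be needed for that.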

spericas commented 7 months ago

@RassulYunussov I suggest that you switch and use Helidon SE instead of Helidon MP. Helidon MP is based on the Jakarta set of specifications and requires CDI and other libraries that are not ideal to test performance (especially not compared to more "basic" frameworks). You can generate a quickstart for SE as well. Having said that, I will still try to reproduce and investigate the problem you report here.

spericas commented 7 months ago

I've tried this on my local MacBook (older model). I don't actually see the server hanging, but I do see connection resets at higher rates in vegeta. It seems to work fine up to 200 or so. Above that, some connections get closed and you see client errors,

Get "http://localhost:8084/performance-helidon?delay=100": read tcp 127.0.0.1:60968->127.0.0.1:8084: read: connection reset by peer

and server exceptions,

Caused by: java.net.SocketException: Connection reset by peer
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682)
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:598)
        at java.base/java.net.Socket.connect(Socket.java:751)
        at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:178)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:531)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:636)
        at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:280)
        at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:386)
        at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:408)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1304)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1237)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1123)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1052)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1675)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1599)
        at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:531)
        at org.glassfish.jersey.client.internal.HttpUrlConnector._apply(HttpUrlConnector.java:423)
        at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:268)
        ... 62 more

We can dig a bit deeper, but given that we should probably be using Helidon SE, it may not be worth our time. We should also try this on Linux. Again, the server does not hang for me; that could be something specific to the M2 HW.

My suggestion is to switch to Helidon SE and try this again, and also to try it on Linux. I'm attaching my (slightly simplified) project that uses Helidon 4.0.6.

Helidon-benchmark.zip

RassulYunussov commented 7 months ago

Hi @spericas !

Thank you for your suggestion. But I believe the issue will repeat with Helidon SE. The main problem is with the HTTP client. It looks like it acquires all of the ports on both ends and doesn't release them.

It is also worth adding that after the load completes, the server doesn't recover, so single requests against the server still fail.

romain-grecourt commented 7 months ago

You're using MicroProfile Rest Client, which under the hood uses Jersey's default client connector, based on HttpURLConnection.

Performance testing on macOS makes little sense and isn't worth our time.

Please use Linux and also ideally Helidon WebServer + Helidon WebClient. If you can reproduce the issue we can prioritize accordingly.

spericas commented 7 months ago

@RassulYunussov As Romain says, you should use WebClient instead. Let us know if you need help.

RassulYunussov commented 7 months ago

Hi @spericas Yes, I replaced it with WebClient, and now everything works. Thank you!

I've updated my repo with WebClient.

Does this mean that it is not recommended to use "MicroProfile Rest Client, which means under the hood Jersey's default client connector based on HttpURLConnection"?

spericas commented 7 months ago

@RassulYunussov Please switch to Helidon SE as well for perf testing. I believe HttpURLConnection has limits and may be problematic in high-throughput scenarios. Closing this issue; if needed, we can follow up on Slack.