jetty / jetty.project

Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more
https://eclipse.dev/jetty
Other
3.86k stars 1.91k forks source link

Excessive heap consumption by SSLSessionImpl by Jetty Server with TLS 1.3 and long-lived client #12284

Closed samfrown closed 1 month ago

samfrown commented 1 month ago

Jetty version(s) 12.0.13 (all >12.0.3)

Jetty Environment core

Java version/vendor (use: java -version) JVM: OpenJDK 64-Bit Server VM (21+35-LTS, mixed mode, sharing) Java: version 21 2023-09-19 LTS, vendor Amazon.com Inc.

OS type/version Rocky Linux 8/Docker

Description We noticed the issue when the heap usage of a jetty-based service with TLS port was increasing constantly until the max limit. The gathered heap dump shows a lot of SSLSessionImpl objects (more that ssl session cache size limit). All of the "swollen" sessions are sourced from HAProxies.
swolen_sslsessionimpl_dump

Later the same situation was reproduced with just one HAProxy as a client which:

The Eclipse MAT reports the following:

One instance of sun.security.ssl.SSLSessionContextImpl loaded by occupies 129,661,568 (84.23%) bytes. The instance is referenced by org.eclipse.jetty.io.EndPoint$SslSessionData$1 @ 0xebdaab60, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xeac81998. The memory is accumulated in one instance of sun.security.ssl.SSLSessionContextImpl, loaded by , which occupies 129,661,568 (84.23%) bytes.

The SSLSessionImpl objects seems retained under the soft referenced session cache entries and are cleaned up finally when heap reaches the max memory limit. But still this excessive heap consumption looks suspicious.

Also the issue isn't reproduced with Jetty 12.0.3. Before 12.0.3 ssl session data was hold inside the request (SecurityRequestCustomizer). Later the org.eclipse.jetty.io.EndPoint$SslSessionData was introduced to hold the data inside the SSLSession itself, which may be cause the issue.

How to reproduce?

  1. Start Jetty Service with TLS enabled.
  2. Configure HAProxy backend with HTTP get ssl with short interval: "check-ssl inter 100ms"
  3. Don't reload HAProxy for some time while waiting for heap growth.

To mitigate the issue

Either:

  1. Use TLS 1.2
  2. Reload HAProxy or restart client
  3. Use Jetty 12.0.3
samfrown commented 1 month ago

I have a heap dump but it's bigger than allowed attachment size :(

samfrown commented 1 month ago

heapdump-h1only-after-4h-1726149887662_Leak_Suspects.zip heapdump-h1only-after-4h-1726149887662_Top_Components.zip

sbordet commented 1 month ago

@samfrown please send heap dump at: sbordet ~at~ webtide.com.

samfrown commented 1 month ago

@samfrown please send heap dump at: sbordet ~at~ webtide.com.

sent by gmail

sbordet commented 1 month ago

@samfrown just to be sure I understand, you are bombing the server with 1 health check every 100 ms?

What is your configuration (if you have specified one) for the SSLSession cache size and SSLSession timeout? I see from the dump respectively 1000 and 86400 seconds, looks like you changed the default for the cache size?

If we make a PR where we do not store the SslSessionData in the SSLSession, would you be able to try it out?

samfrown commented 1 month ago

Right, it were just HAP health checks (for that particular dump). And, yes, I tried to change cache size and session timeout with -Djavax.net.ssl.sessionTimeout=600 -Djavax.net.ssl.sessionCacheSize=1000.

If we make a PR where we do not store the SslSessionData in the SSLSession, would you be able to try it out?

Yes, I will.

sbordet commented 1 month ago

Just FTR, the SSLSession cache has always been a problem not only for Jetty but for Java in general.

The behavior you see is normal: SSLSession instances are wrapped into SoftReferences, and only when you run low on heap they will be cleared. This mechanism is in Java itself and we cannot do anything about.

Having said that, try this PR: #12288.

samfrown commented 1 month ago

Having said that, try this PR: https://github.com/jetty/jetty.project/pull/12288.

Hi. I sent the dump from PR version by mail. It looks much better.

sbordet commented 1 month ago

See also #4923 for context.