wildfly-security / wildfly-openssl

Generic OpenSSL bindings for Java
Apache License 2.0
Wildfly OpenSSL causes a JVM crash under concurrent load #122

Open marpidone-mim opened 2 years ago

marpidone-mim commented 2 years ago

We've been experiencing periodic crashes in our software that we believe to be caused by the Wildfly OpenSSL library.

We've isolated this down to a simple test case that doesn't rely on any of our app-specific code. It starts a separate JVM that runs the server, then spins up N concurrent client threads that each open a connection to the server and send a couple of integers back and forth.
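For readers who want the shape of the client side without opening the repository, here is a minimal sketch of that kind of load driver. It is not the code from the test app: the class name `ConcurrentClients`, the `run` method, and the single-integer echo protocol are assumptions made for illustration, and the server on the other end is assumed to echo each integer back. The `SocketFactory` is passed in, so an `SSLSocketFactory` backed by whichever provider is under test could be supplied.

```java
import javax.net.SocketFactory;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical load driver; names and protocol are illustrative only.
public class ConcurrentClients {

    /**
     * Opens `clients` connections concurrently. Each client sends one int
     * and expects the server to echo it back. Returns the number of clients
     * that completed the round trip successfully.
     */
    static int run(SocketFactory factory, String host, int port, int clients) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                final int value = i;
                tasks.add(() -> {
                    try (Socket s = factory.createSocket(host, port);
                         DataOutputStream out = new DataOutputStream(s.getOutputStream());
                         DataInputStream in = new DataInputStream(s.getInputStream())) {
                        out.writeInt(value);
                        out.flush();
                        return in.readInt() == value;
                    }
                });
            }
            int ok = 0;
            // invokeAll blocks until every client task has finished.
            for (Future<Boolean> f : pool.invokeAll(tasks)) {
                try {
                    if (f.get()) {
                        ok++;
                    }
                } catch (ExecutionException e) {
                    // A failed connection or I/O error counts as an unsuccessful client.
                }
            }
            return ok;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A call such as `ConcurrentClients.run(factory, "localhost", port, 50)` would then put 50 simultaneous handshakes on the server, which is the kind of load described above.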

Test app that demonstrates the problem: https://github.com/marpidone-mim/wildfly-ssl-crash

The test often causes a JVM crash in the server very quickly, and the stack always looks like this:

Current thread (0x00000170a797f800):  JavaThread "pool-1-thread-9" [_thread_in_native, id=21052, stack(0x0000001a33100000,0x0000001a33200000)]

Stack: [0x0000001a33100000,0x0000001a33200000],  sp=0x0000001a331fe5d0,  free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libcrypto-3-x64.dll+0x2cc85f]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.wildfly.openssl.SSLImpl.setSSLVerify0(JII)V+0
j  org.wildfly.openssl.SSLImpl.setSSLVerify(JII)V+4
j  org.wildfly.openssl.OpenSSLEngine.lambda$setSSLParameters$4(Ljavax/net/ssl/SSLParameters;)V+65
j  org.wildfly.openssl.OpenSSLEngine$$Lambda$51.run()V+8
j  org.wildfly.openssl.OpenSSLEngine.initSsl()V+118
j  org.wildfly.openssl.OpenSSLEngine.unwrap(Ljava/nio/ByteBuffer;[Ljava/nio/ByteBuffer;II)Ljavax/net/ssl/SSLEngineResult;+167
j  javax.net.ssl.SSLEngine.unwrap(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)Ljavax/net/ssl/SSLEngineResult;+12 java.base@11.0.15
j  org.wildfly.openssl.OpenSSLSocket.runHandshake()V+354
j  org.wildfly.openssl.OpenSSLSocket.read([BII)I+73
j  org.wildfly.openssl.OpenSSLSocket.read([B)I+5
j  org.wildfly.openssl.OpenSSLSocket.read()I+6
j  org.wildfly.openssl.OpenSSLInputStream.read()I+4
j  ssl.test.Server.lambda$handleNextConnection$1(Ljava/net/Socket;)V+11
j  ssl.test.Server$$Lambda$52.run()V+4
j  java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4 java.base@11.0.15
j  java.util.concurrent.FutureTask.run()V+39 java.base@11.0.15
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@11.0.15
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@11.0.15
j  java.lang.Thread.run()V+11 java.base@11.0.15
v  ~StubRoutines::call_stub

siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), reading address 0x0000000000000008

Workaround that seems to prevent the crash

We discovered with some investigation that adding a call to SSLSocket.getHandshakeSession() in the server-side code immediately after accepting the socket connection prevents the crash from happening. Check Server.handleNextConnection for a spot where I added this line and commented it out. Uncommenting it seems to prevent the crash.
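As a rough sketch of what that workaround looks like in code: the helper below is hypothetical (the class name `HandshakeWorkaround` and the method `primeHandshakeSession` are not from the test app) and uses only the standard `javax.net.ssl.SSLSocket.getHandshakeSession()` call. It assumes the socket returned by `accept()` is an `SSLSocket`. Why this call avoids the crash is not established here.

```java
import javax.net.ssl.SSLSocket;
import java.net.Socket;

// Hypothetical helper illustrating the workaround; not taken from the test app.
public class HandshakeWorkaround {

    /**
     * Calls getHandshakeSession() on the socket if it is an SSLSocket.
     * Intended to be invoked right after accept(), before the socket is
     * handed to a worker thread. Returns true if the call was made.
     */
    static boolean primeHandshakeSession(Socket socket) {
        if (socket instanceof SSLSocket) {
            // The return value is ignored; only the call itself matters here.
            ((SSLSocket) socket).getHandshakeSession();
            return true;
        }
        return false;
    }
}
```

On the server side this would be used along the lines of `Socket s = serverSocket.accept(); HandshakeWorkaround.primeHandshakeSession(s);` before submitting the connection to the thread pool.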

marpidone-mim commented 2 years ago

Oh also, I forgot to mention: the issue seems to happen only on Windows. We can't reproduce it on macOS or Linux.