netty / netty

Netty project - an event-driven asynchronous network application framework
http://netty.io
Apache License 2.0

Client certificate validation via TrustManager interfaces not offloaded from event loop with OpenSSLEngine #13217

Open uncbailey1 opened 1 year ago

uncbailey1 commented 1 year ago

We're using Vert.x (vertx.io), which as of v4.3.8 (see eclipse-vertx/vertx#4566) can use the enhancement from netty/netty#8847 to offload blocking SSL operations from the event loop. We do client certificate validation with OCSP revocation checking, so validation can be long-running and we need to keep it off the event loop. After picking up the Vert.x changes, we tested with the JdkSslEngine and saw the correct behavior: all certificate validation operations were performed off the event loop, on a worker thread. However, when we moved to OpenSslEngine (which is a requirement for us going forward), certificate validation was no longer offloaded. Note that we do see some SSL operations being offloaded, but not the certificate validation.

As I understand it, the SSL engine gives Netty a hint about when operations can be offloaded, so it's not clear whether this is a Netty bug or expected behavior with the OpenSslEngine.
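For the JDK engine, the hint in question is the NEED_TASK handshake status together with SSLEngine.getDelegatedTask(). The following is a minimal sketch of that contract using only JDK classes. The class and method names are hypothetical, and this is not Netty's actual SslHandler code; it only illustrates how a caller can hand engine-provided tasks to an executor of its choice.

```java
import java.util.concurrent.Executor;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical helper for illustration only; not part of Netty or Vert.x.
public final class DelegatedTaskSketch {

    /**
     * Drains every delegated task the engine currently exposes and hands each
     * one to the given executor. Returns the number of tasks submitted.
     */
    public static int offloadDelegatedTasks(SSLEngine engine, Executor executor) {
        int submitted = 0;
        Runnable task;
        // getDelegatedTask() returns null once no delegated work is pending.
        while ((task = engine.getDelegatedTask()) != null) {
            executor.execute(task);
            submitted++;
        }
        return submitted;
    }

    public static void main(String[] args) throws Exception {
        // A fresh engine has not started a handshake, so it has no pending tasks.
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        int submitted = offloadDelegatedTasks(engine, Runnable::run);
        System.out.println("tasks submitted: " + submitted);
    }
}
```

In a real handshake loop the caller would also have to wait for the submitted tasks to finish before calling wrap or unwrap again. Work that an engine never exposes through getDelegatedTask() cannot be offloaded this way, which matches the symptom described in this issue.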

Expected behavior

When using OpenSslEngine and specifying a delegatedTaskExecutor to create an SslHandler, we expected to see TrustManager calls that validate a client certificate (specifically X509ExtendedTrustManager::checkClientTrusted) being executed by the delegated executor and not on the event loop.

Actual behavior

This is a stack trace of a case where the checkClientTrusted routine is executed on an event loop thread. Note that this uses OpenSslEngine with an internal worker pool specified as the executor. When we run the same test with the JdkSslEngine, we see the proper behavior (see the second stack trace below).

Feb 13 12:54:14 -B control-path[13519]: [DEBUG] [] [TrustOptions|vert.x-eventloop-thread-0] Start Blocking Client Cert Validation
Feb 13 12:54:14 -B control-path[13519]: java.lang.Throwable
Feb 13 12:54:14 -B control-path[13519]: at CycTrustOptions$CycTrustManager.checkClientTrusted(TrustOptions.java:334)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.ReferenceCountedOpenSslServerContext$ExtendedTrustManagerVerifyCallback.verify(ReferenceCountedOpenSslServerContext.java:276)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.ReferenceCountedOpenSslContext$AbstractCertificateVerifier.verify(ReferenceCountedOpenSslContext.java:779)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.internal.tcnative.SSL.readFromSSL(Native Method)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.readPlaintextData(ReferenceCountedOpenSslEngine.java:657)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1267)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1404)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1447)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:222)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1343)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1247)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1287)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
Feb 13 12:54:14 -B control-path[13519]: at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
Feb 13 12:54:15 -B control-path[13519]: at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
Feb 13 12:54:15 -B control-path[13519]: at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
Feb 13 12:54:15 -B control-path[13519]: at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
Feb 13 12:54:15 -B control-path[13519]: at java.base/java.lang.Thread.run(Thread.java:829)

This is the behavior we see when utilizing the JdkSslEngine. It properly offloads the certificate processing to a worker thread.

Feb 13 13:10:05 -B control-path[1707]: [DEBUG] [] [TrustOptions|vert.x-internal-blocking-3] Start Blocking Client Cert Validation
Feb 13 13:10:05 -B control-path[1707]: java.lang.Throwable
Feb 13 13:10:05 -B control-path[1707]: at CycTrustOptions$CycTrustManager.checkClientTrusted(CycTrustOptions.java:334)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkClientCerts(CertificateMessage.java:682)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:411)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:375)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1074)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1061)
Feb 13 13:10:05 -B control-path[1707]: at java.base/java.security.AccessController.doPrivileged(Native Method)
Feb 13 13:10:05 -B control-path[1707]: at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1008)
Feb 13 13:10:05 -B control-path[1707]: at io.netty.handler.ssl.SslHandler$SslTasksRunner.run(SslHandler.java:1787)
Feb 13 13:10:05 -B control-path[1707]: at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
Feb 13 13:10:05 -B control-path[1707]: at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Feb 13 13:10:05 -B control-path[1707]: at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
Feb 13 13:10:05 -B control-path[1707]: at java.base/java.lang.Thread.run(Thread.java:829)
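Traces like the two above can be produced by logging from inside the trust manager. The following is a sketch of one way to do that, assuming a wrapper around an existing X509ExtendedTrustManager. The class name ThreadRecordingTrustManager and its lastThreadName() accessor are hypothetical and are not the reporter's CycTrustManager; the wrapper leaves every trust decision to the delegate.

```java
import java.net.Socket;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.X509ExtendedTrustManager;

// Hypothetical diagnostic wrapper: records the calling thread, prints a stack
// trace, then defers the actual trust decision to the wrapped trust manager.
public final class ThreadRecordingTrustManager extends X509ExtendedTrustManager {
    private final X509ExtendedTrustManager delegate;
    private volatile String lastThreadName;

    public ThreadRecordingTrustManager(X509ExtendedTrustManager delegate) {
        this.delegate = delegate;
    }

    /** Name of the thread that most recently entered a check method, or null. */
    public String lastThreadName() {
        return lastThreadName;
    }

    private void record(String method) {
        lastThreadName = Thread.currentThread().getName();
        // Printing a Throwable shows the full call path leading to this method.
        new Throwable(method + " on " + lastThreadName).printStackTrace();
    }

    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType, SSLEngine engine)
            throws CertificateException {
        record("checkClientTrusted");
        delegate.checkClientTrusted(chain, authType, engine);
    }

    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType, Socket socket)
            throws CertificateException {
        record("checkClientTrusted");
        delegate.checkClientTrusted(chain, authType, socket);
    }

    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType)
            throws CertificateException {
        record("checkClientTrusted");
        delegate.checkClientTrusted(chain, authType);
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType, SSLEngine engine)
            throws CertificateException {
        record("checkServerTrusted");
        delegate.checkServerTrusted(chain, authType, engine);
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType, Socket socket)
            throws CertificateException {
        record("checkServerTrusted");
        delegate.checkServerTrusted(chain, authType, socket);
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType)
            throws CertificateException {
        record("checkServerTrusted");
        delegate.checkServerTrusted(chain, authType);
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return delegate.getAcceptedIssuers();
    }
}
```

If the recorded name is an event loop thread (for example vert.x-eventloop-thread-0, as in the first trace) rather than a worker thread (vert.x-internal-blocking-3 in the second), the validation was not offloaded.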

Steps to reproduce

Create an SslHandler using OpenSslEngine and specify a delegatedTaskExecutor, then perform client certificate validation.

Minimal yet complete reproducer code (or URL to code)

I cannot provide a minimal reproducer as we have only investigated this with our proprietary code. I'm looking into trying to create something minimal, but I don't have anything as of now.

The following vertx test does validate the offload of certain operations. It is currently written to leverage the JdkSslEngine, but it is easy enough to modify to use OpenSslEngine. If you change the engine, we do see that most operations are offloaded; however, this test doesn't do client certificate validation.

https://github.com/eclipse-vertx/vert.x/blob/462e3ea5203dd710e52a84bf8fba5f5a5dc5eb46/src/test/java/io/vertx/core/http/HttpTLSTest.java#L1655

Netty version

2.0.52.Final-linux

JVM version (e.g. java -version)

openjdk version "11.0.16" 2022-07-19
OpenJDK Runtime Environment (build 11.0.16+8-suse-150000.3.83.1-x8664)
OpenJDK 64-Bit Server VM (build 11.0.16+8-suse-150000.3.83.1-x8664, mixed mode)

OS version (e.g. uname -a)

cat /etc/os-release
NAME="SLES"
VERSION="15-SP2"
VERSION_ID="15.2"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP2"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp2"

uncbailey1 commented 1 year ago

Tagging @vietj for visibility.

normanmaurer commented 1 year ago

Will have a look.

uncbailey1 commented 1 year ago

We did some more investigation and wanted to pass along what we found. It looks like the offload of the trust manager interfaces was only done for BoringSSL, but we cannot take that path because we need a FIPS-certified implementation. It's not clear whether OpenSSL cannot support this or it just hasn't been done in tcnative. A colleague of mine did find a reference to OpenSSL supporting an async mode, which might be helpful here.

This PR adds cert validation offload for BoringSSL only: https://github.com/netty/netty-tcnative/pull/435

normanmaurer commented 1 year ago

@uncbailey1 this is because async cert verify was only added in OpenSSL 3, and only for the client side. We don't support this, so BoringSSL is your only bet atm:

https://www.openssl.org/docs/man3.0/man3/SSL_set_retry_verify.html