elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
1.41k stars 24.87k forks source link

Make SSL/TLS handshake errors actionable. #29817

Open elasticmachine opened 7 years ago

elasticmachine commented 7 years ago

Original comment by @jordansissel:

SSL errors in all languages are generally unhelpful for users/operators, in my experience.

Elasticsearch, for example, when there is an SSL exception, logs a 50-100-line stack trace. Embedded in this noise is some signal "PKIX path builder problem" that, quite honestly, no operator will understand.

We can do better.

For the past year, in my spare cycles, I have been working on solving this problem for Logstash, and I believe Elasticsearch should solve this as well. We can share the code.

As an example, when a client connects and does not trust the server:

io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:123)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:571)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:512)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:426)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:398)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1478)
    at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535)
    at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:813)
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1094)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:966)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:900)

    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
    ... 16 more
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:304)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026)
    at sun.security.ssl.Handshaker$1.run(Handshaker.java:966)
    at sun.security.ssl.Handshaker$1.run(Handshaker.java:963)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1416)
    at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1120)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1005)
    ... 18 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:397)
    at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:302)
    at sun.security.validator.Validator.validate(Validator.java:260)
    at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
    at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:281)
    at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1501)
    ... 26 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
    at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
    at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
    at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:392)
    ... 32 more
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: The remote server provided an unknown/untrusted certificate chain, so the connection terminated by the client.
The local client has 105 certificates in the trust store.
Here is a network log before the failure:
OUTPUT: Handshake message: ClientHello[29 cipher suites; suites: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_DSS_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDH_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_DSS_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_DSS_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, TLS_RSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA, TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA, TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA, TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
INPUT: Handshake message: ServerHello[TLS 1.2, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, keyex ECDHE_RSA]
Handshake message: CertificateMessage[1 certificates]
0: CN=local
  subjectAlt: DNS:localhost
  subjectAlt: IP:127.0.0.1
Handshake message: ServerKeyExchange
Handshake message: ServerHelloDone

For my prototype, there is much work to be done to improve the failure reporting, but this is significantly better than the default SSLEngine/SSLSocket exception stack traces.

My code is here: https://github.com/elastic/tealess/

My intent is to deploy this with Logstash, but I believe strongly that Elasticsearch must use this or a similar strategy to improve SSL/TLS user experience problems.

elasticmachine commented 7 years ago

Original comment by @jaymode:

@tvernum @albertzaharovits @joshbressers I definitely think that this is something we should have before we push TLS on the masses with the upcoming move to enabling it for basic licenses.

elasticmachine commented 7 years ago

Original comment by @joshbressers:

This is fantastic, we should do this everywhere.