haskell-tls / hs-tls

TLS/SSL implementation in haskell
Other
403 stars 92 forks source link

Expose peer cert in `Information` data structure #275

Open infinity0 opened 6 years ago

infinity0 commented 6 years ago

In order to support TLS with a custom identity protocol instead of X509, it is nice to expose the peer's certificate in the Information data structure. This allows one to perform some custom verification of the certificates after the connection is established.

onServerCertificate and onClientCertificate could be used for this purpose but it's unclear whether it's safe to write to the stream whilst the TLS handshake is occurring, which is presumably when these functions would be called.

Suppose peer A and B know each other's non-TLS public keys K_A, K_B in some other protocol, and want to set up a TLS connection without using X509. They are still using TLS public keys T_A, T_B, and now want to authenticate their TLS connection by verifying that (the entity who knows the private key of K_A is willing to be represented by T_A) and likewise for B.

The Information data structure currently only exposes clientRandom, serverRandom, and masterSecret. The former two can be coerced to identical values on both sides of a MITM connection so authenticating these provides no security; the masterSecret cannot be coerced to the same value but securely verifying that K_A and K_B both know the same value is equivalent to running a PAKE over it, which can get quite complex.

It is much easier to just have K_A sign T_A and likewise for B, so we want to extract the certificates out of the TLS connection. Currently this is quite fiddly:


tlsClientParams :: etc -> Credential -> TVar (Maybe Certificate) -> ClientParams
tlsClientParams etc cred tvar =
    defaultParamsClient etc
    { clientHooks = hooks
    }
    where
    hooks = def {
        onCertificateRequest = \_ -> return $ Just cred
      , onServerCertificate = \s v i c@(CertificateChain cert) -> do
          let cert' = getCertificate $ head cert
          atomically $ writeTVar tvar $ Just cert'
          return []
      }

tlsServerParams :: Credential -> TVar (Maybe Certificate) -> ServerParams
tlsServerParams cred tvar =
    def {
        serverWantClientCert = True
      , serverCACertificates = []
      , serverHooks          = hooks
      , serverShared         = def { sharedCredentials = Credentials [cred] }
      }
    where
    hooks = def {
        onClientCertificate = \c@(CertificateChain cert) -> do
          let cert' = getCertificate $ head cert
          atomically $ writeTVar tvar $ Just cert'
          return CertificateUsageAccept
      }

hence the reason for this issue request.

This might be related to #133 but I'm not sure.

vdukhovni commented 6 years ago

It is not enough to expose just the leaf certificate. If the certificates are going to be exposed in the Info structure, then it would have to be the complete chain, because the other chain certificates may also be needed for peer authentication. However, one must keep in mind that with session resumption no certificates are exchanged at all, and that in this case the certificates in question need to stored and retrieved from the session cache (assuming they are also needed for peer identity lookups on resumption).

And yet, the certificate chain is bulky, and it makes sense to not store it in the connection context if, as is common, it won't be needed. Therefore indeed capture of the certificate chain is best done via the onServerCertificate and client equivalent calls, exactly as you outlined. There is no issue with updating a suitable IORef in this context. I don't think this requires a TVar or MVar, as there should be a separate such object per connection, and the caller should not be looking at the data until the handshake completes or fails. So locking seems unnecessary.

My DANE survey code uses much the same approach:

TLS.onServerCertificate  = genChainInfo chainRef
...
genChainInfo :: IORef ChainInfo
             -> CertificateStore
             -> ValidationCache
             -> ServiceID
             -> CertificateChain
             -> IO [FailedReason]
genChainInfo chainRef _ _ _ chain = do
    now <- gettime
    writeIORef chainRef $ buildChain now chain
    return [] -- Always succeed

buildChain :: ... -- Gather up the data I want to save

The actual validation (based on TLSA records) happens after the handshake, but of course one could also do "just-in-time" validation, and then return a non-empty failure reason list as appropriate.

@infinity0 ... unclear whether it's safe to write to the stream whilst the TLS handshake is occurring ...

Not sure what you mean by "safe to write to the stream", your callback should just collect the data you wanted to have returned in the info result. That's not "writing to the stream".

infinity0 commented 6 years ago

Not sure what you mean by "safe to write to the stream", your callback should just collect the data you wanted to have returned in the info result.

My context and motivation for opening this issue is to use a custom identity protocol that is not X509. In this context, I don't care about the chain, in fact there is no chain and the certificates are all self-signed. Instead we have another non-X509 protocol to verify the certificates. This protocol needs to perform some communication with the peer. In terms of code engineering, it would be cleanest to perform this communication inside onServerCertificate/onClientCertificate but it's not clear we can write to the stream safely inside these functions. That's what my comment means.

My workaround is to run the protocol after the TLS session is "successfully" set up, which works fine but it does mean onServerCertificate/onClientCertificate have to return OK even though the certificates have not actually been verified properly at this stage.

vdukhovni commented 6 years ago

@infinity0 Instead we have another non-X509 protocol to verify the certificates. This protocol needs to perform some communication with the peer. In terms of code engineering, it would be cleanest to perform this communication inside onServerCertificate/onClientCertificate but it's not clear we can write to the stream safely inside these functions.

You MUST NOT attempt to write to the stream inside these functions, and in fact there is no facility for doing so. Since you don't yet have the established TLS connection handle, and the underlying socket is in the middle of a TLS handshake and any data you read or write from the socket at this time is guaranteed to break the TLS handshake. In order to do the communication "just in time", you'd need to carry TLS records inside your own framing protocol, so that you can insert your own control messages between TLS messages, and then the TLS library would have to support being given TLS packets to read and and handing off TLS packets to write to some application provided transport code that would do the encapsulation to separate TLS and non-TLS messages.

In some cases you could carry additional custom data in TLS extensions, but these don't give you a general purpose communications channel, so it may not be possible to fit your sub-protocol into TLS extension messages (and they'd likely need different code for TLS 1.3 vs 1.2).

Bottom line, complete the handshake and run your protocol after that, exactly as you're doing.

vdukhovni commented 6 years ago

I think this issue can be closed. Do you agree?

infinity0 commented 6 years ago

No, I disagree. The writing-inside-those-functions detail was an aside, and not the main point of the report.

My code snippet is 20+ lines long and took a couple hours to write. If the peer certificates were exposed in Information the snippet would only be 2 lines long and would have taken about 5 minutes to write.

vdukhovni commented 6 years ago

Keeping the certificate chain around for Information to return it substantially increases the memory footprint of the TLS connection context. I'd address this with better documentation, showing examples of how this is done if you need it.

infinity0 commented 6 years ago

The chain doesn't need to be kept around, only the actual ("leaf") certificate used for the connection. The current model is that the chain is verified (or custom-overrode to not be verified) before the connection is set up anyway, so anyone that is looking at Information for the peer certificate wouldn't care about the chain, the verification (or purposeful non-verification) has already happened.

Information already contains clientRandom, serverRandom and masterSecret, so I don't see why it can't contain the actual ("leaf") certificate either.

infinity0 commented 6 years ago

Alternatively, I would also be happy with just the pubkey of the peer certificate, the rest of it in fact can be ditched and is not relevant for my use-case. It might be useful for a different but similar use-case to mine though.

vdukhovni commented 6 years ago

@infinity0 The chain doesn't need to be kept around, only the actual ("leaf") certificate used for the connection.

That's just your use-case, but it does not make for a clean interface. If information about the peer's certificates it to be returned, the interface needs to be fully general and be able to return the whole chain. No half-baked point solutions.

As I said, a fairly elementary closure around an IORef gets you the information you want, and documenting this in case it is not obvious would be my approach.

infinity0 commented 6 years ago

How is a closure that sets an IORef "more clean" than a pure function to grab a pubkey off a dumb data struct?

TLS is at its heart a protocol for establishing a secure stream between 2 public keys. The whole X509 "certificate chain" stuff is an entirely separate concern from that, at a different protocol layer, to link the public keys to whatever "real world" identities you care about. Exposing peer public key is not a "half-baked point solution", it is the cleanest possible interface you can expose to cleanly separate the "secure stream protocol" from the "identity protocol". TLS does not have to be used with X509, and the interface I proposed is the cleanest way of doing that.

vdukhovni commented 6 years ago

That's not what I said, what's more clean is not returning ad-hoc fragments of the peer's chain that satisfy just one use-case.

The issue here is that in order to retain the chain information it has to be saved for the life of the TLS connection, which on a busy system with lots of connections carries a cost.

A closure around an IORef lets you capture the data if you want it. Those who don't want it, don't pay the price. It seems like a fair trade-off to me.

It is trivial to write:

saveKey :: IORef (Maybe PublicKey)
        -> CertificateStore
        -> ValidationCache
        -> ServiceID
        -> CertificateChain
        -> IO [FailedReason]
saveKey keyref _ _ _ (CertificateChain certs) = do
  case certs of
    []     -> writeIORef keyref Nothing -- anonDH or anonECDH?
    cert:_ -> writeIORef keyref $ Just $ certPubKey cert
  return [] -- Always succeed
...
   keyref = newIORef Nothing
   onServerCertificate = saveKey keyref
   -- do handshake, then:
   key <- readIORef keyref

[ Note, the content of keyref will be Nothing with either session resumption (when no certificates are exchanged, and the callback is not made) or when (if enabled at both ends) an anonymous DH/ECDH ciphersuite is selected. (Anonymous ciphersuites are gone from TLS 1.3). ]

infinity0 commented 6 years ago

I understand my use case is not a common use-case, but I am arguing that it is a natural one because it is at the point-of-separation between TLS and X509:

TLS as a cryptographic protocol only cares about the peer pubkeys, that I am suggesting should be exposed. It does not need to know anything above the peer certificates in order to operate (certificate authorities, certificate expiry dates, etc)

Likewise, X509 as an identity verification protocol only cares about the peer pubkeys. In order for it to do its work, it does not need to know anything relating to TLS (cipher suites, handshakes, heartbeats, etc).

Therefore it would be beneficial to expose what I am suggesting in the API because it is one of the most general and cleanest ways you can make the protocol composable with other protocols.