Open MASHtm opened 8 years ago
These connections are established by backend_connect()
, and their tls setup is handled by backend_starttls()
, both in imap/backend.c
.
I don't (yet) know anything about tls session caching, but this seems to be the starting point.
These functions don't look like they've changed appreciably since master diverged from 2.5, so this probably affects master too.
It looks like the client side tls session cache completely breaks with OpenSSL 1.1.1. Frontend processes (imapd, lmtpd, ....) die with SIGSEGV if a session is "stored" and reapplied with SSL_set_session() in tls.c/tls_start_clienttls().
IMO it is not valid to keep a pointer to a structure returned by SSL_get_session(). The man page says that only SSL_get1_session() increases the use counter and prevents invalidation of the struct by external means like aborted connections.
But tls_start_clienttls() does store the pointer from SSL_get_session() and reuses it for SSL_set_session().
With OpenSSL 1.1.1 this leads to the following backtrace:
#0 0x00007fca8b27fcea in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1 0x00007fca899cc549 in CRYPTO_THREAD_write_lock () from /usr/local/openssl/lib64/libcrypto.so.1.1.1
#2 0x00007fca899cc58b in CRYPTO_atomic_add () from /usr/local/openssl/lib64/libcrypto.so.1.1.1
#3 0x00007fca89cef081 in SSL_SESSION_up_ref () from /usr/local/openssl/lib64/libssl.so.1.1.1
#4 0x00007fca89cf00de in SSL_set_session () from /usr/local/openssl/lib64/libssl.so.1.1.1
#5 0x00007fca8a72c7d5 in tls_start_clienttls (readfd=20, writefd=20, layerbits=0x556d166e56c8, authid=0x7ffc3f2f9268, ret=0x556d166e56d0,
sess=0x556d166e56d8) at imap/tls.c:1576
#6 0x00007fca8a6efc57 in backend_starttls (s=0x556d166e4dc0, tls_cmd=<value optimized out>, c_cert_file=0x0, c_key_file=0x0)
at imap/backend.c:524
#7 0x00007fca8a6f0c71 in backend_authenticate (ret_backend=0x556d166e4dc0, server=0x556d16690d70 "xxxxx.univie.ac.at", prot=0x556d14e475c0,
userid=0x556d166e21f0 "xxxxxxxx", cb=0x556d166e38f0, auth_status=0x0, logfd=-1) at imap/backend.c:725
#8 backend_login (ret_backend=0x556d166e4dc0, server=0x556d16690d70 "xxxxx.univie.ac.at", prot=0x556d14e475c0,
userid=0x556d166e21f0 "xxxxxxxx", cb=0x556d166e38f0, auth_status=0x0, logfd=-1) at imap/backend.c:781
#9 backend_connect (ret_backend=0x556d166e4dc0, server=0x556d16690d70 "xxxxx.univie.ac.at", prot=0x556d14e475c0,
userid=0x556d166e21f0 "xxxxxxxx", cb=0x556d166e38f0, auth_status=0x0, logfd=-1) at imap/backend.c:1030
#10 0x0000556d14c3e6d8 in proxy_findserver (server=0x556d16690d70 "xxxxx.univie.ac.at", prot=0x556d14e475c0, userid=0x556d166e21f0 "xxxxxxxxx",
cache=0x556d14e488d0, current=0x556d14e488c8, inbox=0x556d14e488c0, clientin=0x556d166e1690) at imap/proxy.c:173
#11 0x0000556d14c31231 in cmd_status (tag=0x556d166b9530 "5", name=0x556d166e5a70 "INBOX.Junk") at imap/imapd.c:7804
#12 0x0000556d14c3b8ec in cmdloop () at imap/imapd.c:2027
#13 0x0000556d14c3de21 in service_main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
at imap/imapd.c:984
#14 0x0000556d14c3f20a in main (argc=<value optimized out>, argv=<value optimized out>, envp=0x7ffc3f2ffe60) at master/service.c:606
I currently disabled session cache completely with:
--- cyrus-imapd-2.5.12/imap/tls.c.orig 2018-10-22 16:50:01.414997617 +0200
+++ cyrus-imapd-2.5.12/imap/tls.c 2018-10-22 16:50:46.677141239 +0200
@@ -1572,8 +1572,8 @@
if (var_proxy_tls_loglevel >= 3)
do_dump = 1;
- if (sess && *sess) /* Reuse a session if we have one */
- SSL_set_session(tls_conn, *sess);
+// if (sess && *sess) /* Reuse a session if we have one */
+// SSL_set_session(tls_conn, *sess);
if ((sts = SSL_connect(tls_conn)) <= 0) {
SSL_SESSION *session = SSL_get_session(tls_conn);
This prevents the SIGSEGV and brings back stable operation.
I did not try to use SSL_get1_session() yet.
Oh, good find!
TLS 1.3 (introduced in OpenSSL 1.1.1) removed support for the type of session reuse that we have implemented in Cyrus, and introduced a new type. So at the moment, Cyrus with TLS 1.3 won't ever reuse sessions.
As to how it works, I don't know much about it yet. Looking around, the curl project implemented it like this: https://github.com/curl/curl/commit/549310e907e82e44c59548351d4c6ac4aaada114. But I think they're a client-only project, so it might be quite different for us.
There should be a fix specifically for the segfault shortly, from #3191. It won't enable TLS 1.3-style session use, but it will stop it crashing.
I'm gonna retitle this issue to make it clear what it's about, now that we understand.
Updated labels to reflect that remaining work is a future enhancement, not a bug fix
looking at my logs I recognized that all my frontend->backend connections end up with (xxx/xxx bits new client).
tls_session_cache is active on all backends with same settings as on frontends, were MUAs successfully reuse sessions. So it seems it's the SSL client code of the frontends which fails to submit correct session state.
Looking back in my logs it is true for 2.4 frontends as well.