Open andrejpodzimek opened 1 year ago
Looking at `git diff courier/1.1.10/20220606180754 courier/1.2.0/20221202210553`, the `idn2` migration stands out, because this is somehow domain-name-related. But I can’t spot anything “suspicious” at first glance, and neither my domain nor my certificate contains any non-ASCII characters or anything else highly `idn2`-relevant.
BTW, `idn2` is “stricter” than `idn`, it seems, but I don’t have dashes or other problematic characters anywhere (neither in any of the domain names, nor in the domain-specific certificate file / symlink names).
Built and tested Courier `1.1.10` and also `1.1.11`. Still the same problem. So it’s not the `idn2`, I suppose. It may be OpenSSL-related.
I’d like to rebuild Courier with OpenSSL `1.1.1s` to figure out whether the recent upgrade to `3.0.7` (in Arch) could be to blame, but can’t get that to compile. Environment variables added in `PKGBUILD`:
LDFLAGS+=",-L/usr/lib/courier-authlib,-L/usr/lib/openssl-1.1 -lcourierauth"
CPPFLAGS="-I/usr/include/openssl-1.1 ${CPPFLAGS}"
CFLAGS="-I/usr/include/openssl-1.1 ${CFLAGS}"
Error:
libcouriertls.c: In function 'load_dh_params':
libcouriertls.c:420:17: error: unknown type name 'OSSL_LIB_CTX'
420 | OSSL_LIB_CTX *libctx=OSSL_LIB_CTX_get0_global_default();
| ^~~~~~~~~~~~
libtool: link: ranlib .libs/libspipe.a
libcouriertls.c:420:38: warning: implicit declaration of function 'OSSL_LIB_CTX_get0_global_default' [-Wimplicit-function-declaration]
420 | OSSL_LIB_CTX *libctx=OSSL_LIB_CTX_get0_global_default();
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libcouriertls.c:420:38: warning: initialization of 'int *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
mv -f .deps/tlscachetest.Tpo .deps/tlscachetest.Po
libtool: link: ( cd ".libs" && rm -f "libspipe.la" && ln -s "../libspipe.la" "libspipe.la" )
/bin/sh ./libtool --tag=CC --mode=link gcc -I./.. -I.. -I./../.. -I../.. -Wall -I/usr/include/openssl-1.1 -march=native -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -static -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now,-L/usr/lib/courier-authlib,-L/usr/lib/openssl-1.1 -lcourierauth -o tlscachetest tlscachetest.o ../numlib/libnumlib.la ../liblock/liblock.la
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I/usr/include/openssl-1.1 -I/usr/include/p11-kit-1 -I./.. -I.. -I./../.. -I../.. -Wall -I/usr/include/openssl-1.1 -march=native -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -MT tlsclient.lo -MD -MP -MF .deps/tlsclient.Tpo -c tlsclient.c -o tlsclient.o >/dev/null 2>&1
libcouriertls.c:422:32: warning: implicit declaration of function 'PEM_read_bio_Parameters_ex'; did you mean 'PEM_read_bio_Parameters'? [-Wimplicit-function-declaration]
422 | EVP_PKEY *pkey=PEM_read_bio_Parameters_ex(bio, NULL, libctx,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
| PEM_read_bio_Parameters
libcouriertls.c:422:32: warning: initialization of 'EVP_PKEY *' {aka 'struct evp_pkey_st *'} from 'int' makes pointer from integer without a cast [-Wint-conversion]
libcouriertls.c:427:29: warning: implicit declaration of function 'EVP_PKEY_is_a'; did you mean 'EVP_PKEY_sign'? [-Wimplicit-function-declaration]
427 | if (EVP_PKEY_is_a(pkey, "DH"))
| ^~~~~~~~~~~~~
| EVP_PKEY_sign
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I/usr/include/openssl-1.1 -I/usr/include/p11-kit-1 -I./.. -I.. -I./../.. -I../.. -Wall -I/usr/include/openssl-1.1 -march=native -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -MT tlscache.lo -MD -MP -MF .deps/tlscache.Tpo -c tlscache.c -o tlscache.o >/dev/null 2>&1
mv -f .deps/starttls.Tpo .deps/starttls.Po
libcouriertls.c:429:37: warning: implicit declaration of function 'SSL_CTX_set0_tmp_dh_pkey'; did you mean 'SSL_CTX_set_tmp_dh'? [-Wimplicit-function-declaration]
429 | if (SSL_CTX_set0_tmp_dh_pkey(ctx, pkey))
| ^~~~~~~~~~~~~~~~~~~~~~~~
| SSL_CTX_set_tmp_dh
In file included from libcouriertls.h:28,
from libcouriertls.c:9:
libcouriertls.c: In function 'tls_create_int':
/usr/include/openssl-1.1/openssl/ssl.h:1496:52: warning: statement with no effect [-Wunused-value]
1496 | # define SSL_CTX_set_ecdh_auto(dummy, onoff) ((onoff) != 0)
| ^
libcouriertls.c:1082:9: note: in expansion of macro 'SSL_CTX_set_ecdh_auto'
1082 | SSL_CTX_set_ecdh_auto(ctx, 1);
| ^~~~~~~~~~~~~~~~~~~~~
Side note: Tried `--with-gnutls` to check how that would work, but it seems to be just broken into pieces regardless of domain names or other things:
$ openssl s_client -starttls imap -crlf -connect my.server.domain:143
CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 631 bytes and written 345 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
$ openssl s_client -starttls smtp -crlf -connect my.server.domain:25
CONNECTED(00000003)
00D246DAF67F0000:error:0A000126:SSL routines:ssl3_read_n:unexpected eof while reading:../ssl/record/rec_layer_s3.c:320:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 206 bytes and written 359 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
The configure script is still picking up OpenSSL 3.0 headers and configuring the build for OpenSSL 3, but the code ends up being compiled against the OpenSSL 1.1 header files. This is the reason for the compilation error.
I am too unfamiliar with Arch's build framework to offer any pointers.
Next things I’ve tried, to no avail:
Capture the output from this during a successful `STARTTLS` (with a nonexistent domain) and during a failed `STARTTLS` (with one of the “certified” domains):
pids=($(pidof couriertcpd)); strace -f "${pids[@]/#/-p}"
No surprises. Extra accesses to `/etc/courier/imapd.pem.my.server.domain` occur in the latter case. A failure in the latter case (in `strace`) is not obvious, but there is some log write like this one (also the only “`failed`” in that output):
[pid 4175639] write(1, ". NO STARTTLS failed: ip=[2620:0"..., 192 <unfinished ...>
Boldly assuming it’s still the same descriptor, writes that preceded that ↑ were:
[pid 4175639] write(1, "* OK [CAPABILITY IMAP4rev1 UIDPL"..., 339) = 339
[pid 4175639] write(1, "* CAPABILITY IMAP4rev1 UIDPLUS C"..., 255) = 255
[pid 4175639] write(1, ". OK Begin SSL/TLS negotiation n"..., 37) = 37
Now a “successful” case (with a bogus domain) still has these↑ lines, but not the ultimate `. NO STARTTLS failed:` line.
`TLS_PROTOCOL=TLSv1.2+` instead of `TLS_PROTOCOL=TLSv1.2++`: Nothing changed. So the ban on client-initiated re-negotiation is not to blame either.
Splitting key and certificate files (just in case something is wrong with the file accesses): I used to have a `TLS_CERTFILE` with the private key (also) in it. Now I have `TLS_PRIVATE_KEYFILE` and a certificate-only `TLS_CERTFILE`. But again, nothing changed.
In any case, direct TLS without `STARTTLS` (other ports, but the same Courier instance + config) works perfectly fine. This is weird. No clue what I’m missing.
Ad Courier and OpenSSL in Arch: This is the `./configure` command. Is there a `./configure` option that could change the SSL include path? OpenSSL `1.1.1s` is installed in `/usr/{include,lib}/openssl-1.1`. Whatever I can hack in the sources is easy to test with `makepkg`; I’m just clueless as to what to hack.
`./configure` reads the `CFLAGS`, `CXXFLAGS`, et al. environment variables. They can also be passed in explicitly, as additional parameters to configure: `CFLAGS=... CXXFLAGS=...`.
> ./configure reads the CFLAGS, CXXFLAGS, et al. environment variables.

Well, I `export`ed those (with values listed above) in the `PKGBUILD` right before the `./configure` and they did appear in the compiler commands printed out during `make`. But then I got that error nonetheless, so there must be something that still looks for OpenSSL in the default system paths or otherwise assumes OpenSSL 3.
I would then double-check the actual parameters that get passed to the compiler. `make V=1` builds and shows each command that gets invoked, with all the options. The exact options, `-I` and all others, can be ascertained from that.
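A minimal sketch of how the `make V=1` output could be reduced to just the search-path options. The helper name `list_search_flags` is hypothetical; it only splits the command lines on spaces, tabs and commas and keeps the tokens that start with `-I` or `-L`.

```sh
#!/bin/sh
# Print every -I and -L search-path option found on stdin, one per line,
# in the order it appears. Commas count as separators too, so a path that
# was folded into a -Wl,... group is listed as well.
list_search_flags() {
    tr ' ,\t' '\n\n\n' | grep -E '^-[IL].+' || true
}

# Possible usage (assumes a configured source tree):
#   make V=1 2>&1 | list_search_flags | sort | uniq -c
```

Counting the distinct paths this way makes it easier to see whether both the OpenSSL 1.1 and the default directories end up on the same command line.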
I don’t think there is an issue with `-I`. This looks like a problem during `./configure`, not during `make`. The missing `OSSL_LIB_CTX` indicates that OpenSSL 1.1.x headers are included (as desired), but the code expects OpenSSL 3.x. To overcome the missing `OSSL_LIB_CTX`, this hack is needed:
sed -i \
's/"#define HAVE_PEM_READ_BIO_PARAMETERS_EX 1"/"#define HAVE_PEM_READ_BIO_PARAMETERS_EX 0"/' \
libs/tcpd/configure
Otherwise this gets compiled in and requires `OSSL_LIB_CTX`. The function this checks for (in `tcpd/configure`) is called `PEM_read_bio_Parameters_ex`. It is available in OpenSSL 3.x, but not in OpenSSL 1.1.x.
So while `-I` and `-L` are set correctly during compilation and linking, the testing code snippets in the `./configure` stage are likely not getting the OpenSSL version override and are using the system default 3.0.7 instead.
Even after hacking around the `OSSL_LIB_CTX` requirement in `tcpd/configure` it won’t link, due to a missing `SSL_get_peer_certificate`:
libtool: link: gcc -I./.. -I.. -I./../.. -I../.. -Wall -I/usr/include/openssl-1.1 -march=native -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z -Wl,relro -Wl,-z -Wl,now -Wl,-L/usr/lib/openssl-1.1 -Wl,-L/usr/lib/courier-authlib -o couriertls starttls.o argparse.o ./.libs/libcouriertls.a -lssl -lcrypto ./.libs/libspipe.a ../rfc1035/librfc1035.a ../md5/.libs/libmd5.a ../random128/.libs/librandom128.a ../numlib/.libs/libnumlib.a ../liblock/.libs/liblock.a -lcourierauth -lidn2 ../soxwrap/libsoxwrap.a
/usr/bin/ld: ./.libs/libcouriertls.a(libcouriertls.o): in function `tls_dump_connection_info':
libcouriertls.c:(.text+0x2bb5): undefined reference to `SSL_get_peer_certificate'
This↑ error is utterly bogus, because that symbol does exist in OpenSSL 1.1.x (and not in OpenSSL 3.x):
$ strings /usr/lib/openssl-1.1/libssl.so | grep SSL_get_peer_certificate # 1.1.1s
SSL_get_peer_certificate
$ strings /usr/lib/libssl.so | grep SSL_get_peer_certificate # 3.0.7
The whole build procedure, for the record:
CFLAGS="-I/usr/include/openssl-1.1 ${CFLAGS}"
CXXFLAGS="-I/usr/include/openssl-1.1 ${CXXFLAGS}"
CPPFLAGS="$CXXFLAGS"
LDFLAGS+=',-L/usr/lib/openssl-1.1,-L/usr/lib/courier-authlib -lcourierauth'
export CFLAGS CPPFLAGS CXXFLAGS LDFLAGS
sed -i \
's/"#define HAVE_PEM_READ_BIO_PARAMETERS_EX 1"/"#define HAVE_PEM_READ_BIO_PARAMETERS_EX 0"/' \
libs/tcpd/configure
./configure --prefix=/usr \
--sbindir=/usr/bin \
--sysconfdir=/etc/courier \
--libdir=/usr/lib \
--libexecdir=/usr/lib \
--localstatedir=/var/spool/courier \
--enable-unicode \
--enable-workarounds-for-imap-client-bugs \
--enable-mimetypes=/etc/mime.types \
--with-piddir=/run/courier \
--with-trashquota \
--with-db=gdbm \
--with-random=/dev/urandom \
--without-ispell \
--with-mailuser=courier \
--with-mailgroup=courier \
--with-certdb=/etc/ssl/certs/ \
--with-notice=unicode \
"CFLAGS=${CFLAGS}" \
"CPPFLAGS=${CPPFLAGS}" \
"CXXFLAGS=${CXXFLAGS}" \
"LDFLAGS=${LDFLAGS}"
make V=1
This is not a correct interpretation.
This is a standard autoconf-generated test:
AC_CHECK_FUNCS(PEM_read_bio_Parameters_ex)
configure attempts to link a dummy program that calls this function. If the link succeeds, `#define HAVE_PEM_READ_BIO_PARAMETERS_EX 1` gets defined. If the link fails, it is not defined.
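A small sketch for checking which way that test went after `./configure` has run. The helper `have_define` is hypothetical, and the header path in the usage comment (`libs/tcpd/config.h`) is an assumption about where this subdirectory writes its config header; adjust it to whatever the build actually generates.

```sh
#!/bin/sh
# Succeed if FILE contains "#define NAME 1".
# Autoconf leaves a commented-out "#undef NAME" when the check failed,
# which this pattern deliberately does not match.
# Usage: have_define FILE NAME
have_define() {
    grep -Eq "^#define[[:space:]]+$2[[:space:]]+1[[:space:]]*\$" "$1"
}

# Possible usage (header path is an assumption):
#   if have_define libs/tcpd/config.h HAVE_PEM_READ_BIO_PARAMETERS_EX; then
#       echo "configure's link test found PEM_read_bio_Parameters_ex"
#   fi
```

If the macro is defined while the compile lines carry `-I/usr/include/openssl-1.1`, the configure-time and compile-time environments disagree about the OpenSSL version.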
The environment which runs the configure script ends up with the linker finding the OpenSSL 3 library. But the compilation environment is pointing to OpenSSL 1.
> The environment which runs the configure script ends up with the linker finding the OpenSSL 3 library. But the compilation environment is pointing to OpenSSL 1.

That doesn’t seem to contradict what I said above: The `./configure` script basically ignores my attempts to override `-I` and `-L` for OpenSSL when running its decision-making code snippets and keeps using the default OpenSSL there.
OTOH, `./configure` does include my explicitly specified `-I` and `-L` in the generated `Makefile`s.
That↑ way the `make` phase (indeed) has the correct (overridden) `-I` and `-L` for OpenSSL 1.1.1s, which mismatches `./configure`’s decisions taken based on OpenSSL 3.0.7 (and written into config headers).
What is the best place to forcibly inject my `-I` and `-L` (also) into the compilation of the `./configure` script’s “dummy programs” (auto-generated snippets), so that OpenSSL 1.1.x is used there too?
I think it's a matter of using the right environment variables.
`AC_CHECK_FUNCS` appears to compile the test program as C, using `CFLAGS`, `CPPFLAGS`, and `LDFLAGS`.
This is a matter of strictly using the right variables: `-I` goes into `CPPFLAGS`; `-l` and `-L` go into `LDFLAGS`.
Both C and C++ compilations use CPPFLAGS, for preprocessor-related compiler flags, and LDFLAGS for linker flags.
Stuffing everything into CXXFLAGS is just the lazy way out that works most of the time. Except when it doesn't.
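A hedged sketch of that split, using the OpenSSL 1.1 paths quoted earlier in this thread. The function name `openssl_env` is hypothetical. It puts `-I` into `CPPFLAGS` and a standalone `-L` at the front of `LDFLAGS`, instead of folding `-L` into a `-Wl,…` group. An `-L` inside `-Wl,…` is handed to the linker as an ordinary argument and may therefore be searched after the compiler driver's default library directories; whether that ordering is what breaks this particular build has not been confirmed here.

```sh
#!/bin/sh
# Prepend an alternate OpenSSL location to the variables that configure's
# link tests read.
# $1 = include directory, $2 = library directory
openssl_env() {
    inc=$1
    lib=$2
    CPPFLAGS="-I$inc${CPPFLAGS:+ $CPPFLAGS}"   # preprocessor search path
    LDFLAGS="-L$lib${LDFLAGS:+ $LDFLAGS}"      # linker search path, as its own word
    export CPPFLAGS LDFLAGS
}

# Possible usage:
#   openssl_env /usr/include/openssl-1.1 /usr/lib/openssl-1.1
#   ./configure ...   # then inspect config.log for the PEM_read_bio_Parameters_ex test
```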
> Stuffing everything into CXXFLAGS is just the lazy way out that works most of the time. Except when it doesn't.

What is this↑ referring to? I don’t see that here↓ in my hack; `LDFLAGS` and `CPPFLAGS` are separate…
CFLAGS="-I/usr/include/openssl-1.1 ${CFLAGS}"
CXXFLAGS="-I/usr/include/openssl-1.1 ${CXXFLAGS}"
CPPFLAGS="$CXXFLAGS"
LDFLAGS+=',-L/usr/lib/openssl-1.1,-L/usr/lib/courier-authlib -lcourierauth'
export CFLAGS CPPFLAGS CXXFLAGS LDFLAGS
`CXXFLAGS` is an Arch Linux thing set in `makepkg.conf`, which also happens to occur in Courier’s sources (not sure if with the same meaning). So I’m passing it through.
When I unset `CXXFLAGS` and also remove it from `configure`’s command line (while leaving the rest of the hack around), it makes no difference in terms of errors. The same errors happen with flags minimized like this (i.e. no defaults propagated from `/etc/makepkg.conf`):
CFLAGS=
CPPFLAGS='-I/usr/include/openssl-1.1'
LDFLAGS='-Wl,-L/usr/lib/openssl-1.1,-L/usr/lib/courier-authlib,-lcourierauth'
unset CXXFLAGS
export CFLAGS CPPFLAGS LDFLAGS
`strace` has just revealed something (with `-s 5000`): Courier claims that a successfully read certificate file does not exist. This is my setup on the server:
pids=($(pidof couriertcpd))
strace -s 5000 -f "${pids[@]/#/-p}"
This runs on the client (OpenSSL 3.0.7 on both sides):
openssl s_client -starttls imap -crlf -connect imap.somedomain.org:143
What I see is a process that successfully reads `/etc/courier/imapd.pem.imap.somedomain.org` twice in its entirety (4021 bytes), yet claims afterwards that the file does not exist:
[pid 45606] openat(AT_FDCWD, "/etc/courier/imapd.pem.imap.somedomain.org", O_RDONLY) = 6
[pid 45606] newfstatat(6, "", {st_mode=S_IFREG|0440, st_size=4021, ...}, AT_EMPTY_PATH) = 0
[pid 45606] read(6, "-----BEGIN CERTIFICATE-----\n >>> Server’s certificate is read here! <<< \n-----END CERTIFICATE-----\n-----BEGIN CERTIFICATE-----\n >>> CA’s certificate follows here! <<< \n-----END CERTIFICATE-----\n", 4096) = 4021
[pid 45606] read(6, "", 4096) = 0
[pid 45606] close(6) = 0
[pid 45606] openat(AT_FDCWD, "/etc/courier/imapd.pem.imap.somedomain.org", O_RDONLY) = 6
[pid 45606] lseek(6, 0, SEEK_CUR) = 0
[pid 45606] lseek(6, 0, SEEK_CUR) = 0
[pid 45606] lseek(6, 0, SEEK_CUR) = 0
[pid 45606] lseek(6, 0, SEEK_CUR) = 0
[pid 45606] newfstatat(6, "", {st_mode=S_IFREG|0440, st_size=4021, ...}, AT_EMPTY_PATH) = 0
[pid 45606] lseek(6, 0, SEEK_SET) = 0
[pid 45606] read(6, "-----BEGIN CERTIFICATE-----\n >>> Server’s certificate is read here! <<< \n-----END CERTIFICATE-----\n-----BEGIN CERTIFICATE-----\n >>> CA’s certificate follows here! <<< \n-----END CERTIFICATE-----\n", 4096) = 4021
[pid 45606] lseek(6, 4021, SEEK_SET) = 4021
[pid 45606] read(6, "", 4096) = 0
[pid 45606] newfstatat(7, "", {st_mode=S_IFIFO|0600, st_size=0, ...}, AT_EMPTY_PATH) = 0
[pid 45606] write(7, "ip=[2620:x:x:x:x:x:x:x], couriertls: /etc/courier/imapd.pem.imap.somedomain.org: No such file or directory\n", 121) = 121
[pid 45606] close(6) = 0
“No such file or directory” for something that has been successfully opened and read…? Next, another process (the previous one’s parent, over a pipe) generates an IMAP message with `. NO STARTTLS failed: ...`:
[pid 45604] write(1, ". NO STARTTLS failed: ip=[2620:x:x:x:x:x:x:x], couriertls: /etc/courier/imapd.pem.imap.somedomain.org: No such file or directory\r\n* NO Error in IMAP command received by server.\r\n", 192 <unfinished ...>
This↑ error does not occur on successful TLS connections, only on failed `STARTTLS`. Stating the obvious: the file exists, is a regular file, is readable for `courier`, can be parsed by `openssl x509`, and has never caused problems … until now.
$ sudo ls -al /etc/courier/imapd.pem{,.imap.somedomain.org}
lrwxrwxrwx 1 courier courier 27 Nov 24 2020 /etc/courier/imapd.pem -> imapd.pem.imap.somedomain.org
-r--r----- 1 courier courier 4021 Dec 16 03:12 /etc/courier/imapd.pem.imap.somedomain.org
$ { openssl x509 -text; openssl x509 -text; } < /etc/courier/imapd.pem.imap.somedomain.org
Certificate: ... both certificates are read just fine ...
What could be causing the bogus “No such file or directory” error?
(Because I’m unable to rebuild Courier with OpenSSL `1.1.1s` (due to the `./configure` problem mentioned earlier), this is the only debugging clue I have at the moment.)
Error handling is a long-time design weakness in the OpenSSL API. When an OpenSSL API call fails, no specific error indication gets returned. Rather, the application calls `ERR_get_error` to retrieve the last reported library error, and if no error code gets returned then the call must have failed due to a failed system call, so the application reads `errno`.
However, if an API call fails but no error code gets logged and returned from `ERR_get_error`, a system error message gets mistakenly logged. Some prior syscall failed with `ENOENT`, and `errno` never gets cleared automatically, so a misleading error then gets logged.
Additionally, what sometimes happens is that the OpenSSL library changes some of its error codes, and applications that rely on specific error codes break.
The first time the certificate file gets read is by OpenSSL itself, when it gets installed into the SSL context. Courier's code also supports loading custom DH parameters from the PEM-formatted file. If they are missing, the expected error code is `PEM_R_NO_START_LINE`. That's this code in `libcouriertls.c`:
/*
** If the certificate file does not have DH parameters,
** swallow the error.
*/
int err=ERR_peek_last_error();

if (ERR_GET_LIB(err) == ERR_LIB_PEM
    && ERR_GET_REASON(err) == PEM_R_NO_START_LINE)
{
	ERR_clear_error();
}
else
{
	sslerror(info, filename, -1);
}
But if this is where the erroneous error gets logged, then there should be a pending error code: this only peeks at the error and does not remove it from the error queue.
One way to test this hypothesis is to temporarily replace the call with something like:
sslerror(info, "*** trap ***", -1);
and if this now gets logged instead of the filename then this must be the reason, and OpenSSL changed the error code again. Some extra work will need to be done in order to determine what the error code is, and update the code to check for it.
Another alternative would be to simply add your own DH parameters to the certificate file. There's an internal script in the package, `mkdhparams`:
TLS_DHPARAMS=/tmp/dhparams.pem mkdhparams
and the output can simply be concatenated to the certificate file.
Of course all of this presumes that the dh parameter load is the problem here.
> Of course all of this presumes that the dh parameter load is the problem here.

Looks like it is. It is not obvious from the `strace` though. So, I have `TLS_DHPARAMS=/etc/courier/dhparams.pem`. That’s a regular file, owned and readable by `courier:courier`. It is regenerated monthly using `openssl dhparam -out /etc/courier/dhparams.pem 4096`. (TIL that `-rand` is not required any more.) (My keys have 4096 bits and it is recommended to pick an equal length here.) In `strace` that file is opened and successfully read ~3 times. No domain-specific name suffixes are probed, so I assume they need not exist; D-H parameters are “global”.
Anyhow: This fixes the problem:
cat /etc/courier/dhparams.pem >> /etc/courier/imapd.pem
TLS worked before and works now. `STARTTLS` was failing before (without the concatenation) and works again now.
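Since the DH parameters file is regenerated periodically, a plain `cat … >>` would pile up one block per run. A minimal sketch of an append that does nothing when a block is already present (the function names are hypothetical, and it does not replace an older block with a newer one):

```sh
#!/bin/sh
# Succeed if FILE already contains a DH PARAMETERS block.
has_dh_params() {
    grep -q -- '-----BEGIN DH PARAMETERS-----' "$1"
}

# Append the DH parameters in $1 to the certificate file $2, at most once.
append_dh_params_once() {
    if has_dh_params "$2"; then
        return 0
    fi
    cat "$1" >> "$2"
}

# Possible usage, with the paths from this thread:
#   append_dh_params_once /etc/courier/dhparams.pem /etc/courier/imapd.pem
```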
Phew. Thanks for the pointer. I would have never thought it could be something with the D-H parameters!
TLS_DHPARAMS in the configuration file can be set to point to a discrete DH parameters file.
But it should, in theory, work without it. Something is not working right in OpenSSL. I'll try to reproduce this myself, and see if I can figure it out.
This has now been fixed.
A note on OpenSSL 3.0.8+ for future readers: The workaround must be removed. The trick that helped before (appending the `dhparams` to the certificate chain) will now cause all `STARTTLS` connections to fail + reset. Keeping the `dhparams` as a separate file again (as it should be) restores everything back to normal.
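One possible way to undo the workaround without hand-editing the file is to filter the DH block out again. This is only a sketch; `strip_dh_params` is a hypothetical helper that prints the cleaned file to standard output and leaves the original untouched.

```sh
#!/bin/sh
# Print FILE without any DH PARAMETERS block, leaving everything else
# (certificates, keys) exactly as it was.
strip_dh_params() {
    sed '/^-----BEGIN DH PARAMETERS-----/,/^-----END DH PARAMETERS-----/d' "$1"
}

# Possible usage: write to a new file, review it, then move it into place.
#   strip_dh_params /etc/courier/imapd.pem > /tmp/imapd.pem.new
```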
Just upgraded from `1.3.2` to `1.3.4`, which coincides with an OpenSSL upgrade from `3.1.1` to `3.1.4`. The problem is back. :fearful: The symptoms are almost exactly the same.
In the IMAP case the error on the `s_client` side is (sometimes):
4077B765FD7E0000:error:0A00010B:SSL routines:ssl3_get_record:wrong version number:ssl/record/ssl3_record.c:358:
In the ESMTP case the error in the logs is (always):
courieresmtpd: STARTTLS failed: ip=[::1], couriertls: /etc/courier/esmtpd.pem.smtp.my.domain: error:1E08010C:DECODER routines::unsupported
| Type | Command: `openssl s_client -crlf …` | Result | Note |
|---|---|---|---|
| IMAP + STARTTLS | `-starttls imap -connect foo.my.domain:143` | WORKS | no certificate for subdomain |
| IMAP + TLS | `-connect foo.my.domain:993` | WORKS | no certificate for subdomain |
| IMAP + STARTTLS | `-starttls imap -connect imap.my.domain:143` | FAILS | certificate exists |
| IMAP + TLS | `-connect imap.my.domain:993` | WORKS | certificate exists |
| ESMTP + STARTTLS | `-starttls smtp -connect foo.my.domain:25` | WORKS | no certificate for subdomain |
| ESMTP + TLS | `-connect foo.my.domain:465` | WORKS | no certificate for subdomain |
| ESMTP + STARTTLS | `-starttls smtp -connect smtp.my.domain:25` | FAILS | certificate exists |
| ESMTP + TLS | `-connect smtp.my.domain:465` | WORKS | certificate exists |
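The matrix above can be regenerated for other host names with a few lines of shell. This sketch only prints the `openssl s_client` command lines (using the same options as in the table) and does not connect anywhere; `print_matrix` and its argument order are hypothetical.

```sh
#!/bin/sh
# Print the s_client commands for one protocol.
# $1 = host without its own certificate, $2 = host with one,
# $3 = imap or smtp, $4 = STARTTLS port, $5 = implicit-TLS port
print_matrix() {
    for host in "$1" "$2"; do
        echo "openssl s_client -crlf -starttls $3 -connect $host:$4"
        echo "openssl s_client -crlf -connect $host:$5"
    done
}

# Possible usage:
#   print_matrix foo.my.domain imap.my.domain imap 143 993
#   print_matrix foo.my.domain smtp.my.domain smtp 25 465
```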
I’m going to retry the workaround, but it’s also possible that it won’t work any more and the problem is different…
The “FAILS” entries above are now working again after I appended the stuff `-----BEGIN DH PARAMETERS-----` … `-----END DH PARAMETERS-----` to my certificate file.
All my configs have basically these files set:
TLS_CERTFILE=/etc/courier/esmtpd.pem
TLS_PRIVATE_KEYFILE=/etc/courier/esmtpd.key
TLS_DHPARAMS=/etc/courier/dhparams.pem
But as part of the workaround, the `TLS_CERTFILE=/etc/courier/esmtpd.pem` now looks like this:
-----BEGIN CERTIFICATE-----
<<< my certificate >>>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<<< LetsEncrypt’s intermediate certificate >>>
-----END CERTIFICATE-----
-----BEGIN DH PARAMETERS-----
<<< The stuff from /etc/courier/dhparams.pem >>>
-----END DH PARAMETERS-----
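To check that a certificate file really has this layout without printing its contents, the block labels can be listed in order. A minimal sketch, with `pem_labels` as a hypothetical helper name:

```sh
#!/bin/sh
# Print the label of every PEM block in FILE, in file order.
pem_labels() {
    sed -n 's/^-----BEGIN \(.*\)-----$/\1/p' "$1"
}

# Possible usage:
#   pem_labels /etc/courier/esmtpd.pem
```

For the file shown above this would be expected to list `CERTIFICATE` twice, followed by `DH PARAMETERS`.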
I’m not quite sure what’s happening here… Would you consider reopening this or should I file a new bug? Is there anything I can do to help with debugging? Can this be an upstream OpenSSL bug like last time?
For bisecting it would be nice to have a minimalistic thing that exhibits the problem but does nothing else. Like a tiny ping server based on `couriertls`. Can this be done? Can one run it with something like just four `mkfifo` pipes (plain, encrypted) × (input, output) to check the basics?
I tried to reproduce this with OpenSSL 3.0.9 but was unable to. I'll keep trying.
This is copied from my rant on AUR.
Not sure if this is caused by version 1.2 of `courier-mta` or version 3.0.x of `openssl`, but `courier-mta` currently has `STARTTLS` inoperable unless you connect to the server using a domain name that mismatches the one in the certificate(s) (which makes little sense, i.e. `STARTTLS` is basically inoperable). The bug is tricky, because: `STARTTLS`
This↑ can be reproduced using (1) Thunderbird, (2) R2Mail2 and (3) `openssl s_client` as a client. It affects both IMAP and SMTP. For `s_client` in particular, this is how you can test your server:
The error symptom is either an abrupt connection termination with no further output or, sometimes, this error:
The leading string seems random, the stuff after `:` is stable.
For easier end-to-end debugging, I’ve used a trivial IMAP client. It establishes a `STARTTLS` connection to an IMAP server, authenticates using `cram-sha256` and reads the mailbox status. As already mentioned, setting `server` to a domain name listed in the certificate fails and setting it to a bogus domain that resolves to the mail server’s IP address (but is not in its certificate) succeeds.
This looks like a critical bug, because it renders opportunistic `STARTTLS` security over SMTP’s port 25 inoperable. TLS on 465 works perfectly fine. For IMAP the obvious workaround is to use IMAP over TLS on 993 and give up on `STARTTLS` entirely.
I’ve tried to rebuild and restart `courier-mta`, with and without Arch’s `openssl-1.1` package installed (and with the default `openssl` `3.0.7` always installed), but there is no difference; `STARTTLS` is (kind of) gone.