marlam / msmtp

SMTP client with sendmail compatible interface
https://marlam.de/msmtp
GNU General Public License v3.0

Check for SMTPUTF8 capability and include that in MAIL command #181

Open mlt opened 1 month ago

mlt commented 1 month ago

We have to use punycode, at least for the domain part. I saw a bunch of RFCs discussing the local part and downgrading, as well as a proposed use of the SMTPUTF8 extension on both sides. What is your take on that?

marlam commented 3 weeks ago

Hm, my understanding of the newer RFCs 6532 and 6531 is that nowadays mail addresses can contain UTF-8 directly, both in the local part and in the domain part, and such UTF-8 addresses can show up in mail headers directly, without the need for punycode or other encodings.

So msmtp should leave these addresses alone.

The error you linked to is because msmtpd does not want to handle UTF-8 currently. It enforces a very strict ASCII-only subset of characters because mail addresses will become part of a command line that is passed to a shell, so there are security considerations. The clean solution would probably be to not use popen() but fork/execve or something similar, but that quickly becomes a great big mess with a lot of potential for bugs, so I'm not sure what to do about it.
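To illustrate the concern, here is a minimal sketch in Python (not msmtpd's C code) of passing an address as a single argument-vector entry, so that no shell ever parses it. The helper names `deliver_argv` and `run_without_shell` are hypothetical, and a Python one-liner that echoes its argument stands in for a real delivery command.

```python
import subprocess
import sys


def deliver_argv(recipient: str) -> list[str]:
    """Build an argument vector; the recipient is one argv entry, never shell-parsed."""
    # Hypothetical stand-in for a delivery command: it just echoes its first argument.
    echo_script = "import sys; sys.stdout.write(sys.argv[1])"
    return [sys.executable, "-c", echo_script, recipient]


def run_without_shell(recipient: str) -> str:
    """Run the stand-in command without a shell and return what it received."""
    result = subprocess.run(
        deliver_argv(recipient), capture_output=True, text=True, check=True
    )
    return result.stdout
```

With this approach, shell metacharacters in an address reach the child process as literal text, which is why an exec-style interface would not need the strict character whitelist that a popen()-style command line does.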

mlt commented 3 weeks ago

I personally do not use msmtpd, so let it be. It was just handy for testing.

RFC 6531 also says in section 3.4 that the client MUST supply SMTPUTF8 in MAIL FROM if and only if it is necessary. I did a quick test with a couple of servers. chasquid (the first time I have seen it) does not seem to care about that out of the box. However, Exim (with smtputf8_advertise_hosts = *) rejected a UTF-8 domain when the parameter was missing (this can be overridden with an explicit allow_utf8_domains = true, though):

<-- 220 DESKTOP-K26J5U0. ESMTP Exim 4.97 Ubuntu Mon, 28 Oct 2024 10:35:49 -0500
--> EHLO localhost
<-- 250-DESKTOP-K26J5U0. Hello ip6-localhost [::1]
<-- 250-SIZE 52428800
<-- 250-8BITMIME
<-- 250-PIPELINING
<-- 250-PIPECONNECT
<-- 250-CHUNKING
<-- 250-STARTTLS
<-- 250-PRDR
<-- 250-SMTPUTF8
<-- 250 HELP
--> MAIL FROM:<mlt@почта.test>
--> RCPT TO:<postmaster@почта.test>
--> DATA
<-- 501 <mlt@почта.test>: domain missing or malformed

but it accepts the message if MAIL FROM includes SMTPUTF8:

<-- 220 DESKTOP-K26J5U0. ESMTP Exim 4.97 Ubuntu Mon, 28 Oct 2024 10:36:30 -0500
--> EHLO localhost
<-- 250-DESKTOP-K26J5U0. Hello ip6-localhost [::1]
<-- 250-SIZE 52428800
<-- 250-8BITMIME
<-- 250-PIPELINING
<-- 250-PIPECONNECT
<-- 250-CHUNKING
<-- 250-STARTTLS
<-- 250-PRDR
<-- 250-SMTPUTF8
<-- 250 HELP
--> MAIL FROM:<mlt@почта.test> SMTPUTF8
--> RCPT TO:<postmaster@почта.test>
--> DATA
<-- 250 OK
<-- 250 Accepted
<-- 354 Enter message, ending with "." on a line by itself
--> Date: Mon, 28 Oct 2024 10:36:27 -0500
--> Message-ID: <09aa28c82db6a0d12e63f5c541391b33@почта.test>
--> From: mlt@почта.test
--> To: postmaster@почта.test
--> Subject: Hello
-->
--> Have a nice day!
-->
--> .
<-- 250 OK id=1t5RnA-000000005rz-41rK

Also, the middle of section 3.2 says that the client should not attempt transmission if the server does not support the extension but the message requires it, and that the client may take other action depending on the circumstances. I think that to fully comply, msmtp should ideally 1) recognize the server's SMTPUTF8 capability, 2) indicate the need in MAIL FROM, and 3) do its best when the server is not capable of SMTPUTF8 by using punycode for the domain if the local part happens to be ASCII-only.
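Step 3 could look roughly like the following Python sketch. It is an illustration only, not msmtp code: the function name `downgrade_address` is hypothetical, and it relies on Python's built-in `idna` codec, which implements the older IDNA 2003 rules and may differ from what a real implementation would choose.

```python
from typing import Optional


def downgrade_address(address: str) -> Optional[str]:
    """Return an ASCII-only form of the address, or None if that is not possible."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local.isascii():
        # No domain at all, or a non-ASCII local part: only SMTPUTF8 can carry this.
        return None
    try:
        # Convert each domain label to its punycode ("xn--...") form where needed.
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return None
    return f"{local}@{ascii_domain}"
```

For an address such as mlt@почта.test this yields an ASCII address whose domain starts with `xn--`, while an address with a non-ASCII local part returns `None`, signalling that the message cannot be sent to a server without SMTPUTF8.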

While 1 and 3 are easy, for 2 I feel it would be nice to have some "context" struct to reduce the number of arguments passed around in a few functions. It seems convenient to pre-scan the recipients' characters at the point where they are sanitized, so that by the time capabilities are advertised we do not have to go over the list of recipients again.
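As a rough sketch of that idea, written in Python rather than msmtp's C, a context object could record the non-ASCII flag while the recipients are collected. The class name `SendContext` and its fields are hypothetical, and it looks only at envelope addresses, not at mail headers.

```python
from dataclasses import dataclass, field


@dataclass
class SendContext:
    """Hypothetical per-message state passed between functions instead of many arguments."""

    recipients: list[str] = field(default_factory=list)
    needs_smtputf8: bool = False

    def add_recipient(self, address: str) -> None:
        # Record the recipient and remember whether any address contains
        # non-ASCII characters, so the list is never scanned a second time.
        self.recipients.append(address)
        if not address.isascii():
            self.needs_smtputf8 = True
```

By the time the server's capabilities are known, `needs_smtputf8` already holds the answer for the envelope.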

marlam commented 3 weeks ago

This SMTPUTF8 thing is a mess.

Msmtp cannot and should not attempt to find out if the mail requires SMTPUTF8, since that would require parsing all relevant mail headers. And that is so error-prone that msmtp was specifically designed never to do something like that.

Instead, what we could do is detect the SMTPUTF8 server capability and if it is present, always send the SMTPUTF8 parameter. Strictly speaking that does not even violate Sec. 3.4 since msmtp is not aware whether SMTPUTF8 is needed or not.
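A minimal Python sketch of that approach, assuming the EHLO reply is available as a list of lines like those in the transcripts above (without the `<--` prefix). The function names `ehlo_keywords` and `mail_command` are hypothetical and do not correspond to msmtp's actual functions.

```python
def ehlo_keywords(reply_lines: list[str]) -> set[str]:
    """Extract extension keywords from EHLO reply lines such as '250-SMTPUTF8'."""
    keywords = set()
    for line in reply_lines[1:]:  # the first line is the greeting, not an extension
        text = line[4:].strip()  # drop the '250-' or '250 ' prefix
        if text:
            keywords.add(text.split()[0].upper())
    return keywords


def mail_command(sender: str, keywords: set[str]) -> str:
    """Build the MAIL command, adding SMTPUTF8 whenever the server advertises it."""
    command = f"MAIL FROM:<{sender}>"
    if "SMTPUTF8" in keywords:
        command += " SMTPUTF8"
    return command
```

Note that the message itself is never inspected: the parameter depends only on what the server advertises.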

I would not care about ancient servers not supporting SMTPUTF8 beyond not sending the SMTPUTF8 parameter. If such a server rejects a client message because there is some address and/or some header with UTF-8 in it, then the user will be notified, and I think that's good enough.

That's the minimal future-proof approach, and since msmtp strives for minimality, I think it is sufficient.

mlt commented 3 weeks ago

As a side note, we do not check whether what we actually transmit is UTF-8 rather than some other encoding.
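For reference, such a check is conceptually small. The following Python sketch (the function name `is_valid_utf8` is hypothetical, and nothing like it is claimed to exist in msmtp) tests whether a byte string decodes as UTF-8:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Plain ASCII always passes, since ASCII is a subset of UTF-8, whereas text in a legacy 8-bit encoding usually fails as soon as it contains a non-ASCII character.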