Fog user's hostname in the Message-ID

oxzi commented 6 years ago

The Message-ID field of an outgoing email contains the user's hostname, set by the email client. This reveals not only the hostname itself but also the network's domain name. For example, this field shows whether I am currently at home, at work, at the university, and so on.

This PR intends to replace the hostname part with the server's FQDN.
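
To illustrate the intended effect, here is a minimal Python sketch of the rewrite. It is not the PR's actual implementation (which is not shown in this thread); the function name `fog_message_id` and the example hostnames are assumptions made up for illustration.

```python
def fog_message_id(message_id: str, server_fqdn: str) -> str:
    """Replace the part after the last '@' of a Message-ID with server_fqdn.

    Hypothetical helper for illustration only; values that are not in the
    usual <left@right> form are returned unchanged.
    """
    inner = message_id.strip()
    if not (inner.startswith("<") and inner.endswith(">")):
        return message_id
    left, sep, _right = inner[1:-1].rpartition("@")
    if not sep or not left:
        return message_id
    # Keep the client-generated left part, swap only the host part.
    return f"<{left}@{server_fqdn}>"


print(fog_message_id("<20180401.abc@laptop.campus.example>", "mail.example.com"))
```

Only the host part changes; the left-hand side still comes from the client, so uniqueness depends entirely on how the client generated it.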

r-raymond commented 6 years ago

Awesome, thanks for the PR!

tokudan commented 6 years ago

The Message-ID is supposed to be globally unique (https://tools.ietf.org/html/rfc5322#section-3.6.4). That is probably no longer guaranteed if it is modified this way, which violates the RFC. Also, the Message-ID is only supposed to change if the message itself changes. Modifying it can confuse clients that assume the message is sent with the same ID under which it is stored, e.g. in the Sent folder. I assume this will break Outlook, which has a feature to delete "recalled" messages. I'm not sure how exactly this works, but I guess Outlook sends another e-mail telling the receiving client to please delete the email with the ID xyz, which obviously cannot identify the message once the ID has changed. For mutt, this could throw off threading and duplicate detection.

Please make this optional.

r-raymond commented 6 years ago

I was not aware of this problem, thanks for pointing this out. If it messes with mutt, it will definitely be optional (I use mutt :)). Is there a way to remove the sensitive information without losing the Message-IDs?

oxzi commented 6 years ago

@tokudan Thanks for raising the issue regarding the uniqueness of the Message-ID. I was aware of this topic but would not really consider it a problem, because Simple Nixos Mailserver targets smaller installations and each mail client generates a (hopefully) unique ID by itself. I wouldn't fear any collisions. Furthermore, one cannot guarantee that two different clients will never generate the same Message-ID, even including the hostname part.

However, this is the first time I am hearing about problems with a changed Message-ID. As far as I knew, other mail providers also change the hostname part, and I was not aware that this can cause problems. Do you have any sources/bug reports regarding changed Message-IDs? Btw, I am using Claws Mail with this PR deployed without any problems.

@r-raymond It seems like the Travis build has timed out. Can you manually restart it?

tokudan commented 6 years ago

The best solution is obviously to get a decent client like mutt (which can set "hostname" in its config to generate the ID). ;) Next, if the Message-ID leaks data and it's important to take care of that: just delete the ID. The first MTA that sees it (which should be postfix), should generate a new compliant message id. This should be tested first, of course. I'm not sure at which point in the process the header would be removed and at which point the check for a missing Message-ID happens.

Just to be clear: The downsides of messing with the message id will mostly show in edge-cases. Having this as an option to increase privacy at the cost of having some bad edge-cases is ok for me. But because it is violating RFC it should be an option. Example for mutt where the message id needs to match the sending message id: http://promberger.info/linux/2008/03/31/mutt-delete-duplicate-e-mail-messages/ http://www.mutt.org/doc/manual/#duplicate-threads

The following postfix options might be relevant: http://www.postfix.org/postconf.5.html#always_add_missing_headers http://www.postfix.org/postconf.5.html#local_header_rewrite_clients

[Edit: removed a stupid comment, sorry]

r-raymond commented 6 years ago

@geistesk The tests are failing right now, since I've transitioned them to NixOS 18.03. I'll try to get them working again before I merge any PRs, so we have a few days :)

phdoerfler commented 6 years ago

Yay a PR! Awesome!

However, I'm with @tokudan here. I love privacy but the message ID is quite relevant. In addition to violating the RFC (!), a number of mail clients such as Outlook or Apple Mail group mails that form a conversation together. rspamd also uses the message ID to remember which mail was already classified. Changing it on the server risks your spam service classifying the mail twice if you change it at the wrong point in time. Speaking of spam: SpamAssassin also factors in whether the message ID is RFC-conformant. To sum this up, messing with the message ID is not a good idea, and if you don't know what you're doing (I certainly don't) it may cause more problems than the little bit of privacy it gives is worth. I mean yes, I see the point of someone being able to tell if you were at university or work or some other place, but really, who cares if there's your hostname in there? It's not even wrong. And it depends on the mail client you are using anyway.

Instead get a mail client that generates a sane message ID on its own. There is no reason for the user's hostname to be in there in the first place. But if your client decides that's the message's ID then that's how it's gonna be. Changing it is like changing a person's name, their identity.

Changing the ID after the mail has already been exposed to the world is not a solution, but a hacky way to work around a bad mail client. So please make this an opt-in option for fixing stupid clients.

In fact even without this PR message IDs are looking good already. This is from a mail I sent to myself using Apple Mail just now:

Message-Id: <D71C45CD-7FE2-486C-BF37-F919CC3036AE@example.com>

with example.com being the domain my mailserver runs at, while the computer I sent this from is in a different domain. So I wonder, @geistesk what mail client are you using that this is necessary? Also, I don't mean any offense. I just want to make sure we're all on the same page here. If mail providers are commonly changing the ID, then maybe it's less of a problem than I think. As I said, I am certainly no expert when it comes to mail hosting.

Edit: Another good reason not to change the ID: When you send a message, the client normally places a copy of it in a "Sent" folder. The message already has a message ID then. Changing it on the server after the fact means that it diverges from the ID your mail client knows the message as. So there are now two mails in the world that are identical except their IDs. Thus the ID no longer identifies the message and that is its whole purpose.
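
The divergence described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the two raw messages, the hostnames, and the helper `distinct_ids` are made up, and the string replacement merely stands in for whatever server-side rewrite is used.

```python
from email import message_from_string

# Copy as the client stored it in its "Sent" folder (made-up example message).
SENT_COPY = """\
From: alice@example.com
To: alice@example.com
Subject: test
Message-ID: <abc123@laptop.campus.example>

hello
"""

# The same message after a hypothetical server-side rewrite of the host part.
DELIVERED_COPY = SENT_COPY.replace("laptop.campus.example", "mail.example.com")


def distinct_ids(raw_messages):
    """Return the set of Message-ID header values found in the raw messages."""
    return {message_from_string(raw)["Message-ID"] for raw in raw_messages}


# A client that deduplicates or threads by Message-ID sees two different mails.
print(len(distinct_ids([SENT_COPY, DELIVERED_COPY])))
```

Any feature that matches the sent copy against the delivered copy by ID would treat these as unrelated messages.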

r-raymond commented 6 years ago

We can discuss the default option but I think it should definitely be included. I just checked and while one can modify the hostname in mutt, the default leaks the information.

oxzi commented 6 years ago

@tokudan, @phdoerfler: Thanks for your justified objections.
@r-raymond: And thanks for your support.

> The first MTA that sees it (which should be postfix), should generate a new compliant message id.

The tests in tests/extern.nix are using msmtp to send mails which does not add a Message-ID by default. This results in received mails without this field. I'm not sure how this changes if the mail leaves the server, but it seems like postfix does not add it by default or in this configuration. I stumbled upon this while adapting the tests for this PR.

> But because it is violating RFC it should be an option.

I am totally on your side with this but I am not sure if this has to result in a violation. It seems like some mail clients, like @phdoerfler mentioned, are already setting the hostname part of the Message-ID to the domain/FQDN. As far as I understood this could also result in two identical IDs. Honestly, I do not know how to ensure unique Message-IDs if each client generates them.

> I love privacy […].

While working on the PR yesterday, I found a nice list which maps Message-ID regexes to their mail clients. It seems like most clients do this differently, and one can determine the software simply by looking at the left-hand side. Sadly, I cannot find this list again, only another related page. It seems like it is almost impossible to get privacy with emails. Not to mention that deleting all fields would also be noticeable.

> So please make this an opt-in option for fixing stupid clients.

On the one hand this sounds like a fix for overly talkative clients, but on the other hand it would also interfere with other clients and may cause the problems pointed out above. Afaik it is not possible to apply this to only some loginAccounts.

> So I wonder, @geistesk what mail client are you using that this is necessary?

As written above, I am using Claws Mail. However, it is quite easy to configure Claws Mail to use your email's domain as the host part:
Configuration → Edit Accounts → for all accounts click Edit → select the Send tab → check Send account mail address in Message-ID

At this point I am not sure if this PR was a good idea, and perhaps we should just close it. Making this an option could help some people but could also mess a lot of things up if used incorrectly. What is your opinion?

tokudan commented 6 years ago

> The tests in tests/extern.nix are using msmtp to send mails which does not add a Message-ID by default. This results in received mails without this field.

I guess that's due to always_add_missing_headers defaulting to "no". See my comment above with the link to the postfix docs. It should probably only be enabled for authorized clients in the submission options.

Having an option for resetting/rewriting the Message-ID is good in my opinion. The documentation should state that it may cause issues with some features of mail clients, so people can make an informed choice there.

tokudan commented 6 years ago

> It seems like some mail clients, like @phdoerfler mentioned, are already setting the hostname part of the Message-ID to the domain/FQDN. As far as I understood this could also result in two identical IDs. Honestly, I do not know how to ensure unique Message-IDs if each client generates them.

That actually depends on their algorithm for creating the left part. If that's something like https://en.wikipedia.org/wiki/Universally_unique_identifier then it's good enough and it doesn't matter what's behind the @. If the left part is just a random number or the number of seconds since the unix epoch, then it's not good enough. That also ties in with why I objected to just replacing the hostname part. If the left part is weak, there is actually a chance of duplicates on a very busy mailserver. I expect mail clients that use something other than their own hostname to default to a strong algorithm there, due to the risk of duplicates.
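
A small Python sketch of the difference between a weak and a strong left part. The function names, the fixed timestamp, and the domain are assumptions chosen for illustration; real clients use their own schemes.

```python
import time
import uuid


def weak_message_id(domain, now=None):
    """Left part is only the number of seconds since the unix epoch."""
    seconds = int(time.time() if now is None else now)
    return f"<{seconds}@{domain}>"


def strong_message_id(domain):
    """Left part is a random (version 4) UUID."""
    return f"<{uuid.uuid4()}@{domain}>"


# Simulate 1000 mails sent within the same second through one shared domain.
same_second = {weak_message_id("example.com", now=1522540800) for _ in range(1000)}
unique = {strong_message_id("example.com") for _ in range(1000)}

print(len(same_second), len(unique))
```

With the seconds-based scheme every mail sent in the same second collapses to a single ID once the host part is shared, while the UUID-based IDs remain distinct regardless of what follows the @.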

oxzi commented 6 years ago

I updated the PR and introduced the rewriteMessageId option to replace the Message-ID's hostname.

r-raymond commented 6 years ago

Thanks for all the work you put into this!

r-raymond / nixos-mailserver

Fog user's hostname in the Message-ID #114