mjl- / mox

modern full-featured open source secure mail server for low-maintenance self-hosted email
https://www.xmox.nl
MIT License
3.55k stars 100 forks

Why do I receive emails with other addresses in the To field #157

Open mattfbacon opened 5 months ago

mattfbacon commented 5 months ago

I get some emails like this:

[Image: an email where the To field does not match my address]

Why? Shouldn't these be rejected by the mail server? And how can I make the mail server reject them?

mjl- commented 5 months ago

The addresses in the To/Cc/(Bcc) headers of a message (in Internet Message Format) can be unrelated to the addresses a message is delivered to (over SMTP, with the MAIL FROM and RCPT TO commands).

If you send an email to someone and Bcc another address, then that other address won't see itself in the message headers either. The same goes for forwarding and mailing lists. So marking a message as junk because your address isn't an addressee in the header isn't necessarily a good idea.
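To make the envelope/header distinction concrete, here is a minimal Go sketch using only the standard library's net/mail package. It is not mox code: the function names `headerAddressees` and `rcptInHeaders` and the example addresses are made up for illustration, and lowercasing the whole address is a simplification (local parts are technically case-sensitive).

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// headerAddressees returns the lowercased addresses found in the To and Cc
// headers of a raw message. Hypothetical helper, not part of mox.
func headerAddressees(raw string) (map[string]bool, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	for _, key := range []string{"To", "Cc"} {
		if msg.Header.Get(key) == "" {
			continue // header absent, nothing to parse
		}
		addrs, err := msg.Header.AddressList(key)
		if err != nil {
			return nil, err
		}
		for _, a := range addrs {
			seen[strings.ToLower(a.Address)] = true
		}
	}
	return seen, nil
}

// rcptInHeaders reports whether the SMTP envelope recipient (RCPT TO)
// also appears as an addressee in the To/Cc headers.
func rcptInHeaders(rcptTo, raw string) (bool, error) {
	seen, err := headerAddressees(raw)
	if err != nil {
		return false, err
	}
	return seen[strings.ToLower(rcptTo)], nil
}

func main() {
	// The message names only bob in To; carol stands in for a Bcc
	// recipient who exists only in the SMTP envelope.
	raw := "From: alice@example.org\r\n" +
		"To: bob@example.com\r\n" +
		"Subject: hello\r\n" +
		"\r\n" +
		"Hi Bob.\r\n"
	for _, rcpt := range []string{"bob@example.com", "carol@example.net"} {
		ok, err := rcptInHeaders(rcpt, raw)
		fmt.Println(rcpt, ok, err)
	}
}
```

Both recipients could legitimately receive this message over SMTP; only one of them is visible in the headers.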

For messages without your address in an addressee header, that could be a useful signal for junk filtering. But mox doesn't currently use it as such.

You could look at the x-mox-reason header and see if there's a hint about why the message wasn't rejected.
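If you'd rather check that header from a script than from a mail client, a small Go sketch with the standard net/mail package could look like this. The helper name `moxReason` and the sample message are made up for illustration; header lookup is case-insensitive, so `x-mox-reason` and `X-Mox-Reason` are the same header.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// moxReason returns the value of the X-Mox-Reason header of a raw message,
// or "" if the header is absent. Hypothetical helper, not part of mox.
func moxReason(raw string) (string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", err
	}
	return msg.Header.Get("X-Mox-Reason"), nil
}

func main() {
	// Sample message made up for this example.
	raw := "x-mox-reason: no-bad-signals\r\n" +
		"From: sender@example.org\r\n" +
		"To: someone-else@example.com\r\n" +
		"Subject: test\r\n" +
		"\r\n" +
		"body\r\n"
	reason, err := moxReason(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("reason:", reason)
}
```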

mattfbacon commented 5 months ago

Oh, I thought a Bcc was like a forward. That makes sense now. Maybe I can add this as a spam signal. Can you give some pointers?

mjl- commented 5 months ago

Heh, I was looking it up and started writing about where a change might be useful, but it looks like it is actually already implemented: https://github.com/mjl-/mox/blob/2bb4f78/smtpserver/analyze.go#L410

Now the question is: is this working properly, and has it been applied to this message? I.e. has the junk filter threshold been lowered? In that case, the contents apparently were very hammy since the message passed the check. Could you see in the logging if there's anything about the analysis of this message? And does the x-mox-reason header on the message say "no-bad-signals"?
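As a rough illustration of what "lowering the junk filter threshold" means here, consider the Go sketch below. It is not the code from analyze.go: the function names, the 0.5 factor and the probabilities are all assumptions chosen for the example. The idea is only that a message whose recipient is not in To/Cc has to look more ham-like to be accepted.

```go
package main

import "fmt"

// notAddressedFactor is a made-up multiplier for this sketch; the actual
// adjustment in mox may differ.
const notAddressedFactor = 0.5

// effectiveThreshold returns the spam-probability threshold to apply.
// If the recipient is not in To/Cc, the threshold is lowered, so less
// spam probability is tolerated.
func effectiveThreshold(base float64, rcptInToCc bool) float64 {
	if rcptInToCc {
		return base
	}
	return base * notAddressedFactor
}

// accept reports whether a message with the given spam probability
// (0 = ham, 1 = spam) passes the possibly lowered threshold.
func accept(spamProb, base float64, rcptInToCc bool) bool {
	return spamProb < effectiveThreshold(base, rcptInToCc)
}

func main() {
	base := 0.9 // illustrative base threshold
	fmt.Println(accept(0.6, base, true))  // addressed in To/Cc
	fmt.Println(accept(0.6, base, false)) // not addressed, stricter check
	fmt.Println(accept(0.1, base, false)) // not addressed, but very hammy
}
```

Under these assumptions, a message that still gets through with the lowered threshold must have scored as quite hammy, which is the situation described above.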

mattfbacon commented 5 months ago

The reason header says no bad signals. The log level is not high enough to see anything about the specific message. The message itself is very spammy, e.g. it contains keywords like "#1", has high link density, etc.

mjl- commented 5 months ago

If the To/Cc case was hit, it would be logged at Info level. Unless that results in too much logging, I think Info would be a good log level for servers. See https://github.com/mjl-/mox/blob/v0.0.10/smtpserver/analyze.go#L416.

I'm also sometimes getting junk messages, and I can't easily see why they are getting through (and I don't always want to dive into server logs). I'm now planning to add some more details to the X-Mox-Reason header. It's much easier to take a quick look there. Details like the Bayesian filter threshold we arrived at, how many words were used in checking, perhaps scores as well. That will hopefully show a pattern.

mattfbacon commented 5 months ago

I would appreciate that. I also receive these regularly, about once a week, so I will see that as well.