Open mattfbacon opened 5 months ago
The addresses in the To/Cc/(Bcc) headers of a message (in Internet Message Format) can be unrelated to the addresses a message is delivered to (over SMTP, with the MAIL FROM and RCPT TO commands).
If you send an email to someone and Bcc another address, then that other address won't see itself in the message headers either. The same goes for forwarding and mailing lists. So marking a message as junk just because your address isn't an addressee in the header isn't necessarily a good idea.
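To make the envelope/header distinction concrete, here is a minimal sketch in Go that checks whether an SMTP envelope recipient (the RCPT TO address) also appears in the To or Cc header of a message. It is not mox's code: the function name `rcptInHeaders` and the sample addresses are made up for illustration, and comparing whole addresses case-insensitively is a simplification.

```go
package main

import (
	"errors"
	"fmt"
	"net/mail"
	"strings"
)

// rcptInHeaders reports whether rcpt (an SMTP envelope recipient, as given in
// RCPT TO) appears in the To or Cc header of the raw message.
// Hypothetical helper for illustration, not part of mox.
func rcptInHeaders(raw, rcpt string) (bool, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return false, err
	}
	for _, key := range []string{"To", "Cc"} {
		addrs, err := msg.Header.AddressList(key)
		if errors.Is(err, mail.ErrHeaderNotPresent) {
			continue // A missing header is fine, just check the next one.
		}
		if err != nil {
			return false, err
		}
		for _, a := range addrs {
			// Simplification: compare the whole address case-insensitively.
			if strings.EqualFold(a.Address, rcpt) {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	// The message names only bob in To. A Bcc'd recipient such as carol still
	// receives it over SMTP, but does not appear in any header.
	raw := "From: alice@example.org\r\n" +
		"To: bob@example.org\r\n" +
		"Subject: hi\r\n" +
		"\r\n" +
		"body\r\n"
	for _, rcpt := range []string{"bob@example.org", "carol@example.net"} {
		ok, err := rcptInHeaders(raw, rcpt)
		fmt.Println(rcpt, ok, err)
	}
}
```

A `false` result here describes legitimate Bcc, forwarded and mailing list deliveries just as well as junk, which is why it can only ever be a hint.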
For messages without your address in an addressee header, that could be a useful signal for junk filtering. But mox doesn't currently use it as such.
You could look at the X-Mox-Reason header and see if there's a hint as to why the message wasn't rejected.
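If you have the raw message at hand, a small Go sketch like the following can pull out that header. The helper name `moxReason` and the sample message are assumptions for illustration. Only the header name and the value "no-bad-signals" come from this thread.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// moxReason returns the value of the X-Mox-Reason header of a raw message,
// or the empty string if the header is absent.
// Hypothetical helper for illustration, not part of mox.
func moxReason(raw string) (string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", err
	}
	return msg.Header.Get("X-Mox-Reason"), nil
}

func main() {
	// Made-up sample message carrying the header value discussed here.
	raw := "From: sender@example.org\r\n" +
		"To: someone@example.net\r\n" +
		"X-Mox-Reason: no-bad-signals\r\n" +
		"\r\n" +
		"body\r\n"
	reason, err := moxReason(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("reason:", reason)
}
```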
Oh, I thought a Bcc was like a forward. That makes sense now. Maybe I can add this as a spam signal. Can you give some pointers?
Heh, I was looking this up and had started writing about where a change might be useful, but it looks like it is actually already implemented: https://github.com/mjl-/mox/blob/2bb4f78/smtpserver/analyze.go#L410
Now the question is: is this working properly, and has it been applied to this message? I.e. has the junk filter threshold been lowered? In that case, the contents apparently were very hammy since the message passed the check. Could you see in the logging if there's anything about the analysis of this message? And does the x-mox-reason header on the message say "no-bad-signals"?
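To illustrate why a message can still pass with a lowered threshold, here is a rough sketch of the idea in Go. It is not the code in analyze.go: the function names, the base threshold of 0.95 and the halving factor are all assumptions chosen for the example.

```go
package main

import "fmt"

// effectiveThreshold returns the junk threshold to use for a message. When the
// envelope recipient is not in To/Cc, the threshold is lowered so that less
// spammy-looking content is already treated as junk.
// The factor 0.5 is a hypothetical value, not taken from mox.
func effectiveThreshold(base float64, rcptInToCc bool) float64 {
	if rcptInToCc {
		return base
	}
	return base * 0.5
}

// isJunk reports whether a spam probability from the content filter exceeds
// the threshold.
func isJunk(spamProb, threshold float64) bool {
	return spamProb > threshold
}

func main() {
	const base = 0.95 // hypothetical configured threshold

	// Moderately spammy content: passes normally, rejected when the recipient
	// is missing from To/Cc.
	fmt.Println(isJunk(0.6, effectiveThreshold(base, true)))
	fmt.Println(isJunk(0.6, effectiveThreshold(base, false)))

	// Content the filter considers very hammy passes even the lowered threshold.
	fmt.Println(isJunk(0.1, effectiveThreshold(base, false)))
}
```

In this sketch the last case corresponds to the situation asked about above: the threshold was lowered, but the content scored so low that the message was accepted anyway.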
The reason header says no bad signals. The log level is not high enough to see anything about the specific message. The message itself is very spammy, e.g. it contains keywords like "#1", has high link density, etc.
If the To/Cc case was hit, it would be logged at Info level. Unless that would result in too much logging, I think Info would be a good log level for servers. See https://github.com/mjl-/mox/blob/v0.0.10/smtpserver/analyze.go#L416.
I'm also sometimes getting junk messages, and I can't easily see why they are getting through (and I don't always want to dive into server logs). I'm now planning to add some more details to the X-Mox-Reason header, since it's much easier to take a quick look there. Details like the bayesian filter threshold we arrived at, how many words were used in checking, and perhaps scores as well. That will hopefully show a pattern.
I would appreciate that. I also receive these regularly, about once a week, so I will see that as well.
I get some emails like this:
Why? Shouldn't these be rejected by the mail server? And how can I make the mail server reject them?