vexim / vexim2

Virtual Exim 2
Other
70 stars 47 forks source link

Do not add spam headers during ACL phase so we don't have to remove them later #203

Closed rimas-kudelis closed 8 years ago

rimas-kudelis commented 8 years ago

Instead, persist $spam_* variables as $acl_m_spam_* (except $spam_score_int, which persists by itself), and use these variables to add headers at delivery time. This will add X-Spam-Score, X-Spam-Score-Int and X-Spam-Bar headers regardless of whether or not the message is spam, but this can be easily changed.

Comments welcome.

rimas-kudelis commented 8 years ago

I did some whitespace shuffling in vexim-acl-check-content.conf, so you may want to use ?w=0 when looking at the diff. Sorry. I can split this into two commits, but these whitespace changes aren't that big after all, so I was like "meh, nevermind". :)

Udera commented 8 years ago

And you forgot to add spam-headers in the router ditch_spam where this is of special interest to debug false positives.

rimas-kudelis commented 8 years ago

Indeed, added them now.

rimas-kudelis commented 8 years ago

PR updated

rimas-kudelis commented 8 years ago

Huh, I just realized this is 99% same as @Z3po's proposal in #102... :)

rimas-kudelis commented 8 years ago

I've added a new commit to the PR, beware. :)

Udera commented 8 years ago

I started testing (only mailbox, no forwards yet). I'm not sure when we should add the X-Spam-Flag: YES. We have the sa_tag (default: 2) and the sa_refuse (default: 5), currently we are adding it for all score > sa_tag.

Spamassassin itself uses a threshold of 5. A mail with score 3:

Aug 28 23:31:46 freebsd spamd[52421]: spamd: clean message (3.0/5.0) for spamd:58 in 0.5 seconds, 483 bytes. 

And if you feed spamassassin the mail:

Return-path: <mail@example.org>
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on freebsd
X-Spam-Level: ***
X-Spam-Status: No, score=3.0 required=5.0 tests=D_TAG autolearn=no
    autolearn_force=no version=3.4.1
....

If I take the wording literally: score > sa_refuse -> mail server refuses mail at delivery sa_refuse > score > sa_tag -> mail server moves mail in spam folder score < sa _tag -> mail is delivered

How I use it currently: score > sa_refuse -> mail server moves mail in spam folder sa_refuse > score > sa_tag -> mail is delivered, the spamassassin tests are added in the header score < sa _tag -> mail is delivered

rimas-kudelis commented 8 years ago

SpamAssassin threshold of 5 is just a default setting as well. Since our virtual users may have different tag and refuse numbers, and Exim cannot pass any of that information to SpamAssassin, we get to ignore whether or not SA thinks an email is actually spam, and instead have to rely on just the score. So it is quite possible to have X-Spam-Flag: YES in combination with X-Spam-Report: No... or vice-versa with Vexim, due to the way SA protocol currently works.

About the way things should work with regard to our settings: until moving to Spam folder was introduced, it was easy: score > sa_refuse -> mail was disarded, score > sa_tag -> mail was marked by adding X-Spam-* headers. Now the situation is different, and indeed it's not so self-evident anymore. Maybe we should rethink our strategy. One of the options that comes to mind is dropping sa_refuse for good, and instead using sa_tag as a threshold of delivering the message into the Spam folder. We could also have a boolean setting which would control whether or not X-Spam-* headers must be added to all messages. The UI could look somewhat like this:

Mark as Spam score                    [________]
   Messages scoring the amount of points specified above or more will be delivered 
   to the user's Spam folder.
   NOTE: regardless of this setting, the server will refuse delivery of messages
   scoring 15 or more points.

Put Spam info headers in all messages [_]
   If checked, X-Spam-Score and other related headers will be present in all messages
   delivered locally. Otherwise, most of these headers will only be present in messages
   considered as spam according to the setting above.

All of this doesn't block this PR though, so I suggest we take this part of discussion elsewhere

Udera commented 8 years ago

I did some further testing. When you use forwarding addresses, and you activate spam checks on the forwarding address and the mailbox where it is finally delivered, identical headers are added. I tried to set a condition, that headers are only added when not present (def:h_x-spam-flag) but it seems not to work on headers that were added in the router-section.

I haven't tried yet to use own variables in that part. Or we don't add headers when the forwarding address is on the server as well (how are we going to check this, what if there is a chain of forwards?

Udera commented 8 years ago

@rimas-kudelis also good ideas. Why using two spam-levels at all. But we can put this in a different issue.

rimas-kudelis commented 8 years ago

I did some further testing. When you use forwarding addresses, and you activate spam checks on the forwarding address and the mailbox where it is finally delivered, identical headers are added. I tried to set a condition, that headers are only added when not present (def:h_x-spam-flag) but it seems not to work on headers that were added in the router-section.

I haven't tried yet to use own variables in that part. Or we don't add headers when the forwarding address is on the server as well (how are we going to check this, what if there is a chain of forwards?

We could move headers_add statements to the transports then. Although given that both headers would be identical, I guess it's not a big issue either way.

rimas-kudelis commented 8 years ago

@rimas-kudelis also good ideas. Why using two spam-levels at all. But we can put this in a different issue.

Another option: keep two levels, but make sa_tag mean the level at which the message is moved into the spam folder, and sa_drop mean what it meant previously – the level at which the message is discarded. To disable any of these two actions, the user could set the threshold to a ridiculously high value (e.g. 999). Or perhaps zero. Or we could have checkboxes next to them. Oh wait, we already have a checkbox controlling both of these... I suppose this should be enough then.

Udera commented 8 years ago

I'd prefer to reject the message. We discussed this before, so with several recipient, we would need to calculate the max of all of the recipient sa_refuse-score. We already started to do some exim-magic ...

On 2016-09-01 16:04, Rimas Kudelis wrote:

@rimas-kudelis [1] also good ideas. Why using two spam-levels at all. But we can put this in a different issue.

Another option: keep two levels, but make sa_tag mean the level at which the message is moved into the spam folder, and sa_drop mean what it meant previously – the level at which the message is discarded. To disable any of these two actions, the user could set the threshold to a ridiculously high value (e.g. 999). Or perhaps zero. Or we could have checkboxes next to them.

You are receiving this because you commented. Reply to this email directly, view it on GitHub [2], or mute the thread [3].

*

Links:

[1] https://github.com/rimas-kudelis [2] https://github.com/vexim/vexim2/pull/203#issuecomment-244088789 [3] https://github.com/notifications/unsubscribe-auth/AGCFFDypdcyrkfjFKxo1KWLVXZ7qs2Kgks5qlttQgaJpZM4JoLX7

rimas-kudelis commented 8 years ago

Rejecting by the maximum threshold for all users is cool. The bigger question is what to do when the score is less than max, but above user's threshold. We used to drop such messages, and I think it still makes sense to let the user choose to drop them.

Udera commented 8 years ago

We could move headers_add statements to the transports then. Although given that both headers would be identical, I guess it's not a big issue either way.

That would only add X-Spam-* headers to mailboxes and not on forwarded mails.

I found a different solution, I used this router (spam-router would also need a rework):

  headers_add = ${if and { \
                    {match{$domain}{$original_domain}} \
                    {match{$local_part}{$original_local_part}} \
                    {>{$spam_score_int}{${extract{sa_tag}{$address_data}}}} \
                    {eq{1}{${extract{on_spamassassin}{$address_data}}}} \
                    } {X-Spam-Flag: YES\nX-Spam-Scoree: $acl_m_spam_score\nX-Spam-Report: $acl_m_spam_report}{} }

Now the settings of the original incoming email address are preferred.

rimas-kudelis commented 8 years ago

I don't think we want X-Spam-* headers in forwarded emails, for exactly same reason as we don't want them in emails we get.

rimas-kudelis commented 8 years ago

That said, I still prefer setting these headers in the router, not transport. :+1:

Udera commented 8 years ago

We could put in a condition that they are only added for local accounts and not for redirects.

Udera commented 8 years ago

I don't think we want X-Spam-* headers in forwarded emails, for exactly same reason as we don't want them in emails we get.

Question is, if it would be interesting to have different spam-limits on local redirects. For external mails we would need to remove all these headers (or block the mail if it is considered spam).

rimas-kudelis commented 8 years ago

I've updated the PR. Squashed most commits, added changes to headers_add, turned the name of the spam report header into a macro. My editor cleaned up some whitespace in process, I've put these changes into a separate commit.

Udera commented 8 years ago

Good. From my POV we can merge it. Or do you want to finish the discussion about the final spamassassin implementation?

rimas-kudelis commented 8 years ago

No, I'm good.