fyiorgnz / alaveteli

OIA/LGOIMA (Freedom of Information) request system: New Zealand fork of mysociety/alaveteli
https://fyi.org.nz/

Stop backscatter from Alaveteli #8

Open olineham opened 5 years ago

olineham commented 5 years ago

The default behaviour of Alaveteli is to send fake bounces in reply to mail addressed to closed requests. (Fake because Alaveteli has no way to know the true sender.) In other words, backscatter. This is unacceptable to us as it ruins our mail reputation and hence overall deliverability (as well as making us bad internet citizens).

In the short term this has been eliminated by updating all requests in the database from bounce (the column default) to holding_pen, at the expense of a large increase in mail to deal with in the holding pen.
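For anyone wanting to picture the short-term change, here is a minimal sketch using an in-memory SQLite database. The table name `info_requests` and column name `handle_rejected_responses` are assumptions for illustration, as is the helper `stop_fake_bounces`; this is not the exact statement that was run on the production database.

```python
import sqlite3

# Hypothetical, simplified stand-in for the requests table; the table and
# column names are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE info_requests (
        id INTEGER PRIMARY KEY,
        handle_rejected_responses TEXT NOT NULL DEFAULT 'bounce'
    )
    """
)

# Three requests take the column default; one is already set to holding_pen.
conn.executemany("INSERT INTO info_requests (id) VALUES (?)", [(1,), (2,), (3,)])
conn.execute(
    "INSERT INTO info_requests (id, handle_rejected_responses) "
    "VALUES (4, 'holding_pen')"
)


def stop_fake_bounces(db):
    """Switch every request that would fake-bounce over to the holding pen.

    Returns the number of rows changed.
    """
    cur = db.execute(
        "UPDATE info_requests SET handle_rejected_responses = 'holding_pen' "
        "WHERE handle_rejected_responses = 'bounce'"
    )
    return cur.rowcount


updated = stop_fake_bounces(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM info_requests "
    "WHERE handle_rejected_responses = 'bounce'"
).fetchone()[0]
print(updated, remaining)
```

In this sketch the column default is left untouched, so newly created rows would still start as bounce. A one-off update like this would need repeating, or the default changing, to keep covering new requests.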

There are a few ways we could deal with this in the long term, to be discussed below.

See also upstream issue https://github.com/mysociety/alaveteli/issues/217

olineham commented 5 years ago

I think it's worth outlining high level requirements. Feel free to edit this. See also comment.

R1. We must not backscatter

R2. Real responses must never be lost silently

They must be:

R3. The holding pen volume must be manageable.

Sending everything to holding_pen instead of fake bouncing is not manageable.

Note: Invalid recipient aliases (e.g. no such request ID or invalid hash) may be routed to holding_pen (like today) or rejected by the MTA, but this is outside the scope of this issue as we never did fake-bounce these. See #9. Most of these are from spammers who mis-OCR'd a PDF, but a significant number are typos by authorities.

olineham commented 5 years ago

Sources of spam and anti-spam measures

Since approximately 2016, spammers have increasingly harvested email addresses from image PDFs using OCR. This is now the source of the majority of our spam.

To a lesser extent, spammers have got hold of our addresses through compromise of authority systems. Prior to about 2016 this was the more common source of our spam, so spam was infrequent and the holding pen was an acceptable approach.

We should consider whether completely different approaches to anti-spam might be effective. For example, DNSBLs could reject the worst mail at SMTP time, and SpamAssassin scores could help decide whether to route a message to the real request or to the holding pen.
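To make the idea above concrete, here is a rough sketch of how a DNSBL lookup name is formed and how a score could feed a routing decision. The function names, the zone `dnsbl.example`, the outcome labels and the threshold are all hypothetical. The threshold of 5.0 mirrors SpamAssassin's default required score but is only an assumption here. Treating every message for a closed request as holding-pen mail is a placeholder, not a proposal.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone.

    DNSBLs are queried by reversing the octets of the connecting IP and
    appending the list's zone.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def route_message(dnsbl_listed, spam_score, request_open, spam_threshold=5.0):
    """Pick a destination for one incoming message (hypothetical policy).

    - Listed senders are refused during the SMTP conversation, so no
      separate bounce message is ever generated by us.
    - Low-scoring mail for an open request goes to the request.
    - Everything else is held for a human to look at.
    """
    if dnsbl_listed:
        return "reject_at_smtp"
    if request_open and spam_score < spam_threshold:
        return "deliver_to_request"
    return "holding_pen"


print(dnsbl_query_name("192.0.2.1", "dnsbl.example"))
print(route_message(dnsbl_listed=False, spam_score=1.2, request_open=True))
```

The point of splitting the decision this way is that only the SMTP-time rejection avoids both backscatter and holding pen load. The score only decides which of the two queues a message lands in.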

olineham commented 5 years ago

Regarding R2: rarely, but importantly, a very old request receives a real response after eventual intervention by the Ombudsman. It's important that these responses either end up in holding_pen or produce an effective bounce message to the sender.

Regarding R2 (ii): I'm not sure how we can send a full and readable explanation of the situation at SMTP time. At best we'd be able to send something in a 5.5.4 response, but would this communicate clearly enough to the authority that they need to contact us to ask for the address to be re-opened?
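One option that might give a little more room is a multi-line SMTP reply, where every line but the last uses a hyphen after the reply code. The sketch below only formats such a reply as text. The helper name, the reply code 550, the enhanced code 5.7.1, the wording and the contact address are all placeholders, not a recommendation. Whether the authority's staff actually see this text depends on how their mail system presents the resulting bounce.

```python
def smtp_rejection(code, enhanced, lines):
    """Format a (possibly multi-line) SMTP reply as it appears on the wire.

    Continuation lines use "code-" and the final line uses "code ", with
    the enhanced status code repeated on every line.
    """
    if not lines:
        raise ValueError("at least one line of text is required")
    out = []
    for i, text in enumerate(lines):
        sep = " " if i == len(lines) - 1 else "-"
        out.append(f"{code}{sep}{enhanced} {text}\r\n")
    return "".join(out)


# Placeholder wording and contact address, for illustration only.
reply = smtp_rejection(
    550,
    "5.7.1",
    [
        "This request address is closed to new responses.",
        "To have it re-opened, contact admin@example.org.",
    ],
)
print(reply, end="")
```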

nigeljonez commented 5 years ago

Assigning this. The target is to get a proper solution running when we update our instance.

Of note: