DemocracyClub / WhoCanIVoteFor

🗳 The source for https://whocanivotefor.co.uk/

Substitute akismet package and reconfigure spam filtering #2046

Closed VirginiaDooley closed 1 month ago

VirginiaDooley commented 1 month ago

In this PR, I've swapped the python-akismet spam filtering package for akismet with the (so far unproved) theory that a package that is maintained more regularly/recently would help filter out increasingly sophisticated spam submissions.


coveralls commented 1 month ago

coverage: 58.192% (-0.01%) from 58.204% when pulling 92ab335b293004abdc01910f40b9fff13c23ae70 on hotfix/improve-spam-filtering into 0252b7c57c1376616ce6defb0672c0b6b6e1d14c on master.

symroe commented 1 month ago

I don't know why changing the client library would change anything, unless we're using a different API version.

However, if this is better maintained then we should switch to it :+1:

When reading this I did think of one thing. The IP address is the only required field, suggesting they weight that highly as a way to detect spam.

We're currently using self.request.META["REMOTE_ADDR"] to give Akismet the IP address. However, in a deployed environment this will actually be the IP address of whatever connected directly to the app, which means the ALB (or maybe CloudFront, if the ALB spoofs its own IP address, I don't know).

I think we want to change this to:

self.request.META['HTTP_X_FORWARDED_FOR'].split(",")[0].strip()

This means we always take the IP of the client that connected to CloudFront, not the IP of the directly connecting client (ALB or CloudFront). The ALB or CloudFront address will be a lot more trusted, I would have thought, or at least we will be building up trust with it over time as we mark some comments from it as not spam.
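
As a rough sketch of that change, the lookup could live in a small helper that falls back to REMOTE_ADDR when the header is missing (for example in local development, where a direct META['HTTP_X_FORWARDED_FOR'] lookup would raise KeyError). The name get_client_ip is hypothetical and not something in this codebase; it takes a plain dict-like object such as request.META, so it can be tried without Django.

```python
def get_client_ip(meta):
    """Return the best-guess client IP from a request.META-style mapping.

    Hypothetical helper for illustration: prefers the first entry of
    X-Forwarded-For and falls back to REMOTE_ADDR if the header is
    absent or empty.
    """
    forwarded_for = meta.get("HTTP_X_FORWARDED_FOR", "")
    # The header is a comma-separated list; the first entry is the
    # address the outermost proxy reported for the original client.
    first_hop = forwarded_for.split(",")[0].strip()
    if first_hop:
        return first_hop
    return meta.get("REMOTE_ADDR", "")


# Example with documentation-range addresses:
behind_proxy = {
    "HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.1",
    "REMOTE_ADDR": "10.0.0.2",
}
print(get_client_ip(behind_proxy))
```

In the view this would be called as get_client_ip(self.request.META). One caveat: the first X-Forwarded-For entry can be set by the client itself, so it is fine as a spam signal but should not be relied on for anything security-sensitive.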