Open posativ opened 11 years ago
You could give bogofilter a try as well.
What about using an external service for this (like akismet)?
External services can be implemented if they are not required for Isso (Akismet is a US service). Nevertheless, spam filtering should by default not rely on any third-party provider (as it defeats the purpose of self-hosting).
Yeah. I meant as an option, not as default (of course).
Perhaps something along the lines of this wordpress plugin http://web-profile.com.ua/wordpress/plugins/anti-spam-pro/
It simply poses a question that you have to answer if JavaScript is disabled; otherwise you don't even notice it.
This plugin is quite useless, as it fails once bots begin to interpret JS (which is not that hard, but requires some computing power, I guess). But similar to the plugin, Isso is currently not affected by spam (my demo site, for example, does not receive any) because most bots are not capable of evaluating JavaScript, and if they are, they hopefully abort the computation because PBKDF2 takes too long.
However, a targeted attack that uses the public API might become an issue someday.
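To get a feel for the cost argument, here is a minimal Python sketch that times a PBKDF2 derivation using only the standard library. The hash function, salt, key length, and iteration counts are illustrative assumptions, not Isso's actual client-side parameters.

```python
import hashlib
import time
from typing import Tuple


def pbkdf2_cost(text: str, iterations: int) -> Tuple[str, float]:
    """Derive a short key with PBKDF2 and report how long it took."""
    start = time.perf_counter()
    digest = hashlib.pbkdf2_hmac(
        "sha1",                # hash function (assumed for illustration)
        text.encode("utf-8"),  # value to hash, e.g. an email address
        b"illustrative-salt",  # hypothetical salt
        iterations,
        dklen=6,               # 6 bytes -> 12 hex characters
    )
    elapsed = time.perf_counter() - start
    return digest.hex(), elapsed


if __name__ == "__main__":
    # More iterations means more work per submitted comment.
    for n in (1_000, 100_000):
        key, seconds = pbkdf2_cost("someone@example.org", n)
        print(f"{n:>7} iterations: {key} in {seconds * 1000:.1f} ms")
```

The work grows linearly with the iteration count, so a bot that evaluates the script pays that cost for every comment it submits, as does every legitimate visitor.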
I’m considering Isso (or Discourse) as an option to enable comments again on my Pelican blog.
I agree with @posativ that because of the matter of trust, this should be self-hosted. What I’m still worried about is how much a spam filter would hammer my poor little ARM server.
The reason I disabled comments (and moved away from the otherwise very nice Habari) is that spammers would effectively DoS my server, since the spam filter (Bayesian + honeypot) would just consume way too much CPU time.
If spam filtering can be done in a not too expensive local way or in a distributed way that can be trusted, I would be very happy to have comments (very likely with Isso) enabled again.
I am not aware of any (real) spam, neither in my personal blog nor in the demo. @noqqe reported that he didn't receive a single spam comment in over a year, too.
That's probably because Isso is still quite unknown and its comment form is written in JavaScript rather than shipped as a pre-rendered HTML snippet. Or the JS interpreter of typical spam robots is broken.
Hi, I'm interested in contributing spam filtering. Captcha systems aren't enough, since there are paid people doing manual spam, not only bots (personally, on my old blog I received a lot of spam, sometimes coming from real humans).
As for the spam filtering system, we could use "support vector machines" instead of Bayesian filters. They are relatively efficient after the training phase, so DoS attacks are unlikely to succeed.
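To illustrate why classification is cheap once training is done: a linear model only has to sum a few weights per comment. The sketch below assumes a linear decision function over word counts. The weights and bias are invented for illustration and do not come from a real training run, and no actual SVM library is involved.

```python
import re

# Hypothetical weights, as if produced by an earlier training phase.
# Positive values push toward "spam", negative values toward "ham".
WEIGHTS = {
    "casino": 2.0,
    "cheap": 1.2,
    "viagra": 2.5,
    "thanks": -1.0,
    "article": -0.8,
}
BIAS = -0.5


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def spam_score(text):
    """Linear decision function w.x + b over word counts."""
    return BIAS + sum(WEIGHTS.get(tok, 0.0) for tok in tokenize(text))


def is_spam(text):
    """Classify as spam when the score is positive."""
    return spam_score(text) > 0.0
```

Scoring a comment is one dictionary lookup per token, so the per-comment cost stays small even on a low-powered server. The expensive part, training, can run offline.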
Lately my site (running Isso) has started to receive quite the onslaught of spam. I don't know how much is human-posted and how much is from JavaScript-aware automation stuff, but either way, some of them are even gloating about "easiest captcha ever" in their comments, as if their comment is going to be seen by the public or would be indexed by a search engine. (Also, why the hell haven't bots figured out that everyone has used rel="nofollow" for like 20 years?)
Anyway. Yeah. A plugin system would be great. I'm willing to spend some time working on one.
> External services can be implemented if they are not required for Isso (Akismet is a US service). Nevertheless, spam filtering should by default not rely on any third-party provider (as it defeats the purpose of self-hosting).
It would be great to have a plugin system to integrate with third parties. As an alternative to Akismet, there is OOPSpam, which is GDPR compliant.
> Anyway. Yeah. A plugin system would be great. I'm willing to spend some time working on one.
@fluffy-critter did you eventually come up with something?
I haven't had time/energy to work on anything, unfortunately.
> I haven't had time/energy to work on anything, unfortunately.
No worries, I was just curious. This is not too important anyway.
In general, a plugin API would be neat to have. My idea would be to extend the signals system so that a new comment triggers spam detection.
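As a rough sketch of that idea, a signal registry could let plugins inspect a comment before it is saved. Everything here is hypothetical: the `Signal` class, the `comment:before-save` signal name, the `mode` field, and the keyword plugin are made up for illustration and are not Isso's actual API.

```python
from collections import defaultdict


class Signal:
    """Tiny publish/subscribe registry (hypothetical, for illustration)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, name, handler):
        """Register a handler for the named signal."""
        self._handlers[name].append(handler)

    def emit(self, name, comment):
        """Call handlers in registration order, passing the comment dict."""
        for handler in self._handlers[name]:
            handler(comment)


def keyword_filter(comment):
    """Example plugin: hold comments containing blocked words for moderation."""
    blocked = {"casino", "viagra"}
    words = set(comment["text"].lower().split())
    if words & blocked:
        comment["mode"] = "pending"


signals = Signal()
signals.subscribe("comment:before-save", keyword_filter)


def save_comment(text):
    """Build a comment, let plugins inspect it, and return the result."""
    comment = {"text": text, "mode": "accepted"}
    signals.emit("comment:before-save", comment)
    return comment
```

With this shape, a local classifier or an optional third-party service would both be just another subscriber, so nothing external is required by default.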
Hey guys: consider PoW as a simpler means of spam filtering:
* https://git.sequentialread.com/forest/pow-captcha
* https://mcaptcha.org/
Those two look interesting, but heads up, they require modern browsers and wasm support.
Also I'm not sure what problem that actually solves, beyond making sure someone's idle on a page for a certain amount of time before they submit. Most of the spam I get appears to be submitted by humans who are paid money to defeat CAPTCHAs, as has been the case with most comment spam for at least the past decade.
It would cut down on some spam, probably not all. You could most likely increase the difficulty in the settings, etc. It's simple to set up. There are tradeoffs with everything. If you really wanted to prevent all spam (and also some legitimate comments), you could charge micropayments over the Lightning Network. :P
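For readers unfamiliar with PoW, here is a minimal hashcash-style sketch in Python. It is a generic SHA-256 scheme written for illustration and does not reproduce what pow-captcha or mCaptcha actually do. The function names and the `difficulty` parameter (number of leading zero bits required) are assumptions.

```python
import hashlib
import os
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash checks the client's work."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force nonces until one meets the difficulty."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    challenge = os.urandom(16)  # fresh random challenge per form load
    nonce = solve(challenge, difficulty=16)
    print(nonce, verify(challenge, nonce, 16))
```

Each extra bit of difficulty roughly doubles the expected work for the client, while verification on the server stays at one hash. That asymmetry is the tradeoff being discussed: it raises the cost of bulk automated posting but does nothing against paid human spammers.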
Hi, I just heard about this project and it looks nice! In the past I made my own project that was similar, and I created git.sequentialread.com/forest/pow-captcha for it. If you would like to try it out, it's hosted here:
https://sequentialread.com/now-with-comments/#sqr-comment-container
I have also seen spam from humans. There were some SEO spammers trying to register accounts on our Gitea server and post links to their clients' businesses. We were not able to stop them until we implemented a required invite token for registration :(
To be honest I'm not sure what to do about that kind of spam besides putting the comments into a moderation queue and having someone look at them.
My "pow-captcha" is not actually a captcha at all; I think it's just a bot deterrent. Unfortunately, it's also a deterrent for people who run customized browsers with anti-fingerprinting or with new features disabled. I use it as a bot deterrent for other things, and I have seen my friends blocked by it because the privacy browser they use on their phone would not allow WebWorkers etc. :(
I think unfortunately spam is always going to be impossible to stop automatically with high accuracy. Maybe for the next version of my site I will try out isso for comments but make it skip the moderation queue if the browser was able to solve the pow challenge.
> making sure someone's idle on a page for a certain amount of time before they submit.
The animated GIF in the README is sort of intentionally slowed down: it was recorded with a higher difficulty setting than the one I use on my site. I think you would have to pull out a cell phone from 8-10 years ago to see it go that slow on my site. On a new computer it's so fast that it can barely even render the progress bar before it's done.
Yeah there's no way to automatically get rid of all spam, but it's still nice to have a means of being able to classify things to apply different moderation policies to them, and possibly be able to specifically whitelist known-good posters so their comments go up immediately.
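A policy layer like that could look roughly like the sketch below. It combines the ideas from this thread (a whitelist, a classifier score, and whether the browser solved a PoW challenge) into one decision. All names, fields, and thresholds are hypothetical and only meant to show the shape of such a function.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    author_hash: str   # hypothetical stable identifier for the poster
    solved_pow: bool   # whether the browser solved a PoW challenge
    spam_score: float  # 0.0 (clean) .. 1.0 (spam), from any classifier


def moderation_policy(sub, trusted, reject_above=0.9):
    """Return 'publish', 'queue', or 'reject' for a submission."""
    if sub.author_hash in trusted:
        return "publish"  # whitelisted posters go up immediately
    if sub.spam_score >= reject_above:
        return "reject"   # near-certain spam is dropped outright
    if sub.solved_pow and sub.spam_score < 0.5:
        return "publish"  # PoW solved and low score: skip the queue
    return "queue"        # everything else waits for a human
```

For example, `moderation_policy(Submission("abc", False, 0.2), trusted={"abc"})` publishes immediately, while the same submission from an unknown poster lands in the queue.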
It's surprisingly difficult to find usable general-purpose spam filter software. Here is what's currently available:
Probably DIY: http://crm114.sourceforge.net/docs/classify_details.txt