rrrene / elixirstatus-web

Community site for Elixir project/blog post/version updates
http://elixirstatus.com
MIT License
282 stars 44 forks

Blacklist against spam #53

Open PhillippOhlandt opened 7 years ago

PhillippOhlandt commented 7 years ago

In the last few days, there have been some spam posts on elixirstatus. Someone on Twitter suggested banning all posts that link to a certain domain (all the spam posts linked to articles on that domain).

Example spam post from a few mins ago:

Want To Hire Best Web And App, Developers? Hire MEAN Stack Developers
Misc Today by  jimmiewilliams |  Retweet this
In the evolving world of full stack development, MEAN stack development is the first option that comes in the mind of startups and enterprises of US, UK, Europe, Australia, Canada and many other leading countries. hiring MEAN Stack Development Company India is a trending choice for the web & mobile app development today. So, are you aware of the nitty-gritty of MEAN stack? Keep reading the blog and at the end, you will be completely aware of all the aspects of MEAN stack Development Solutions.

Check out this Blog: https://excellentwebworld.com/hire-mean-stack-developers/

I think it would make sense to introduce a blacklist feature so spammers are more limited in the future.

rrrene commented 7 years ago

I am absolutely :+1: this.

As a first measure, I would block statuses containing certain URLs.

WDYT?

rhnonose commented 7 years ago

A report button on the issue page, maybe? You can trick blacklists with link shorteners.

PhillippOhlandt commented 7 years ago

Ok, there are several ways we can approach this.

We could create a host-based blacklist. Then we can also easily block link shorteners if needed. I think the normal user wouldn't mind if link shorteners are not allowed.
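A minimal sketch of what a host-based check could look like, assuming a hardcoded set of blocked hosts. The names `BLOCKED_HOSTS` and `is_blocked` are hypothetical and not taken from the elixirstatus codebase; the one blocked domain is the one from the example spam post.

```python
from urllib.parse import urlsplit

# Hypothetical hardcoded blocklist; the entry is the domain from the example post.
BLOCKED_HOSTS = frozenset({"excellentwebworld.com"})


def is_blocked(url, blocked_hosts=BLOCKED_HOSTS):
    """Return True if the URL's host, or any parent domain of it, is blocked."""
    # URLs without a scheme have no parsed hostname and are treated as not blocked.
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    # Check "www.example.com", then "example.com", then "com".
    return any(".".join(labels[i:]) in blocked_hosts for i in range(len(labels)))
```

Matching on parent domains means `www.excellentwebworld.com` is caught too, while an unrelated host that merely ends in the same characters is not.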

We could also add some content filtering (in the future).

The question is, would that all be hardcoded and adjusted via PRs or does it need a web interface with something like a voting system? A hardcoded blacklist might be fine for the start and if there is demand, a web interface could be added.

I like @rhnonose's idea of a report button: once a certain threshold is reached, the post is hidden, blocked, or deleted. But that would need some sort of safeguard so it can't be abused. Alternatively, it could just fire off an email to @rrrene once the threshold is reached, with removal remaining a manual step (BTW, I would volunteer as moderator).

rrrene commented 7 years ago

I think a blacklist is a great first step and I agree that it should not be a static one and that we should think about the link-shortener problem (but we can simply follow the 302s to solve that I think).
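Following the redirects could be sketched roughly like this. It is an illustration only: `resolve_final_url` and `head_redirect` are hypothetical names, the hop limit is an arbitrary assumption, and the lookup is injectable so the hop-following logic can be exercised without network access.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head_redirect(url):
    """Issue one HEAD request; return the Location header on a redirect, else None."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.hostname, parts.port, timeout=5)
    try:
        path = (parts.path or "/") + ("?" + parts.query if parts.query else "")
        conn.request("HEAD", path)
        response = conn.getresponse()
        if response.status in REDIRECT_CODES:
            return response.getheader("Location")
        return None
    finally:
        conn.close()


def resolve_final_url(url, get_redirect=head_redirect, max_hops=5):
    """Follow redirects hop by hop and return the final URL."""
    seen = {url}
    for _ in range(max_hops):
        location = get_redirect(url)
        if location is None:
            return url
        url = urljoin(url, location)  # Location headers may be relative
        if url in seen:
            raise ValueError("redirect loop")
        seen.add(url)
    raise ValueError("too many redirects")
```

The resolved URL, rather than the shortened one, would then be checked against the blacklist.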

That said, I think an additional "report button" is a good idea as well!

bruce commented 7 years ago

I'd suggest not blocking link shorteners (identifying them is its own problem, and they're in common use due to character limits); follow the redirect as @rrrene suggests, and I think the reporting feature is a great idea as well (esp if it flags for human moderation, not insta-ban, which can be abused).
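A report button that flags for human moderation instead of removing a post might look like the sketch below. The `Post` shape, the `report` function, and the threshold of 3 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

REPORT_THRESHOLD = 3  # hypothetical value; the thread does not name one


@dataclass
class Post:
    id: int
    reporters: set = field(default_factory=set)
    needs_review: bool = False


def report(post, reporter_id, threshold=REPORT_THRESHOLD):
    """Record a report; return True only when this report newly flags the post."""
    post.reporters.add(reporter_id)  # a set counts each user at most once
    if not post.needs_review and len(post.reporters) >= threshold:
        post.needs_review = True  # flag for a moderator; nothing is deleted here
        return True  # the caller could notify a moderator at this point
    return False
```

Counting distinct reporters keeps a single account from pushing a post over the threshold, and leaving the final decision to a moderator avoids the insta-ban abuse mentioned above.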

rrrene commented 7 years ago

There's now a rudimentary system in place, which allows us to block URLs.

Next step will be a reporting button, which will flag a post for human moderation!

PhillippOhlandt commented 7 years ago

I got another idea. Could we somehow validate the user's GitHub account? All spam accounts are quite new (a few months to a year) and have no contributions except the sign-up event on GitHub itself. Maybe we can combine this check with a check for the word "elixir" in the post body.

I know, it's not that accurate but posts that match those patterns could be added with a moderation flag and a moderator needs to publish them manually. That way we don't exclude false-positives but prevent the spam from ever being published. And the spammers probably get bored by it and give up.
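One way to read this heuristic, as a sketch: hold a post for moderation only when all three signals agree. The `Author` fields, the one-year cutoff, and the choice to require all three signals are assumptions; how the account data would be fetched from GitHub is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Author:
    github_created: date        # account creation date
    public_contributions: int   # contributions beyond the sign-up event


def needs_moderation(author, body, today, min_age=timedelta(days=365)):
    """Return True if the post should wait for a moderator to publish it."""
    new_account = today - author.github_created < min_age
    inactive = author.public_contributions == 0
    off_topic = "elixir" not in body.lower()
    # Assumption: require all three signals, to keep false positives low.
    return new_account and inactive and off_topic
```

Held posts are not rejected, so a false positive costs a delay and not a lost post.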

lawik commented 4 years ago

Recently had some medical spam from https://github.com/riteshpatil36

2 year old account.

Tuxified commented 4 years ago

You could also preemptively "pause" certain content until someone approves? For example, if a post doesn't contain any keywords (Elixir, Erlang, BEAM, Gleam, CRDT, etc.) and is the first post from a user?
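A sketch of that rule, assuming the keyword list given in the comment and a hypothetical `should_hold` function that receives the user's number of earlier posts:

```python
import re

# Keywords from the comment above; a real list would likely be longer.
KEYWORDS = ("elixir", "erlang", "beam", "gleam", "crdt")

# Whole-word match, case-insensitive, allowing a plural "s" (e.g. "CRDTs").
_KEYWORD_RE = re.compile(r"\b(?:" + "|".join(KEYWORDS) + r")s?\b", re.IGNORECASE)


def should_hold(body, previous_post_count):
    """Hold a post for approval if it is a user's first and names no keyword."""
    return previous_post_count == 0 and _KEYWORD_RE.search(body) is None
```

Because both conditions must hold, established users are never paused by a missing keyword.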

rrrene commented 4 years ago

@Tuxified We do, in fact, have a system that goes beyond its humble beginnings described above. And it catches most of the spam the site receives (the spammers never gave up).

I am tweaking the rules and trying to find a good fit. I have been very hesitant about matching on certain keywords and generating false positives ...

liskin commented 4 years ago

Another wave of spam came through just now. :-(

zorn commented 3 years ago

Are blog articles that are really job posts spam? I feel like ElixirStatus is not an appropriate venue to promote Elixir jobs but curious what others think. I saw a few job posts listed as blog posts recently.

rrrene commented 3 years ago

> I feel like ElixirStatus is not an appropriate venue to promote Elixir jobs but curious what others think.

Could you elaborate on this?

I initially thought that small Elixir shops hiring is something that could fit in pretty well at a moderate pace. Do you fear that this will become a spamfest for recruiters?

The incorrect designation as "blog post" is definitely something I will have to work on!

zorn commented 3 years ago

I think the BLOG tag is the main source of my immediate concern. If there was a dedicated JOB / EMPLOYMENT tag then the behavior would align better with my personal expectations.

I personally have mixed feelings on a job post category. Part of me would like to see the feed stay focused on educational content, but I openly acknowledge that the reason many people are consuming this educational content is to eventually leave their non-Elixir job for something they are more excited about -- I do want to see general use of Elixir flourish.

Thanks for all the work you do on the site and listening to my feedback note.

rrrene commented 3 years ago

@zorn I added an "Employment" type:

[image]

Like you, I want to see more Elixir jobs in the world. Let's see how this category develops. If too many recruiters "spam" positions daily, we'll find a solution like limiting the possible number of "employment" posts per user per month (or something like that). :+1:
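If such a limit were ever needed, a per-user, per-month cap could be sketched as follows. `EmploymentLimiter` is a hypothetical name and the limit of 2 is an arbitrary assumption; the in-memory counter stands in for whatever storage the site actually uses.

```python
from collections import Counter
from datetime import date

MAX_EMPLOYMENT_POSTS_PER_MONTH = 2  # hypothetical limit


class EmploymentLimiter:
    """Caps "employment" posts per user per calendar month."""

    def __init__(self, limit=MAX_EMPLOYMENT_POSTS_PER_MONTH):
        self.limit = limit
        self._counts = Counter()

    def try_post(self, user, day):
        """Return True and count the post if the user is under the limit."""
        key = (user, day.year, day.month)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

For example, with `limiter = EmploymentLimiter()`, a third call to `limiter.try_post("acme", date(2021, 5, 20))` in the same month is refused, and the count starts over in June.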