NyaaPantsu / nyaa

Nyaa.se replacement written in golang
MIT License

SPAM Filter System #103

Open ghost opened 7 years ago

ghost commented 7 years ago

There needs to be a SPAM filter before enabling upload, to prevent people and bots from uploading garbage to the site.

Here are some ideas for that, along with the moderation system mentioned on #35

The main problem is that this being an OSS project, you'd need to either:

IMO the second solution is best, while also lending your own version of the filters to other (trusted) people for them to host their version of nyaa.

aerojun commented 7 years ago

I think that before uploading, you should have an account.

sdomi commented 7 years ago

  1. Just as @aerojun wrote, you should have an account before uploading.
  2. Names that contain phrases like "[HorribleSubs]" should be reserved for certain users only (in this case, the official HS account).
  3. Unofficial batches should probably be prohibited.
  4. Someone on IRC suggested that all torrents except those uploaded by trusted users should be checked by a mod, marked as "OK" by them, and only then shown on the website. That would mean a virtually infinite amount of work for mods, but I think we could use this idea somehow.
  5. Maybe the system should check for the anime/manga on MAL or another anime database to make sure it really exists? This would avoid typos as well as fake stuff.
  6. I think we should ban the software and games section, at least for a while. It wasn't that strong anyway and it's kinda dangerous.
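For what it's worth, the reserved-name idea could be checked at upload time with something like the sketch below. This is only an illustration: `reservedTags`, `MayUseName` and the account name are made up for the example, and nothing like this is confirmed to exist in the nyaa codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// reservedTags maps a lowercase release tag to the accounts allowed to use it.
// Both the tag list and the account names are hypothetical.
var reservedTags = map[string][]string{
	"[horriblesubs]": {"HorribleSubs"},
}

// MayUseName reports whether uploader may publish a torrent called name.
// A name is rejected when it contains a reserved tag the uploader does not own.
func MayUseName(name, uploader string) bool {
	lower := strings.ToLower(name)
	for tag, owners := range reservedTags {
		if !strings.Contains(lower, tag) {
			continue
		}
		allowed := false
		for _, owner := range owners {
			if owner == uploader {
				allowed = true
				break
			}
		}
		if !allowed {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(MayUseName("[HorribleSubs] Show - 01 [720p].mkv", "HorribleSubs"))
	fmt.Println(MayUseName("[HorribleSubs] Show - 01 [720p].mkv", "anon"))
}
```

A plain substring match like this is easy to dodge (extra spaces, lookalike characters), so at best it would be a first pass before reports and mods.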

Kuraperunat commented 7 years ago

Suggestions 3-6 on redsPL's list are completely senseless. Requiring an account for uploading + reporting system + mods deal with reports with deletions/bans is enough.

You have all the ingredients in your hands to completely mess up this revival with absurd rules and/or stringent conventions. Please think your solutions through.

sfan5 commented 7 years ago

Agree on having an account, though maybe anon uploads should be considered.

names that contain phrases like "[HorribleSubs]" should be reserved

unofficial batches probably should be prohibited

i think we should ban software and games section

wtf no

all torrents except for those uploaded by trusted users should be checked by a mod

way too much work, doesn't scale

maybe the system should check for the anime/manga on mal

this is a public tracker we don't need (or want!) quality control except for marking torrents as a+/having trusted users

ghost commented 7 years ago

I royally disagree with needing an account just to upload.

This is a public torrent indexer; it lives and thrives on anonymous uploads, just like old Nyaa and Sukebei did, the latter having a higher ratio of anonymous to account uploads than Nyaa. I mean, this is a combined effort by the anons of both /a/ and /g/ after all.

With the proper SPAM filters plus the Report System I already mentioned, it shouldn't be too much work for the Moderators. In fact, old Nyaa had only 17 moderators, and even fewer when they just started (around 8 or 9).

IMO the priority should go:

  1. SPAM filters system
  2. Report and Moderation system
  3. Enable upload
  4. Account and Registration system

Of course, this comes after the other major issues the site needs to address, like db conversions and torrent data scraping, among others.

an-electric-sheep commented 7 years ago

There are two kinds of spam:

  1. something meant to clog up the site. it's targeted at the site itself. to make it less useful or to displace genuine content.
  2. direct malware or indirect things baiting users to go to certain websites

There should be some decent heuristics to match 2).

In other words the most common kinds of content (just media files) should be relatively safe. They could still contain a video or in-torrent comments telling users to do things, but they're not immediate exploits at least.

an-electric-sheep commented 7 years ago

uploaded torrents could have a short delay before they become publicly visible. that way a spammer can't brute-force possible rules by trial-and-error. And any suspicious content can be brought to the attention of the mods by the ruleset.

Kuraperunat commented 7 years ago

Having a spam filter doesn't make any sense either. If someone wanted to spam the site with a bot, whatever magic you put into a filter is trivial to bypass.

Do not make an automated filter before spam is a real, observed issue. On Nyaa it either never was one or was handled appropriately by mods. Even if spam should become a thing, the filter should be heavily targeted towards that specific type of spam instead of coming up with vague rulings that are more likely to end up annoying normal users.

As for what an-electric-sheep is proposing:

Please think about the solutions in the context of the whole site, not just the anime section.

sdomi commented 7 years ago

Do not make an automated filter before spam is a real, observed issue. On Nyaa it either never was one or was handled appropriately by mods.

How do you know? I think that nyaa had a very, very good spam filter, and the rest was handled by mods.

ghost commented 7 years ago

@Kuraperunat

We are talking about a replacement for Nyaa, a site that had an Alexa Global rank of ~1400 (Japanese rank was 240!) and was the go-to site for downloading Anime and Manga, while Sukebei was for HGames and JAV. It will have spam, so it needs a good spam filter.

I agree with the rest of your post; any kind of silly heuristics like the ones @an-electric-sheep is suggesting would just be overkill and unnecessary, and would piss people off more than help.

Argolics commented 7 years ago

Should we ban/forbid use of link shorteners (especially those with unskippable ads) in comments/torrent descriptions? They could be used to spread malicious content or, in the case of ad-ridden ones, to make a quick buck. I am not sure whether they should be allowed at all, since this is going to be a public torrent tracker, not a platform for getting rich quick. If anybody decides this is a good idea, here's a convenient list of the ad-ridden ones.

yiiTT commented 7 years ago

The majority of spam would come from torrent descriptions or comments; the first line of defense is the captcha on user creation and torrent uploading.

For torrent descriptions we should make some rules to prevent a user's experience from being negatively impacted, and for comments it will be the whack-a-mole moderation game that comes standard with a website.

here's a convenient list of the ad ridden ones.

@Argolics Seems like a good preventive measure that can be quickly implemented to save the users, which we could check on torrent upload/edit and comment posting.
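A minimal sketch of what that check could look like, run against a description or comment body before saving. The hosts in `blockedHosts` are placeholders under the reserved `.example` TLD, not entries from the linked list, and `FindBlockedLinks` is a made-up name.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// blockedHosts is a placeholder blocklist; a real one would be loaded
// from the shortener list mentioned above.
var blockedHosts = map[string]bool{
	"short.example": true,
	"ads.example":   true,
}

// urlPattern is a deliberately simple matcher for http(s) links in free text.
var urlPattern = regexp.MustCompile(`https?://[^\s<>"']+`)

// FindBlockedLinks returns the blocked hosts that text links to, in order of appearance.
func FindBlockedLinks(text string) []string {
	var hits []string
	for _, raw := range urlPattern.FindAllString(text, -1) {
		u, err := url.Parse(raw)
		if err != nil {
			continue // skip anything that is not a parseable URL
		}
		host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
		if blockedHosts[host] {
			hits = append(hits, host)
		}
	}
	return hits
}

func main() {
	desc := "screens: http://short.example/abc and https://example.org/page"
	fmt.Println(FindBlockedLinks(desc))
}
```

This only catches links written out with a scheme; bare hostnames or obfuscated links would still need reports and mods.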

loadletter commented 7 years ago

@Argolics Sukebei descriptions used a lot of links to crappy ad-ridden image hosts for screenshots