sergiotapia / magnetissimo

Web application that indexes all popular torrent sites and saves the results to a local database.
MIT License
3k stars, 190 forks

Database Maintenance #62

Closed: Elly6 closed 6 years ago

Elly6 commented 7 years ago

Occasionally uploaders on torrentdownloads or thepiratebay will submit torrents with 9000+ or even 20,000+ seeders. These torrents are labelled to appear legitimate but contain malware.

I'm curious how you might handle scrubbing fake/spam torrents from the database. Through Postgres? The command line? Or by modifying magnetissimo to ignore fake torrents altogether?

sergiotapia commented 7 years ago

We can set a threshold and ignore torrents that have an obscene number of seeders/leechers.
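A minimal sketch of that threshold idea, written in Python for illustration rather than in magnetissimo's own code. The `Torrent` record, the `filter_torrents` helper, and the 9000 cap are all assumptions made for this example, not names or values from the project.

```python
from dataclasses import dataclass


@dataclass
class Torrent:
    # Hypothetical record shape; the real schema may differ.
    name: str
    seeders: int
    leechers: int


def within_threshold(torrent, max_peers=9000):
    """Return True when both peer counts are at or below the cap."""
    return torrent.seeders <= max_peers and torrent.leechers <= max_peers


def filter_torrents(torrents, max_peers=9000):
    """Keep only torrents whose seeder and leecher counts pass the cap."""
    return [t for t in torrents if within_threshold(t, max_peers)]


# Illustrative values only, not real scraped data.
sample = [Torrent("example-a", 120, 30), Torrent("example-b", 20000, 5)]
kept = filter_torrents(sample)
```

The cap is a parameter so it can be tuned per site; any fixed value is a guess and would need checking against real listings.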

Can you provide some screenshot examples? Thank you :)

Elly6 commented 7 years ago

Assigning a threshold (9,000+ seeders for example) could unintentionally omit legitimate torrents.

Fake torrents: magnetissimo1

Legit torrents: magnetissimo2
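One way to limit the risk of dropping legitimate torrents is to flag rows above the threshold instead of deleting them, so a false positive can be restored later. The sketch below uses an in-memory SQLite database as a stand-in for Postgres; the `torrents` table, its `suspect` column, and the `flag_suspects` helper are hypothetical and do not reflect magnetissimo's actual schema.

```python
import sqlite3

# In-memory SQLite stands in for the real Postgres database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE torrents ("
    "id INTEGER PRIMARY KEY, name TEXT, seeders INTEGER, "
    "suspect INTEGER DEFAULT 0)"
)
# Illustrative rows only, not real scraped data.
conn.executemany(
    "INSERT INTO torrents (name, seeders) VALUES (?, ?)",
    [("example-a", 150), ("example-b", 20000)],
)


def flag_suspects(conn, max_seeders=9000):
    """Mark rows above the cap as suspect and return how many were flagged."""
    cur = conn.execute(
        "UPDATE torrents SET suspect = 1 WHERE seeders > ?", (max_seeders,)
    )
    conn.commit()
    return cur.rowcount


flagged = flag_suspects(conn)
# Search results would only show rows that are not flagged.
visible = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM torrents WHERE suspect = 0 ORDER BY id"
    )
]
```

Flagged rows stay in the table, so a torrent that turns out to be legitimate can be cleared by setting `suspect` back to 0.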

sergiotapia commented 7 years ago

@Elly6 Circling back to this issue after a few months - I'm not sure there's an easy way to detect whether a torrent is fake or not. Do you have any ideas?

jspraul commented 7 years ago

https://torrentfreak.com/btdigg-shut-down-due-to-torrent-spam-for-now-160711/

https://btdig.com is back (or maybe it's some other project, e.g. https://github.com/kevinlynx/dhtcrawler2) after promising: "When we finish creating an AI that filters spam, we'll reopen the site."