corcra / troll-interconnector

ask not for whom the bell trolls
MIT License

what maketh a troll (get training data) #6

Closed corcra closed 2 years ago

corcra commented 9 years ago

We need to train our algorithm to identify trolls (see calculate_troll_score in tweep.py).

The score can either be hand-devised as a set of rules, or we can try to learn it from a database of known trolls. Maybe we want both.

  1. Can someone find a list of trolls? One of those autoblock lists might be informative, although perhaps a bit biased.
  2. What heuristic rules would we use, personally?
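
To make the second question concrete, here is a minimal sketch of what a hand-devised rule set could look like. Everything in it is an assumption for illustration: the `Account` fields, the thresholds, and the equal weights are hypothetical, and the name `heuristic_troll_score` is made up rather than taken from tweep.py.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # All fields are hypothetical features, not taken from tweep.py.
    account_age_days: int
    followers: int
    following: int
    reply_fraction: float  # share of recent tweets that are replies
    has_default_avatar: bool


def heuristic_troll_score(account: Account) -> float:
    """Return a score in [0, 1]; higher means more troll-like (toy rules)."""
    score = 0.0
    # Rule 1: very new accounts are treated as more suspicious.
    if account.account_age_days < 30:
        score += 0.25
    # Rule 2: no profile picture has been set.
    if account.has_default_avatar:
        score += 0.25
    # Rule 3: follows many accounts but has few followers.
    if account.followers < 0.1 * max(account.following, 1):
        score += 0.25
    # Rule 4: almost all recent activity consists of replies.
    if account.reply_fraction > 0.8:
        score += 0.25
    return score
```

Each rule contributes the same weight here only to keep the sketch readable; a learned model could replace the hand-picked weights once labelled data exists.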

corcra commented 9 years ago

For item 1 above: I'd like a list of 20 known trolls and 20 known non-trolls. This is obviously a tiny data set for training, but right now I'm not sure how to acquire high-quality labelled data like this automatically. Ideas welcome.
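
With only 40 labelled accounts, one way to get an honest accuracy estimate is leave-one-out evaluation: hold out each account once, fit on the remaining ones, and test on the held-out one. The sketch below assumes each account has already been reduced to a single numeric score paired with a hand-assigned label; the function names and the simple cutoff model are hypothetical, not something that exists in the repository.

```python
def best_threshold(samples):
    """Pick the score cutoff that maximises accuracy on the given samples.

    `samples` is a list of (score, is_troll) pairs.
    """
    candidates = sorted({score for score, _ in samples})
    best, best_correct = 0.0, -1
    for cutoff in candidates:
        # An account is predicted to be a troll when its score reaches the cutoff.
        correct = sum((score >= cutoff) == is_troll for score, is_troll in samples)
        if correct > best_correct:
            best, best_correct = cutoff, correct
    return best


def leave_one_out_accuracy(samples):
    """Hold out each account once, fit on the rest, test on the held-out one."""
    hits = 0
    for i, (score, is_troll) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        cutoff = best_threshold(training)
        hits += (score >= cutoff) == is_troll
    return hits / len(samples)
```

Calling `leave_one_out_accuracy` on the list of 40 (score, label) pairs would give a rough sense of whether the scores separate the two groups at all, though with so few examples the estimate will be noisy.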