ottofabian / NLP4Web_Project

NLP4Web Project Repository WS 17/18 TU Darmstadt

Find Additional Domains and Persons for more Tweets #26

Open ottofabian opened 6 years ago

ottofabian commented 6 years ago

The goal should be to get 2-4 more domains in order to check whether authorship identification also works for different domains of "Tweeters". Possible ideas:

mrnyc54 commented 6 years ago

Crawled data for domains: Sports [Celebrities] Broadcast Star Actors Politics FunMix (contains various Twitter profiles that may not match the threshold criteria but could be interesting anyway, e.g. Pope Francis, Dalai Lama, Lord Voldemort, etc.)

Each domain contains at least 20 individual accounts, with 1,000-1,100 tweets each. So 5 categories with 20 authors each x 1,000 tweets = at least 100,000 tweets. Will push soon.
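For reference, the size estimate above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `min_corpus_size` and its parameters are hypothetical and not part of the repository, and it assumes exactly 20 accounts per domain, whereas the crawl has at least 20.

```python
def min_corpus_size(domains: int, authors_per_domain: int, tweets_per_author: int) -> int:
    """Return the tweet count for a crawl with the given shape.

    Hypothetical helper for illustration only; not part of the project code.
    """
    return domains * authors_per_domain * tweets_per_author


# Numbers taken from the comment above: 5 domains, 20 accounts per domain,
# 1,000 to 1,100 tweets per account.
lower = min_corpus_size(5, 20, 1000)
upper_at_20_accounts = min_corpus_size(5, 20, 1100)

print(lower)                 # 100000
print(upper_at_20_accounts)  # 110000
```

Since each domain holds at least 20 accounts, the real total may exceed the second figure; the first figure is the guaranteed minimum.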