Open sjacks26 opened 7 years ago
Will start working on it right now!
@GuiMarthe @sjacks26 Is this still in progress?
I've done 1 but didn't send a PR for whatever reason!
Nice! Can you submit a PR? I'd love to include it!
@bstarling put together a notebook that explains how to access Breitbart articles to do analysis in Python. If you need access to the data.world dataset, ping @jonathon or @sharon in Slack.
To start analysis, here are some basic NLP ideas:
1. Generate word counts in article leads across the whole dataset (filtering stopwords)
   1b. Generate word counts (and/or tf-idf) for article leads grouped by category
   1c. Generate word counts (and/or tf-idf) for article leads grouped by author (possibly excluding Breitbart News and Breitbart TV)
2. Search article leads for keywords of interest (Trump, Putin, alt-right, pepe, etc.)
   2b. Plot number of article leads with a given keyword over time (for example, number of article leads mentioning Trump by week)
3. Search for trends in links to other websites (for example, are there more links to nytimes.com during national political campaigns?)
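If it helps anyone get started, here is a rough, standard-library-only sketch of ideas 1, 2b, and 3. The record layout (`date`, `lead`, `links`), the sample articles, the tiny stopword list, and all function names are made up for illustration; the actual dataset's fields may be named differently, so check the notebook before adapting this.

```python
import re
from collections import Counter
from datetime import date
from urllib.parse import urlparse

# Hypothetical sample records; the real dataset's field names may differ.
ARTICLES = [
    {"date": date(2016, 10, 3), "lead": "Trump rallies supporters in Ohio",
     "links": ["https://www.nytimes.com/a", "http://example.com/b"]},
    {"date": date(2016, 10, 5), "lead": "Putin responds to Trump remarks",
     "links": ["https://nytimes.com/c"]},
    {"date": date(2016, 10, 12), "lead": "The economy and the election",
     "links": []},
]

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "and", "to", "in", "of", "a"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z][a-z'-]*", text.lower())


def word_counts(articles):
    """Idea 1: count non-stopword tokens across all article leads."""
    counts = Counter()
    for article in articles:
        counts.update(t for t in tokenize(article["lead"]) if t not in STOPWORDS)
    return counts


def keyword_by_week(articles, keyword):
    """Idea 2b: count leads containing the keyword, keyed by (ISO year, ISO week)."""
    kw = keyword.lower()
    weeks = Counter()
    for article in articles:
        if kw in tokenize(article["lead"]):
            iso = article["date"].isocalendar()
            weeks[(iso[0], iso[1])] += 1
    return weeks


def link_domains(articles):
    """Idea 3: count outbound links per domain, ignoring a leading 'www.'."""
    domains = Counter()
    for article in articles:
        for url in article["links"]:
            host = urlparse(url).netloc.lower()
            if host.startswith("www."):
                host = host[4:]
            domains[host] += 1
    return domains


if __name__ == "__main__":
    print(word_counts(ARTICLES).most_common(3))
    print(dict(keyword_by_week(ARTICLES, "Trump")))
    print(dict(link_domains(ARTICLES)))
```

The keyword match is exact on whole tokens, so a possessive like "Trump's" would not count as "Trump" here; stemming or a looser match is probably needed on the real data. The weekly counts from `keyword_by_week` are what you would feed into a plot for 2b, and grouping the same counters by category or author should cover 1b and 1c.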
Have more ideas for this dataset? Post them here, or propose them in #far-right in Slack.