Data4Democracy / far-right-analysis

Analysis related to the behavior of extreme far right online communities

Scrape bbs.dailystormer.com #22

Closed: gati closed this issue 7 years ago

gati commented 7 years ago

Write a scraper for the Daily Stormer bulletin boards (the Scrapy Python library is a good starting point). Fair warning: this content will be really offensive.

Once you have a scraper working, grab some sample data and put it into a SQLite database, a series of JSON files, CSV files, or whatever is easiest for you. We'll put it up on data.world and start some exploratory analysis!
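
A minimal sketch of the storage step, assuming the scraper yields one dictionary per post. The field names (`post_id`, `thread_id`, `author`, `posted_at`, `body`), the `posts` table, and the sample records are all hypothetical and only for illustration; the real schema depends on what the scraper extracts. It uses only the Python standard library, writes to an in-memory SQLite database, and exports the rows as JSON lines.

```python
import json
import sqlite3

# Hypothetical sample records; the field names are assumptions, not a real schema.
SAMPLE_POSTS = [
    {"post_id": 1, "thread_id": 10, "author": "user_a",
     "posted_at": "2017-03-01T12:00:00", "body": "first post"},
    {"post_id": 2, "thread_id": 10, "author": "user_b",
     "posted_at": "2017-03-01T12:05:00", "body": "a reply"},
]


def save_posts(conn, posts):
    """Create the (hypothetical) posts table if needed and upsert the records."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               post_id   INTEGER PRIMARY KEY,
               thread_id INTEGER,
               author    TEXT,
               posted_at TEXT,
               body      TEXT
           )"""
    )
    # INSERT OR REPLACE keeps re-runs of the scraper from duplicating rows.
    conn.executemany(
        "INSERT OR REPLACE INTO posts "
        "VALUES (:post_id, :thread_id, :author, :posted_at, :body)",
        posts,
    )
    conn.commit()


def export_json_lines(conn):
    """Return every stored post as a JSON string, ordered by post_id."""
    cur = conn.execute(
        "SELECT post_id, thread_id, author, posted_at, body "
        "FROM posts ORDER BY post_id"
    )
    columns = [d[0] for d in cur.description]
    return [json.dumps(dict(zip(columns, row))) for row in cur]


conn = sqlite3.connect(":memory:")  # use a file path to persist the data
save_posts(conn, SAMPLE_POSTS)
save_posts(conn, SAMPLE_POSTS)  # second run replaces rows instead of duplicating
lines = export_json_lines(conn)
```

Keying the table on a post identifier is one possible design choice: it lets an interrupted scrape be resumed or repeated without cleaning up duplicates first.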

samzhang111 commented 7 years ago

I have this already! I'll try to make it out to Crystal City, and maybe someone can help me upload it.

citizenrich commented 7 years ago

Daily Stormer is now archived up to April 1.

gati commented 7 years ago

@citizenrich that's awesome! Thanks for tackling that. Is the data up on data.world, in the S3 bucket @bstarling has been managing, or somewhere else?

citizenrich commented 7 years ago

Enjoy! It's in the S3 bucket under archive/stormer as "daily_stormer_17-04-01.sql". @bstarling

bdlacree commented 7 years ago

Hello! Could you point me to where I can find this archive? I am new to the D4D project(s) but would be interested in the data. Thanks!

samzhang111 commented 7 years ago

@bdlacree Want to ping me on the D4D slack? I am @sam