gati opened this issue 7 years ago (status: Open)
Looking into this. I think I can raise a PR by the end of the weekend.
Sounds awesome @ybot1122! Just let me know if I can help with anything
Sorry for the lateness. Hope it can still be used for future analysis. I have a very hacky, unoptimized Python script that takes an array of page IDs and simply returns a flat array of all the statuses, feed posts, and comments from each page.
Output directory structure:
output/
    Holy Nation of Odin/
        feed.txt
        statuses.txt
        comments.txt
.txt file structure:
[
"Latest Radio Series is now online. Wilmot Robertson\u2019s The Dispossessed Majority \u2013 Part 2\nhttp://tinyurl.com/a2z-radio",
"Check out the latest Radio Show - I discuss Wilmot Robertson\u2019s 1972 book, \u201cThe Dispossessed Majority.\u201d \n\nhttp://tinyurl.com/a2z-radio"
]
The script: https://gist.github.com/ybot1122/93c216072d9564ea99250b33fecf6680
The output for the pages listed: https://s3-us-west-2.amazonaws.com/random-stuff-toby/output.zip
(P.S. Repent Amarillo returned a page not found.)
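For anyone picking up that output, here is a minimal sketch of reading it back in. It assumes the layout described above (one folder per page under output/, each .txt file holding a JSON array of strings). The function name `load_page_output` is made up for illustration and is not part of the gist.

```python
import json
from pathlib import Path


def load_page_output(output_dir):
    """Read every <page>/<kind>.txt file under output_dir.

    Returns {page_name: {kind: [text, ...]}}, where kind is the file
    name without its extension (e.g. "feed", "statuses", "comments").
    Assumes each .txt file contains a JSON array of strings.
    """
    pages = {}
    for page_dir in sorted(Path(output_dir).iterdir()):
        if not page_dir.is_dir():
            continue  # skip stray files next to the page folders
        pages[page_dir.name] = {
            txt.stem: json.loads(txt.read_text(encoding="utf-8"))
            for txt in sorted(page_dir.glob("*.txt"))
        }
    return pages
```

Since the files are JSON, escapes such as \u2019 come back as the real characters after loading, so no extra decoding step should be needed.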
Using collect-social (https://github.com/data4democracy/collect-social), or a library you're more familiar with (@gati/@jonathon in Slack can help with this), grab the posts, comments, users, etc. from far-right/alt-right Facebook pages. A list of pages associated with hate groups, Holocaust deniers, etc. is attached. Fair warning: the content will be unpleasant.
Ideally you'll drop the data in a SQLite database, a series of JSON files, CSV files, or another format that's easy for you, and we'll upload it to data.world!
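If you go the SQLite route, a rough sketch using only Python's standard library could look like the following. The `posts` table, its columns, and the `save_posts` helper are assumptions made up for illustration; this is not the schema collect-social itself produces.

```python
import sqlite3


def save_posts(db_path, page_name, kind, texts):
    """Append texts for one page to a SQLite file and return the row count.

    The single-table schema (page, kind, body) is hypothetical; adjust it
    to whatever fields you actually collect.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS posts ("
                "page TEXT NOT NULL, kind TEXT NOT NULL, body TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT INTO posts (page, kind, body) VALUES (?, ?, ?)",
                [(page_name, kind, text) for text in texts],
            )
        return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    finally:
        conn.close()
```

Calling it once per page and per kind (feed, statuses, comments) keeps everything in one file that is easy to upload.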