The purpose of these scripts is to collect and organize data showing the size of Department of Energy social media audiences by scraping follower data from Twitter, Instagram, YouTube, and (if run with a temporary API key) Facebook. They can be expanded to include other platforms in the future.
Two scripts are housed here. Both can be run locally to see how they work. They are:
**The goal of this repo is to get each of these scripts onto a Jenkins job and served at https://energy.gov/api/social-media/ with read access allowed via CORS rules to energyapps.github.io/social.**
org_chart_data/
hourly_follower_count/
`social_data.csv` from the S3 bucket. Otherwise you will lose a lot of data. Find it at https://s3-us-west-2.amazonaws.com/energy2/social/social_data.csv

If `followers-hourly.py` does not complete, it will not create the new `social_data.csv` file. The script will then revert to the previous data by renaming `social_data_old.csv` back to `social_data.csv`. There is probably a safer solution for error handling, but it was beyond the scope of the initial build.

platforms_mini/
While it is possible to scrape Facebook user data using the API that Facebook provides through its developer tools, it is more complicated than scraping a public-facing website. You are required to use an API key, but, from what I could tell, it expires frequently. I'm sure there is a developer tool that allows for a "set it and forget it" method, but at the time I left the job, I hadn't had time to find it.
This is why we do not track hourly Facebook data in Hourly Follower Count. Facebook has also been turned off in DOE Social Media Org Chart, but it can be reinstated any time someone wants to work out a lasting option.
At this time there is no elegant solution for noticing when things are broken. A few fail-safes are built in, but they could be much improved.
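
As an illustration of the rename-based fail-safe used for `social_data.csv`, here is a minimal sketch in Python. It is not the code in `followers-hourly.py`: the function name `run_with_backup` and the `scrape` callable are hypothetical, and the sketch only assumes that a scrape writes a fresh file at the given path. It moves the current file to `social_data_old.csv`, runs the scrape, and restores the old file if the scrape raises.

```python
import os
from pathlib import Path


def run_with_backup(data_path, scrape):
    """Run `scrape`, restoring the previous CSV if it fails.

    `scrape` is a hypothetical callable that writes a new file at `data_path`.
    Returns True on success and False if the previous data was restored.
    """
    data_path = Path(data_path)
    # social_data.csv -> social_data_old.csv
    backup_path = data_path.with_name(data_path.stem + "_old" + data_path.suffix)

    # Move the current data aside before the scrape starts.
    if data_path.exists():
        os.replace(data_path, backup_path)

    try:
        scrape(data_path)
    except Exception:
        # Discard any partial output, then put the previous data back.
        if data_path.exists():
            data_path.unlink()
        if backup_path.exists():
            os.replace(backup_path, data_path)
        return False
    return True
```

Returning a flag rather than silently reverting gives the caller a place to hook in a notification, which is one way the "noticing if things are broken" gap could be narrowed.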
[This ticket outlines what the needs are for scraping facebook with the API key]().
Write tickets for folks.
Add Secretary's instagram account (@secretaryperry)
Add Energy Press Sec Twitter (@EnergyPressSec)
Install all python packages on the script server.
Install and test both scripts on the script server.
Ensure that Ernie and Atiq are receiving regular updates that these are working.
Find a way to make Facebook numbers update automatically without having to manually insert a temporary API key into the script.
Ensure that energy.gov/api/social-media allows energyapps.github.io/social via favorable CORS rules.
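
For the CORS item, a rough sketch of the header logic may help whoever configures the server. How energy.gov actually sets response headers is not known here, so the function `cors_headers` and the `ALLOWED_ORIGINS` set below are illustrative assumptions, as is the use of HTTPS. One detail worth noting: a browser's `Origin` header carries only the scheme and host, so the rule has to match `https://energyapps.github.io`, not the `/social` path.

```python
from urllib.parse import urlsplit

# Assumption: the page at energyapps.github.io/social is served over HTTPS.
ALLOWED_ORIGINS = {"https://energyapps.github.io"}


def cors_headers(request_origin):
    """Return CORS response headers for a read-only endpoint.

    `request_origin` is the value of the request's Origin header, or None.
    An empty dict means no CORS headers should be added.
    """
    if request_origin is None:
        return {}
    parts = urlsplit(request_origin)
    # Compare on scheme and host only, ignoring case.
    normalized = f"{parts.scheme}://{parts.netloc}".lower()
    if normalized not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": normalized,
        "Access-Control-Allow-Methods": "GET",
        "Vary": "Origin",
    }
```

Echoing back a single allowed origin, rather than using `*`, keeps read access limited to the one site named above.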