Open cmhoove14 opened 3 years ago
This is the process for SafeGraph data that needs to be automated:
Download files from SafeGraph: aws s3 sync s3://sg-c19-response/social-distancing/v2/[year]/[month] ./aws_downloads --profile safegraphws --endpoint https://s3.wasabisys.com
Move all .csv.gz files out of the day subdirectories: mv */* .
Unzip all .csv.gz files: gunzip *.gz
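The move and unzip steps above could be wrapped in one function so they do not depend on the current working directory. This is a minimal sketch, not the existing workflow: the function name flatten_and_unzip is hypothetical, the default folder is taken from the sync target above, and it assumes file names are unique across day subdirectories (otherwise mv would overwrite).

```shell
#!/usr/bin/env bash
# Sketch only: flatten and decompress one downloaded month.
# flatten_and_unzip is a hypothetical helper name, not part of the repo.

flatten_and_unzip() {
  # Folder that `aws s3 sync` wrote into (assumed default: ./aws_downloads).
  local dir="${1:-./aws_downloads}"

  # Move every .csv.gz out of the day subdirectories into the top-level folder.
  find "$dir" -mindepth 2 -type f -name '*.csv.gz' -exec mv {} "$dir"/ \;

  # Decompress the .gz files that now sit in the top-level folder.
  find "$dir" -maxdepth 1 -type f -name '*.gz' -exec gunzip {} +
}
```

Calling flatten_and_unzip ./aws_downloads after the sync would leave the plain .csv files directly in ./aws_downloads, ready for the processing scripts.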
Now all the SafeGraph data is in the same folder, so it just has to be processed for use in models:
sf_visitors.sh uses safegraph_sf_visitors.R to extract visitors to SF from other CA counties
The output file SF_visitors[startdate]to[enddate].rds then needs additional processing in process_sf_visitors.R
parse_safegraph_SF_CBGs.sh uses parse_safegraph_SF_CBGs.R to get between-CBG movement for SF
The output file SFCensBlockGroupsMvmt[startdate]to[enddate].rds then needs additional processing into a list of probability matrices in safegraph_cbg_mvmt_process.R
parse_safegraph_devices.sh uses parse_safegraph_devices.R to get device summaries, which are then processed further in SF_safegraph_devices_analyze.R to get percent-at-home metrics
Goal: generate a single script that downloads and processes the SafeGraph data so updates are easy
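One possible shape for that single script is sketched below. It only chains the commands and script names listed in this issue. How each script takes its start and end dates is not stated here, so they are called without arguments, and running the follow-up R files through Rscript is an assumption. The names run_step, update_safegraph and the DRY_RUN switch are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a one-shot update script; argument handling for the
# processing scripts is unknown, so they are invoked bare here.

# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

update_safegraph() {
  local year="$1" month="$2" dir="${3:-./aws_downloads}"

  # 1. Download one month of files (command as given in the issue).
  run_step aws s3 sync "s3://sg-c19-response/social-distancing/v2/${year}/${month}" \
    "$dir" --profile safegraphws --endpoint https://s3.wasabisys.com || return 1

  # 2. Flatten the day subdirectories, then decompress.
  run_step find "$dir" -mindepth 2 -type f -name '*.csv.gz' -exec mv {} "$dir"/ ';' || return 1
  run_step find "$dir" -maxdepth 1 -type f -name '*.gz' -exec gunzip {} + || return 1

  # 3. Processing stages, each followed by its post-processing R script.
  #    Calling the .R files with Rscript is an assumption.
  run_step bash sf_visitors.sh || return 1
  run_step Rscript process_sf_visitors.R || return 1
  run_step bash parse_safegraph_SF_CBGs.sh || return 1
  run_step Rscript safegraph_cbg_mvmt_process.R || return 1
  run_step bash parse_safegraph_devices.sh || return 1
  run_step Rscript SF_safegraph_devices_analyze.R || return 1
}
```

Running DRY_RUN=1 with update_safegraph 2020 03 prints the planned commands without touching S3 or the data, which is a cheap way to check the order before a real update. Stopping at the first failed step keeps a broken download from feeding into the processing scripts.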