In developing a brand on Twitter (and social media in general), how does what you say and how you say it correspond to positive results (more followers, for example)?
Directory containing all data needed to reproduce the paper. All files should be under the following path (I no longer have access to the grid): /exp/abenton/twitter_brand_workspace_20190417/
Table with static user metadata: crowd-sourced fields (both reconciled labels and individual crowdworker labels) and fields from the initial sample of Twitter profiles.
Dynamic table with all features in our analysis. Each row should contain the follower count change for each horizon and the value of each covariate for each history window.
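As a rough illustration of the row layout described above, here is a minimal sketch that builds one row of such a table. The horizons, history windows, column names, and the daily tweet count used as a covariate are all assumptions made for illustration, not the values or features used in the paper, and the input numbers are made up.

```python
from datetime import date, timedelta

# Hypothetical daily follower counts for one user (illustrative values only).
follower_counts = {
    date(2019, 1, 1) + timedelta(days=i): 100 + 2 * i for i in range(15)
}
# Hypothetical daily tweet counts, standing in for any covariate.
tweet_counts = {
    date(2019, 1, 1) + timedelta(days=i): i % 3 for i in range(15)
}

HORIZONS = (1, 7)         # days ahead; assumed values
HISTORY_WINDOWS = (3, 7)  # days back; assumed values


def build_row(user_id, anchor, followers, covariate):
    """Build one feature row for a user at a given anchor date."""
    row = {"user_id": user_id, "date": anchor.isoformat()}
    for h in HORIZONS:
        future = anchor + timedelta(days=h)
        # Leave the cell empty when the horizon falls outside the data.
        if anchor in followers and future in followers:
            row[f"follower_change_h{h}"] = followers[future] - followers[anchor]
        else:
            row[f"follower_change_h{h}"] = None
    for w in HISTORY_WINDOWS:
        # Sum the covariate over the w days up to and including the anchor.
        days = [anchor - timedelta(days=d) for d in range(w)]
        row[f"tweet_count_w{w}"] = sum(covariate.get(d, 0) for d in days)
    return row


row = build_row("user_a", date(2019, 1, 8), follower_counts, tweet_counts)
```

Naming columns by horizon and window (for example follower_change_h7, tweet_count_w3) keeps the table flat, which should make it easier to document each column in the README.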
Release tweet IDs for each user along with timestamps. If we can figure out how to have users download up to 50K tweets per day, then we can also publish the raw statuses (roughly 600K tweets in our data in total, which should take a user 12 days to download). Relevant Twitter terms of service: https://developer.twitter.com/en/developer-terms/agreement-and-policy#id8
README describing what each column in each of the files means, and how the columns relate to the paper.
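The 12-day figure follows directly from the two numbers above. A minimal sketch of that arithmetic, plus a helper for splitting a list of tweet IDs into daily chunks, is given here. The function names are hypothetical, and the 50K daily cap is the figure discussed in this thread rather than a confirmed API limit.

```python
import math


def days_to_download(total_tweets, tweets_per_day):
    """Number of days needed to fetch all tweets under a daily cap."""
    return math.ceil(total_tweets / tweets_per_day)


def daily_chunks(tweet_ids, tweets_per_day):
    """Yield successive lists of at most tweets_per_day tweet IDs."""
    for start in range(0, len(tweet_ids), tweets_per_day):
        yield tweet_ids[start:start + tweets_per_day]


# Roughly 600K tweets at up to 50K per day, as estimated above.
estimated_days = days_to_download(600_000, 50_000)

# Tiny illustrative example with made-up IDs and a cap of 2 per day.
example_chunks = list(daily_chunks([101, 102, 103, 104, 105], 2))
```

A user rehydrating the corpus could process one chunk per day and record which chunks are done, so an interrupted download can resume without refetching.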
Reproducibility Bonus:
Include code to rerun the analyses on the table with extracted features. A reader of the paper should be able to pull the data from GitHub, run a shell script we provide, and reproduce our results table.
I no longer have a CLSP grid account. I emailed the new admin to see about getting mine restored so I have COE grid access.
I thought GitHub had a size limit on repos and therefore was theoretically and practically best only for source code and not data. Has this changed? If not, where will we host the data?
Deliverables