This repository contains several scripts developed to process Twitter data to investigate how soil organic content and health are related.
Two different data sources are used:
Data collected directly from the Twitter API
History of Data:
Automatic API Data 2019-06-15 to Current
There have been multiple "versions" of the data. Current code should be run on the latest version (i.e., change code and directory names to v3):
For the archive data:
For the data collected via Twitter API:
Initial data exploration:
More specific exploration and visualizations can be found in the following folders (see their respective READMEs for details about specific analyses):
translation folder contains scripts for translating Hindi using Google Translate via its web interface.
pre_processing folder contains scripts for specific tasks (usually run once).