This repository is an extract from a project that uses data from Microsoft Academic Graph (MAG) and ProQuest. The programs run on a remote computer where the data are stored outside of the repository. The path to the data is defined in src/analysis/setup.R and src/dataprep/helpers/variables.py.
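As a rough illustration of how a data path outside the repository could be defined in one place, here is a minimal Python sketch. It is not the content of src/dataprep/helpers/variables.py: the environment variable `PROJECT_DATA_ROOT`, the placeholder default path, and the helper names are all hypothetical and chosen only for this example.

```python
import os
from pathlib import Path

# Hypothetical names for illustration only; the real variable names in
# src/dataprep/helpers/variables.py may differ.
DATA_ROOT_ENV = "PROJECT_DATA_ROOT"
DEFAULT_DATA_ROOT = Path("/path/to/data")  # placeholder, not a real location


def get_data_root(env=None):
    """Return the data directory, preferring the environment variable if set."""
    env = os.environ if env is None else env
    return Path(env.get(DATA_ROOT_ENV, DEFAULT_DATA_ROOT))


def dataset_path(name, env=None):
    """Build the path to a dataset folder (e.g. "mag") under the data root."""
    return get_data_root(env) / name
```

Keeping the path in a single definition means the scripts do not need to change when the data are moved; only the one variable does.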
src/: data preparation and linking; analysis of the publication careers of scientists.
output/: destination for tables and figures generated in src/.
snapshots/: contains files to reproduce the environments (for now, yml files for conda).

src/dataprep
The pipeline.sh script calls consecutively the scripts for
src/analysis
The pipeline.sh script calls consecutively the scripts for
setup.R and in helpers/
See open issues for some features that are currently lacking.
analysis and dataprep because the scripts come originally from two different repositories.