This PR refactors the code into a targets pipeline. It also adds a DESCRIPTION file to manage dependencies.
For the simulation results, I switched to an arrow dataset. arrow works well for large datasets that need to be filtered efficiently, because filters are applied before the data is loaded into memory. I gitignored the data/ folder where the arrow dataset lives. You can work with the arrow object using dplyr verbs, then call collect() to bring the result into R. Only the data_path target is an arrow dataset; model_results is a normal tibble.
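For anyone new to arrow, here is a minimal sketch of the dplyr-then-collect() workflow described above. The data frame, its column names (scenario, value), and the temporary path are hypothetical stand-ins, not the actual simulation output or the layout of data/ in this PR.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for the simulation results; the real columns may differ.
sim <- data.frame(
  scenario = rep(c("a", "b"), each = 3),
  value = c(1, 2, 3, 4, 5, 6)
)

# Write a small partitioned dataset to a temporary directory
# (assumption: used here instead of data/ so nothing in the repo is touched).
dataset_dir <- file.path(tempdir(), "sim_results")
write_dataset(sim, dataset_dir, format = "parquet", partitioning = "scenario")

# Opening the dataset does not load it into memory.
ds <- open_dataset(dataset_dir)

# dplyr verbs build a query against the dataset; collect() runs it
# and returns the matching rows as a tibble.
result <- ds %>%
  filter(scenario == "b", value > 4) %>%
  collect()
```

Only the rows that pass the filter end up in R, which is the main reason to keep the large results as an arrow dataset rather than a tibble.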
Closes #3
Add new targets to tar_plan() in _targets.R. New targets should be generated by functions located in R/.
Run targets::tar_make() to build the pipeline. You can also use the build pane in RStudio or the build shortcut (cmd/ctrl + shift + B).
Results are cached in _targets.
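As a rough illustration of the workflow above, the sketch below sets up a throwaway project with one function in R/ and a two-target tar_plan(), then builds it. The function summarise_values() and the targets values and values_mean are hypothetical and are not part of this PR. It also assumes tar_plan() comes from the tarchetypes package and that the functions in R/ are loaded with tar_source(); the actual _targets.R may do this differently.

```r
library(targets)

# Work in a throwaway directory so the real project is not touched.
proj <- file.path(tempdir(), "targets_sketch")
dir.create(file.path(proj, "R"), recursive = TRUE, showWarnings = FALSE)
old_wd <- setwd(proj)

# A function in R/ that generates a target (hypothetical example).
writeLines(
  "summarise_values <- function(x) mean(x)",
  file.path("R", "functions.R")
)

# A minimal _targets.R: load the functions in R/, then declare targets.
writeLines(c(
  "library(targets)",
  "library(tarchetypes)  # assumption: tar_plan() is taken from tarchetypes",
  "tar_source()          # sources the scripts in R/",
  "tar_plan(",
  "  values = c(1, 2, 3),",
  "  values_mean = summarise_values(values)",
  ")"
), "_targets.R")

# Build the pipeline; results are cached in the _targets folder.
tar_make()

# Read a cached result back into the session.
values_mean <- tar_read(values_mean)

setwd(old_wd)
```

Running tar_make() a second time without changing anything should skip both targets, since their cached results in _targets are still up to date.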