Closed MattCowgill closed 2 years ago
Should we also suggest that people use an R/ folder, and then a top-level Rmarkdown document that ties everything together -- i.e. runs all the code, in order, with a lil' explanation of what each is doing?
I'm doing that for the super-wages thing atm and I think I like it. It's just:
Read the WAD data (R/01_eba_read) and retrieve the external economic data (R/02_economic_read), before combining, filtering and adding variables to the dataset (R/03_join_transform.R).
if (rebuild) {
  source("R/01_eba_read.R")
  source("R/02_economic_read.R")
  source("R/03_join_transform.R")
}
Run the regression models, including robustness checks and bootstrapping.
if (remodel) {
  source("R/04_model.R")
}
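The chunks above rely on `rebuild` and `remodel` being set somewhere earlier, which the snippet doesn't show. A minimal sketch of what that setup could look like, assuming the flags are plain logicals defined near the top of the document; `source_in_order()` is a hypothetical helper, not something from the original project:

```r
# Hypothetical flags: the original document does not show where these
# are defined. Flip to TRUE to re-run the corresponding step.
rebuild <- FALSE  # re-read and re-join the raw data
remodel <- FALSE  # re-run the regression models

# Hypothetical helper: source scripts in the order given, but check
# first that they all exist so a typo fails before anything runs.
source_in_order <- function(paths) {
  missing_paths <- paths[!file.exists(paths)]
  if (length(missing_paths) > 0) {
    stop("Missing script(s): ", paste(missing_paths, collapse = ", "),
         call. = FALSE)
  }
  for (path in paths) {
    message("Running ", path)
    source(path)  # evaluates in the global environment, like a bare source()
  }
  invisible(paths)
}

if (rebuild) {
  source_in_order(c("R/01_eba_read.R",
                    "R/02_economic_read.R",
                    "R/03_join_transform.R"))
}

if (remodel) {
  source_in_order("R/04_model.R")
}
```

With both flags left as `FALSE`, knitting the document skips the slow steps. How the later chunks then get their data (for example from a saved file) is not covered by this sketch.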
The key graphics in this document explore the data used in the models ("04_model.R").
They all use the run_vars dataset, which is a subset of the eba_agreements dataset, and therefore need to be run after the model is generated.
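Since the plots depend on `run_vars` already existing, it might be worth failing early with a readable message if it doesn't. A rough sketch, assuming the dataset lives in the environment the scripts are sourced from; `require_object()` is a hypothetical helper and the `run_vars` created here is only a stand-in so the example runs on its own:

```r
# Hypothetical helper: stop with a clear message if an object that a
# later script depends on has not been created yet.
require_object <- function(name, made_by, envir = parent.frame()) {
  if (!exists(name, envir = envir)) {
    stop("`", name, "` not found; run ", made_by, " first.", call. = FALSE)
  }
  invisible(TRUE)
}

# Stand-in for the real dataset (in the project it is a subset of
# eba_agreements and is assumed to be created by the model step).
run_vars <- data.frame(id = 1:3)

# Put this check just before the plotting scripts are sourced.
require_object("run_vars", made_by = "R/04_model.R")
```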
The proposed and legislated history of the SG in Australia is one of the main variables we use.
## Plot
source("R/06_plot_sg_history.R")
## Table
# Packages used below: dplyr (pipe, filter, select), knitr (kable),
# kableExtra (column_spec, kable_styling)
library(dplyr)
library(knitr)
library(kableExtra)
source("R/f_get_sg.R")
sg_table <- get_sg_history() %>%
  filter(row_number() != 1) %>%  # drop the first row
  select(`Date` = date,
         `Super Guarantee (less than $1m payroll)` = sg_small,
         `Super Guarantee (more than $1m payroll)` = sg_large)
sg_table %>%
  kable("latex", booktabs = TRUE) %>%
  column_spec(2:3, width = "4cm") %>%
  kable_styling(position = "center")
Little plotlets of main model variables, shown in the Data chapter, are produced using:
source("R/06_plot_variables.R")
The economic time-series variables are also shown in the Data chapter.
source("R/06_plot_timeseries.R")
source("R/06_plot_agreement_sample.R")
source("R/06_plot_boot.R")
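Because the plotting scripts share the `06_plot_` prefix, one option is to pick them up by pattern rather than listing each one. A small sketch, assuming the scripts don't depend on each other's order; `source_matching()` is a hypothetical helper:

```r
# Hypothetical helper: source every script in `dir` whose file name
# matches `pattern`, in alphabetical order, and return the paths used.
source_matching <- function(dir, pattern) {
  paths <- sort(list.files(dir, pattern = pattern, full.names = TRUE))
  for (path in paths) {
    source(path)
  }
  invisible(paths)
}

# Runs all R/06_plot_*.R scripts; does nothing if none are found.
sourced <- source_matching("R", "^06_plot_.*\\.R$")
```

The trade-off is that the top-level document no longer spells out which plots are made, so the explicit `source()` calls may still be the better fit when the explanation between them matters.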
I like this workflow but I feel like it’s not always going to make sense. Often people will be creating a lot of small scripts that do a series of unrelated things (e.g. make charts), sometimes using different datasets. I think your WAD work is more integrated than most Grattan projects.
So I think we should outline and suggest this type of setup for when you’re doing complex analyses with multiple linked files. The other folder structure stuff is more “this is how you should/must structure your folders”.
I dunno
Add screenshot(s) of how to structure a project folder, with the appropriate subfolders
On @gregmoran's suggestion