Closed charlesbaillie closed 3 years ago
Using static branching in drake I can get something similar, where each report target is a separate row in the plan, but it feels a little hacky making the paths first (since file_out() and friends only take strings), and in the plan in the below example the target 'files' isn't connected to anything.
It looks like you are trying to define new targets based on the values of previous targets, which means we don't know what the report files are going to be until the upstream targets run. In that case, I would go with dynamic branching. That means dynamic files (format = "file") are appropriate because, counterintuitively, file_out() is incompatible with dynamic branching. But knitr_in() should be compatible as long as none of the dependencies in the reports are dynamic sub-targets. Sketch:
    library(drake)
    library(dplyr)
    library(palmerpenguins) # Provides the penguins dataset.
    library(rmarkdown)

    plan <- drake_plan(
      penguin_data = penguins %>%
        group_by(species) %>%
        summarise_if(is.numeric, list(min, max)) %>%
        mutate_at(vars(species), as.character),
      files = data.frame(
        species = penguin_data$species,
        path = paste0("report/report_", penguin_data$species, ".html")
      ),
      report = target({
        render(
          input = knitr_in("doc/template.Rmd"),
          output_file = files$path,
          params = list(species = files$species)
        )
        # Just underscoring here that the output path should be returned for format = "file".
        # rmarkdown::render() does that anyway.
        files$path
      },
      format = "file", # Track the returned output file path.
      dynamic = map(files) # Maps over the rows and makes one sub-target per row.
      )
    )
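To make the row-wise mapping concrete, here is a rough base R emulation of what each sub-target of dynamic = map(files) gets to see. This is only an illustration of the slicing described in the comment above, not drake's actual internals, and the hard-coded species names stand in for the real `files` target.

```r
# Stand-in for the `files` target; the species names are illustrative.
species <- c("Adelie", "Chinstrap", "Gentoo")
files <- data.frame(
  species = species,
  path = paste0("report/report_", species, ".html"),
  stringsAsFactors = FALSE
)

# Emulate row-wise mapping: one single-row data frame per sub-target.
sub_targets <- lapply(
  seq_len(nrow(files)),
  function(i) files[i, , drop = FALSE]
)

# Inside each sub-target, files$species and files$path are length-one vectors,
# which is why the command can pass them straight to render().
for (row in sub_targets) {
  cat(row$species, "->", row$path, "\n")
}
```

So the command in the `report` target runs once per row, and each run returns a single output path for drake to track.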
Also, is it correct to have each report as a separate target, or should it just be one target since all reports will need to be updated when the data is updated anyway? In reality I'm going to have >150 rmarkdown reports and potentially other outputs like slides decks, hence why in doc/ I'd like to keep just the 'templates', and then reports/ will have the rendered reports or slides.
Your choice. It depends on how long the computation is. Also, if all 150 reports are quick and they tend to invalidate all at once (either all are outdated or none are outdated at any given time), then you might as well put them all in a single target.
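If you go the single-target route, a minimal sketch could look like the following. The helper render_all() is hypothetical (it is not part of drake or rmarkdown), and the template path and output directory are taken from the sketch above. It relies on format = "file" accepting a character vector of several paths as the return value.

```r
library(drake)
library(palmerpenguins) # Provides the penguins dataset.
library(rmarkdown)

# Hypothetical helper: render one report per species and
# return all of the output paths as a character vector.
render_all <- function(template, species, out_dir = "report") {
  vapply(
    species,
    function(s) {
      render(
        input = template,
        output_file = paste0("report_", s, ".html"),
        output_dir = out_dir,
        params = list(species = s),
        quiet = TRUE
      )
    },
    character(1),
    USE.NAMES = FALSE
  )
}

plan <- drake_plan(
  species = as.character(unique(penguins$species)),
  reports = target(
    render_all(knitr_in("doc/template.Rmd"), species),
    format = "file" # Track every returned report path in one target.
  )
)
```

The trade-off is that a change to the data or the template re-renders every report, which is fine when they all go out of date together anyway.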
Background
If I wanted to produce a number of reports from one template, without drake I would do something like:
Using static branching in drake I can get something similar, where each report target is a separate row in the plan, but it feels a little hacky making the paths first (since file_out() and friends only take strings), and in the plan in the below example the target 'files' isn't connected to anything. Also, is it correct to have each report as a separate target, or should it just be one target since all reports will need to be updated when the data is updated anyway? In reality I'm going to have >150 rmarkdown reports and potentially other outputs like slides decks, hence why in doc/ I'd like to keep just the 'templates', and then reports/ will have the rendered reports or slides.
Example
This is my hack using static branching:
I have a drake project set up like this: