Thanks for these ideas. Several users have asked how to use multiple caches like this.
I would actually prefer to treat this as an optional programming technique rather than a built-in feature. The multiple-cache approach has significant problems and reasonable workarounds. (In fact, `targets` actively resists multiple caches.)
Problems:
Workarounds:
1. `make()` different plans with the same cache.
2. `readd()`s from those caches. (Literally, the only R code in the chunks should be `readd()` statements, with the possible exception of `library(drake)`.) In this setup, everything is a target until the very last step, and the need to shuffle around to different caches is kept to a minimum.

I have yet to find an example project that cannot be expressed in terms of (1) or (2) above.
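Workaround (1) might look roughly like the sketch below. The plan names, target names, and cache path are invented for illustration, and it assumes drake is installed; it is not taken from anyone's actual project.

```r
library(drake)

# One cache shared by both plans (the path is an arbitrary choice for this sketch).
cache <- new_cache(path = file.path(tempdir(), "shared_cache"))

# Two small, separately maintained plans with distinct target names.
plan_cars <- drake_plan(
  cars_subset = mtcars[mtcars$cyl == 4, ],
  cars_mpg = mean(cars_subset$mpg)
)
plan_iris <- drake_plan(
  iris_means = colMeans(iris[, 1:4])
)

# Build each plan against the same cache.
make(plan_cars, cache = cache)
make(plan_iris, cache = cache)

# Targets from both plans can now be read back from the one cache.
results <- list(
  cars_mpg = readd(cars_mpg, cache = cache),
  iris_means = readd(iris_means, cache = cache)
)
```

Because every target lives in a single cache, nothing has to be copied between caches before the final analysis.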
Thanks, I agree it is definitely inefficient to duplicate the target storage among several caches. My team works in a secure environment which doesn't have access to Git, hence we're trying to be as safe and robust as possible.
I'll give workaround 1 a go. I haven't tried multiple plans within the same cache. I assume we have to be very careful with target naming? Two different plans within the same cache should not share the same target name?
> I assume we have to be very careful with target naming? Two different plans within the same cache should not share the same target name?
Yes, that's right. Otherwise, in the worst-case scenario, you will have a self-invalidating workflow. Naming is hard with any kind of programming, and most attempts at a solution are outside the scope of `drake`.
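One way to guard against shared target names is to compare the `target` columns of the plans before building. The sketch below uses plain data frames as stand-ins for drake plans, and `shared_targets()` is a hypothetical helper, not a drake function.

```r
# Stand-ins for two drake plans: a plan is a data frame with a `target` column.
plan_a <- data.frame(target = c("raw_data", "model"), stringsAsFactors = FALSE)
plan_b <- data.frame(target = c("model", "report_table"), stringsAsFactors = FALSE)

# Hypothetical helper (not part of drake): return every target name that
# appears more than once across the supplied plans.
shared_targets <- function(...) {
  targets <- unlist(lapply(list(...), function(plan) plan$target))
  unique(targets[duplicated(targets)])
}

clashes <- shared_targets(plan_a, plan_b)
print(clashes)
```

Here `clashes` contains `"model"`, the one name used by both plans. A check like this could run before each `make()` and stop with an error if any clash is found.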
Prework
This is a follow-up to Issue 1100, "Using targets imported from another cache".
Proposal
I've created two functions, `trigger_from` and `target_from`, which simplify the user input process.

In my work we've used multiple drake plans/projects to keep them small and readable. A final step in our project is to import data from the caches and combine them together for final analyses. `trigger_from` and `target_from` also work with `transform`.

Example:
Some extra details regarding the functions:
- `bquote()` partially substitutes and evaluates an R expression. Here the partial evaluation occurs within `.()`. For example, `.(target_name)` gets replaced with the user-defined `target_name`.
- `ignore()` tells drake not to look for dependencies within this plan, as the targets are coming from external drake plans.
- `eval()` evaluates the R code expression from `bquote()`. Try running the `bquote()` section without the `eval()` to understand.
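The `bquote()` and `eval()` behaviour described above can be tried in base R without drake. This is only an illustration of the substitution mechanism, not the original `target_from` code; the target name and cache path are made-up values, and `readd()` is never actually called because the expression is only built and printed.

```r
# Values a user might supply (both are invented for this sketch).
target_name <- as.symbol("fit")
cache_path <- "caches/project_a"

# bquote() leaves the expression unevaluated, except for the parts wrapped
# in .(), which are replaced by the current values of those variables.
expr <- bquote(readd(.(target_name), path = .(cache_path)))
expr_text <- deparse(expr)
print(expr_text)

# eval() runs an expression built by bquote(). A small arithmetic case:
x <- 10
arith <- bquote(.(x) + 1)  # the unevaluated call 10 + 1
result <- eval(arith)      # evaluates the call
print(result)
```

Printing `expr_text` shows the call with `fit` and the path already substituted in, which is what you see when running the `bquote()` section without `eval()`.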