wlandau / drake-examples

Example workflows for the drake R package
https://github.com/ropensci/drake

new example: orderly + drake #41

Open wlandau opened 4 years ago

wlandau commented 4 years ago

According to @richfitz, orderly and drake could complement each other nicely. If we post an example here, drake users will be able to download it with drake::drake_example("orderly") and try it out for themselves.

From the docs, it looks like orderly could wrap around a drake workflow and manage multiple versions of final artifacts rendered at the end of a drake plan. Are there other obvious win-wins?

wlandau commented 4 years ago

I just got started on an example in this branch. It is just like orderly's minimal example but with a drake workflow in src/example/script.R.

https://github.com/wlandau/drake-examples/blob/7d1e43b2e7e82bee07afa2406960af78c2d0d129/orderly/src/example/script.R#L1-L16
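For readers who cannot follow the link, here is a rough sketch of what a drake workflow inside an orderly script could look like. It is not the contents of the linked file: the plan, the target names `dat` and `graph`, and the artefact name `mygraph.png` are all made up for illustration.

```r
library(drake)

# Hypothetical plan: one data target and one file target.
plan <- drake_plan(
  dat = data.frame(x = 1:10, y = runif(10)),
  graph = {
    # file_out() tells drake that this command writes mygraph.png.
    png(file_out("mygraph.png"))
    plot(dat)
    dev.off()
  }
)

# Build the targets. By default the cache goes in .drake/ under the
# current working directory.
make(plan)
```

In this sketch the file written by the `graph` target is what orderly would track as the artefact, so it would need to be listed under the artefacts in `orderly.yml`.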

wlandau commented 4 years ago

Should drake be declared as a package dependency in src/example/orderly.yml?

wlandau commented 4 years ago

One source of friction I notice is that orderly creates and sets a new working directory for each new run, while drake expects all runs to use the same file system and the same working directory. Even if we assign a storr_rds() cache in a central location that all runs can access, it is still awkward when we declare a drake file_out() with a run-specific path. @richfitz, how would you suggest we get the most out of both tools in this situation?
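To make the friction concrete, here is a minimal sketch that imitates orderly's behavior with plain temporary directories. No orderly code is involved: `run_in_fresh_dir()` is a hypothetical stand-in for an orderly run, and the cache path is an arbitrary temporary location. `new_cache()` is used here as a convenient way to get a `storr_rds()`-backed cache.

```r
library(drake)

# Hypothetical central cache, shared by every run.
cache <- new_cache(path = tempfile("shared-drake-cache-"))

plan <- drake_plan(
  dat = data.frame(x = 1:10, y = (1:10)^2),
  graph = {
    # Relative path: it resolves against whatever the working directory
    # happens to be for this run.
    png(file_out("mygraph.png"))
    plot(dat)
    dev.off()
  }
)

# Stand-in for an orderly run: a brand-new working directory each time.
run_in_fresh_dir <- function(plan, cache) {
  run_dir <- tempfile("run-")
  dir.create(run_dir)
  old_wd <- setwd(run_dir)
  on.exit(setwd(old_wd), add = TRUE)
  make(plan, cache = cache)
  file.exists("mygraph.png")
}

first <- run_in_fresh_dir(plan, cache)
second <- run_in_fresh_dir(plan, cache)
```

The shared cache lets the second run find `dat`, but `mygraph.png` does not exist in the second run's directory, so I would expect drake to treat `graph` as outdated on every run. That is the awkward part.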

richfitz commented 4 years ago

yes, I can imagine that this is a source of friction, and tbh it's one that is fairly fundamental to orderly (minimising state between runs of an analysis). Though there are two ways to deal with this that might help:

  1. If one develops an analysis outside of orderly, interactively, and treats orderly as the final copy that will be run periodically, then you can develop the final copy in orderly. We have projects at work that use this workflow.
  2. If one is going to develop interactively, then there is some support with orderly::orderly_test_start, which does the directory setup (though sadly not the change of working directory), following which one can write and work with R code almost as usual. This workflow needs work (you're editing files in a dir that is not the working directory) and we've not yet worked out a superb way of doing it. I have a few ideas for removing pain here but it's not really worked out yet.
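A rough sketch of the second option, assuming a version of orderly in which `orderly_test_start()` returns the path of the prepared directory and leaves the working directory alone. It uses the "minimal" example that ships with orderly in a temporary location, not the drake example from this thread.

```r
# Create a throwaway copy of orderly's "minimal" example.
root <- orderly::orderly_example("minimal")

wd_before <- getwd()

# Set up a run directory for the report called "example".
# Assumption: the return value is the path to that directory.
p <- orderly::orderly_test_start("example", root = root)

# The working directory is untouched, so files have to be addressed
# relative to p (or one has to setwd(p) by hand).
wd_after <- getwd()
list.files(p)
```

From here one would write artefacts into `p`, for example with `file.path(p, "mygraph.png")`, which is where the "editing files in a dir that is not the working directory" awkwardness comes from.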

Thanks for the example 😄

richfitz commented 4 years ago

(minimally if the project has really simple dependencies, then with judicious use of .gitignore one could just work in the src/example directory actually)