I enjoyed Karl's overview of reproducible workflows and automation. Many points, especially regarding directory organization, are things I've learned the hard way myself and would have benefitted from hearing much sooner. :) His summary of GNU make was particularly helpful for understanding how that tool functions.
I believe the bot got confused! You commented over on #4 and it put its response there. Sorry, I haven't had this happen before. I see you created a new issue, #7. I would suggest just leaving that alone for now (don't close or comment or anything). The bot should just ignore it and carry on, but if you interact with it, it may get confused again since it is progressing through steps based on seeing a comment, issue closure, or PR merge.
Please pretend it responded here:
Great comments @Kfaunce! :sparkles:
You could consider GNU make to be a great-grandparent of the packages we referred to early in this lesson (`remake`, `scipiper`, `drake`, and `targets`). Will Landau, the lead developer of `targets`, has added a lot of useful features to dependency management systems in R, and has a great way of summarizing why we put energy into using these tools: "Skip the work you don't need."
We'd like you to next check out a short part of Will's video on `targets`.
Use a GitHub comment on this issue to let us know what contrasts you identified between solutions in `make` and what is offered in R-specific tools, like `targets`. Please use fewer than 300 words. Then assign this issue to your onboarding cohort team member so they can read what you wrote and respond with any questions or comments.
Response to Activity: watch video on `targets`. Whereas `make` is a language-agnostic pipeline tool, `targets` was designed specifically for R. In contrast to `make`, `targets` lets you work from an R session and, in general, directly with R objects rather than with output files.
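To make the file-based side of that contrast concrete, here is a minimal sketch of the kind of timestamp rule `make` is known for: rebuild an output file only when it is missing or older than one of its inputs. This is an illustration in Python, not `make`'s actual implementation, and the names `is_stale`, `build`, and `demo` (along with the file names `raw.csv` and `summary.txt`) are hypothetical.

```python
import os
import tempfile


def is_stale(target, dependencies):
    """Return True if target is missing or older than any dependency file."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


def build(target, dependencies, recipe):
    """Run recipe only when target is stale; return True if the recipe ran."""
    if is_stale(target, dependencies):
        recipe()
        return True
    return False


def demo():
    """Build a summary file three times and record which builds did work."""
    with tempfile.TemporaryDirectory() as workdir:
        raw = os.path.join(workdir, "raw.csv")          # hypothetical input file
        summary = os.path.join(workdir, "summary.txt")  # hypothetical output file
        with open(raw, "w") as f:
            f.write("x\n1\n2\n")

        def recipe():
            # Count the data rows in the input and write the count to the output.
            with open(raw) as src, open(summary, "w") as out:
                out.write(str(len(src.readlines()) - 1))

        results = [build(summary, [raw], recipe)]      # output missing: recipe runs
        results.append(build(summary, [raw], recipe))  # output up to date: skipped

        # Simulate an edit to the input by moving its modification time forward.
        newer = os.path.getmtime(summary) + 10
        os.utime(raw, (newer, newer))
        results.append(build(summary, [raw], recipe))  # input is newer: recipe runs
        return results


if __name__ == "__main__":
    print(demo())  # [True, False, True]
```

Every step in this sketch is tied to a file on disk, which is the part that felt different in `targets`, where I could stay in an R session and work with the objects themselves.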
We're asking everyone to invest in the concepts of reproducibility and efficiency of reproducibility, both of which are enabled via dependency management systems such as `remake`, `scipiper`, `drake`, and `targets`.
Background
We hope that the case for reproducibility is clear - we work for a science agency, and science that can't be reproduced does little to advance knowledge or trust.
But, the investment in efficiency of reproducibility is harder to boil down into a zingy one-liner. Many of us have embraced this need because we have been bitten by issues in our real-world collaborations, and found that data science practices and a reproducibility culture offer great solutions. Karl Broman is an advocate for reproducibility in science and is faculty at UW Madison. He has given many talks on the subject and we're going to ask you to watch part of one of them so you can be exposed to some of Karl's science challenges and solutions. Karl will be talking about GNU make, which is the inspiration for almost every modern dependency tool that we can think of. Click on the image to kick off the video.
:computer: Activity: Watch the above video on make and reproducible workflows up until the 11 minute mark (you are welcome to watch more)
Use a GitHub comment on this issue to let us know what you thought was interesting about these pipeline concepts using no more than 300 words.
I'll respond once I spot your comment (refresh if you don't hear from me right away).