2DegreesInvesting / ds-incubator

2° Investing Initiative, ds-incubator website / eBook:
https://bit.ly/ds-incubator-videos

Reproducible data science with targets #70

Open maurolepore opened 3 years ago

maurolepore commented 3 years ago

Who is the audience?

R users who need to run complex and slow data pipelines multiple times. This includes folks at 2DII and beyond.

Why is this important?

We are facing a reproducibility crisis: few analyses are easy, or even possible, to reproduce. The challenge is particularly hard for analyses that involve complex, slow data pipelines. When a pipeline is complex, the order in which to run the code is difficult to understand; when it is slow to run, its quality suffers because it is too expensive to maintain. The targets package tackles both problems. It provides a way to declare in code the steps required to reproduce the analysis, and it saves time by skipping tasks whose results are already up to date.

See also https://books.ropensci.org/targets/index.html#motivation
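
To make the idea concrete, here is a minimal sketch of a targets pipeline. It is only an illustration, not the material planned for the series: the target names `raw_data` and `summary_stats`, the helper `slow_summary()`, and the toy data are all hypothetical. It writes a small `_targets.R` script to a temporary directory, runs the pipeline twice, and checks that nothing is outdated after the first run.

```r
library(targets)

# Work in a throwaway directory so the sketch leaves no files in your project.
pipeline_dir <- file.path(tempdir(), "targets-sketch")
dir.create(pipeline_dir, showWarnings = FALSE)
old_wd <- setwd(pipeline_dir)

# Write a _targets.R script that declares the steps of the analysis.
tar_script({
  library(targets)

  # Hypothetical stand-in for a slow step of a real pipeline.
  slow_summary <- function(data) {
    Sys.sleep(1)
    colMeans(data)
  }

  list(
    # Hypothetical toy data; a real pipeline would read it from a file.
    tar_target(raw_data, data.frame(x = 1:10, y = (1:10)^2)),
    # Depends on raw_data, so targets knows to build it afterwards.
    tar_target(summary_stats, slow_summary(raw_data))
  )
}, ask = FALSE)

# First run: builds both targets and stores the results in _targets/.
tar_make()

# Names of targets that would need to be rebuilt; none are expected now.
outdated <- tar_outdated()

# Second run: code and data are unchanged, so both targets are skipped.
tar_make()

# Read a stored result without re-running anything.
stats <- tar_read(summary_stats)

setwd(old_wd)
```

If you later change `slow_summary()` or the data in the script, the next `tar_make()` should rebuild only the targets affected by that change.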

What should be covered?

This series will cover the content from the targets manual that is most relevant to the work we do at 2DII.

https://github.com/2DegreesInvesting/ds-targets#syllabus

Suggested speakers or contributors

@wlandau, @maurolepore, @cjyetman?

Resources

https://github.com/2DegreesInvesting/ds-targets#resources

Checklist

2h before

10' before

Start