wdwatkins / ds-pipelines-targets-1

https://lab.github.com/USGS-R/intro-to-targets-pipelines

move script to fetch directory #4

Closed wdwatkins closed 3 years ago

wdwatkins commented 3 years ago

Run by sourcing script.R in top level directory

github-learning-lab[bot] commented 3 years ago

Great, your PR is open! Let's do some more work before merging it. Now that your files are organized into phases, next you will add a commit to your pull request that makes changes to the code itself.

Background

Beyond organizing files into phases, it is important to decompose high-level concepts (or existing scripts) into thoughtful functions and targets that form the building blocks of data processing pipelines. Besides being the name of the pipeline package, a target is a noun we use to describe a tangible output of a function, often a file or an R object, that we can use as an end product (like a summary map) or as an input to another function.

We strive to create functions that clearly declare their purpose (combining good function naming with thoughtful arguments/inputs helps with this) and are designed for re-use when appropriate. When writing pipeline functions, look for re-usable operations or places where simple dividing lines can be drawn between different parts of data access, processing, modeling/analysis, and visualization. We use the high-level "phases" to divide the major concepts, but the way we scope functions is an additional level of subdivision. It is a best practice to have a function do a single thing, so instead of one function creating two plots and a table, it might be better to use one function to generate a table, which is then used as input to another function that creates a plot. There are exceptions to this pattern (a 1:1 function-to-target pairing) that we'll get into later.
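As a rough illustration of the "one function, one thing" idea, here is a minimal sketch in base R. The data frame, the column names (`model`, `rmse`), and the function names `summarize_rmse()` and `plot_rmse()` are all hypothetical and are not taken from the course script. One function builds a table, and a second function takes that table as input and draws a plot.

```r
# Hypothetical example data; the course script uses its own dataset.
example_data <- data.frame(
  model = c("a", "a", "b", "b"),
  rmse  = c(1.2, 0.8, 2.0, 1.0)
)

# One job: build a summary table (mean RMSE per model).
summarize_rmse <- function(data) {
  aggregate(rmse ~ model, data = data, FUN = mean)
}

# One job: draw a bar plot from the summary table, save it as a PDF,
# and return the file path so later steps can refer to the output.
plot_rmse <- function(summary_table, out_file) {
  pdf(out_file)
  on.exit(dev.off())  # close the graphics device even if plotting fails
  barplot(summary_table$rmse, names.arg = summary_table$model,
          ylab = "Mean RMSE")
  out_file
}

# The table is an intermediate output that feeds the plotting function.
rmse_summary <- summarize_rmse(example_data)
plot_file <- plot_rmse(rmse_summary, file.path(tempdir(), "rmse_summary.pdf"))
```

Because the table is produced separately, it could also be written to a file or reused for a second plot without re-running the summary step.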

:keyboard: Activity: Modify existing code to create functions that generate plot, table, and log file outputs

We started you off with an example script in the my_work_R folder, which hopefully lives in either 1_fetch/src or 2_process/src by now. This script loads data and generates one plot, two comma-delimited tables, and a diagnostic log file. This script isn’t great and includes some bad practices that need to be cleaned up. But it should run for you without any changes as long as you are able to install the R packages used by the script.

We’re asking that you split this single script into several functions that can be used to build the same four things. When you are happy with your changes, delete the original script and commit your new script(s) into git source control. Use your same folder structure that was created for your open PR, but feel free to add a "3_visualize" phase.
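For the file outputs (the tables and the log), one possible shape is a small function per file that writes the output and returns its path. The sketch below is only an assumption about how this could look: the helper names `write_table_csv()` and `write_diagnostic_log()`, the demo data, and the `2_process/out` folder are hypothetical and not part of the provided script.

```r
# Hypothetical helper: write a table to CSV and return the path,
# so a later step can find the file it depends on.
write_table_csv <- function(data, out_file) {
  dir.create(dirname(out_file), recursive = TRUE, showWarnings = FALSE)
  write.csv(data, out_file, row.names = FALSE)
  out_file
}

# Hypothetical helper: write a short diagnostic log and return the path.
write_diagnostic_log <- function(data, out_file) {
  dir.create(dirname(out_file), recursive = TRUE, showWarnings = FALSE)
  log_lines <- c(
    sprintf("rows: %d", nrow(data)),
    sprintf("columns: %s", paste(names(data), collapse = ", "))
  )
  writeLines(log_lines, out_file)
  out_file
}

# Assumed output folder inside a phase directory; a temp dir is used here
# so the sketch does not write into a real project.
out_dir <- file.path(tempdir(), "2_process", "out")
demo <- data.frame(site = c("x", "y"), value = c(1, 2))
csv_file <- write_table_csv(demo, file.path(out_dir, "demo.csv"))
log_file <- write_diagnostic_log(demo, file.path(out_dir, "demo_log.txt"))
```

Returning the file path rather than nothing is a design choice: it makes each output easy to pass along to the next function.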

Since you are turning the script into functions, leave a comment in the pull request conversation that explains how to run your code. For example:

data <- fetch_data()
plot_results(data)

It is harder for us to connect robot responses up to assignments related to writing good functions, so we're going to be tagging the humans too...

Push your commit(s) to the open pull request and assign your course contact for review.


A real live human will review your pull request when you've added them as a reviewer.

github-learning-lab[bot] commented 3 years ago


When you are done poking around, check out the next issue.