kinto-b / makepipe

Tools for constructing simple make-like pipelines in R.
https://kinto-b.github.io/makepipe/
GNU General Public License v3.0

Implement clean and rebuild functions #3

Closed kinto-b closed 3 years ago

kinto-b commented 3 years ago

Once the pipeline has been executed, it stores information on the locations of all targets, dependencies and source files on disk. It also stores each recipe as a string.

We can leverage this to implement clean and rebuild functions:
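A minimal sketch of the clean half, assuming the stored information can be read out as a plain list of segments. The `segments` list and `clean_targets_sketch()` are hypothetical stand-ins for illustration, not part of makepipe's API:

```r
# Hypothetical stand-in for what an executed pipeline records: one entry per
# segment, holding its dependencies, its source file or recipe string, and
# its targets.
segments <- list(
  list(
    dependencies = c("data/0_raw_data.csv", "lookup/concordance.csv"),
    source = "1 data_prep.R",
    targets = "data/1_data.Rds"
  ),
  list(
    dependencies = c("data/1_data.Rds", "data/0_pop.Rds"),
    recipe = "saveRDS(merged_dat, 'data/2_data.Rds')",
    targets = "data/2_data.Rds"
  )
)

# Delete every registered target that currently exists under `root`.
# Dependencies and source files are left alone.
# Returns (invisibly) the paths that were removed.
clean_targets_sketch <- function(segments, root = ".") {
  targets <- unique(unlist(lapply(segments, function(s) s$targets)))
  paths <- file.path(root, targets)
  existing <- paths[file.exists(paths)]
  unlink(existing)
  invisible(existing)
}
```

Only files registered as targets are removed, so raw inputs such as `data/0_raw_data.csv` survive a clean even though they live in the same directory.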

The most challenging part of this will be figuring out the appropriate order in which to re-execute the recipes and source files. GNU make works out where to start by topologically sorting the dependency graph. Possibly there's an algorithm in visNetwork that I can use. Otherwise, I'll need to code something up.
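If it does come to coding something up, a topological sort is short enough to write in base R. The sketch below uses Kahn's algorithm on an edge list. The `topo_sort_sketch()` function and the `edges` data frame are assumptions made for illustration and say nothing about how makepipe stores its graph internally:

```r
# Kahn's algorithm. `edges` is a data frame with character columns `from`
# and `to`, where each row means "`from` must exist before `to` is built".
topo_sort_sketch <- function(edges) {
  nodes <- unique(c(edges$from, edges$to))

  # Count incoming edges for every node (zero for nodes nothing points at).
  indegree <- setNames(integer(length(nodes)), nodes)
  counts <- table(edges$to)
  indegree[names(counts)] <- as.integer(counts)

  ordered <- character(0)
  ready <- names(indegree)[indegree == 0]
  while (length(ready) > 0) {
    node <- ready[1]
    ready <- ready[-1]
    ordered <- c(ordered, node)
    # "Remove" the node by decrementing the in-degree of each of its children.
    for (child in edges$to[edges$from == node]) {
      indegree[child] <- indegree[child] - 1L
      if (indegree[child] == 0) ready <- c(ready, child)
    }
  }

  # Any node left unprocessed sits on a cycle.
  if (length(ordered) < length(nodes)) stop("Pipeline contains a cycle")
  ordered
}

# Edges for a two-segment pipeline: raw inputs feed data/1_data.Rds,
# which in turn feeds data/2_data.Rds.
edges <- data.frame(
  from = c("data/0_raw_data.csv", "lookup/concordance.csv", "1 data_prep.R",
           "data/1_data.Rds", "data/0_pop.Rds"),
  to = c("data/1_data.Rds", "data/1_data.Rds", "data/1_data.Rds",
         "data/2_data.Rds", "data/2_data.Rds"),
  stringsAsFactors = FALSE
)
build_order <- topo_sort_sketch(edges)
```

A rebuild could then walk `build_order` and re-run the recipe or source file attached to each target it encounters, which guarantees that `data/1_data.Rds` is rebuilt before `data/2_data.Rds`.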

Once this is done, it might be neat to introduce a pipeline() function for defining a pipeline in one fell swoop:

my_pipeline <- Pipeline$new(
  list(
    dependencies = c("data/0_raw_data.csv", "lookup/concordance.csv"),
    source = c("1 data_prep.R"),
    targets = c("data/1_data.Rds")
  ),
  list(
    dependencies = c("data/1_data.Rds", "data/0_pop.Rds"),
    recipe = {
      # Read the declared dependencies, merge them, and write the target
      dat <- readRDS("data/1_data.Rds")
      pop <- readRDS("data/0_pop.Rds")
      merged_dat <- merge(dat, pop, by = "id")
      saveRDS(merged_dat, "data/2_data.Rds")
    },
    targets = c("data/2_data.Rds")
  )
)

rebuild_pipeline(my_pipeline)

kinto-b commented 3 years ago

Will be in next release