DOI-USGS / lake-temperature-model-prep

Make adding new coop data and parsers less brittle #231

Closed jordansread closed 2 years ago

jordansread commented 3 years ago

Issue to capture discussion on this topic.

There are a lot of small steps you need to remember to get through builds. Some targets need force builds (e.g., the parser inventory and the gdrive file inventory).

Here is an example from an old scipiper discussion on how to make a target "always stale", in case it helps:

target_default: all

packages:
  - sysfonts
  - showtext

sources:
  - example_utils.R

targets:
  all:
    depends:
      - set_font
      - image.png

  set_font:
    command: set_fonts(I('Abel'), I('abel'),
      trigger_file = 'always_stale_time.txt')

  image.png:
    command: plot_things(target_name,
      trigger_file = 'always_stale_time.txt')

example_utils.R

# Overwrite `file` with a timestamp plus a random integer so its contents change on each call
make_stale <- function(file){
  cat(file = file, paste0(format(Sys.time(), '%m/%d/%y %H:%M:%S '), sample(10000, 1)))
}

set_fonts <- function(..., trigger_file){
  make_stale(trigger_file)

  font_add_google(...)
  return(TRUE)
}

plot_things <- function(fileout, trigger_file){
  make_stale(trigger_file)

  png(filename = fileout)
  par(family = 'abel')
  showtext::showtext_begin()
  plot(cars)
  text(15, 100, 'this is a thing I plotted', cex = 2)
  showtext::showtext_end()
  dev.off()
}
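
To see why this pattern keeps a target stale, here is a minimal sketch. It assumes the build tool decides whether a file dependency is out of date by hashing its contents. `tools::md5sum()` and the temporary file are used only for illustration, and the pause is there just so the two timestamps are guaranteed to differ.

```r
# Same helper as above: overwrite the file with a timestamp plus a random integer
make_stale <- function(file){
  cat(file = file, paste0(format(Sys.time(), '%m/%d/%y %H:%M:%S '), sample(10000, 1)))
}

# Hypothetical trigger file in a temporary location
trigger_file <- tempfile(fileext = '.txt')

make_stale(trigger_file)
hash_before <- unname(tools::md5sum(trigger_file))

Sys.sleep(1.1)  # demo only: ensures the second timestamp is different

make_stale(trigger_file)
hash_after <- unname(tools::md5sum(trigger_file))

# FALSE: the contents changed, so anything depending on this file looks out of date
identical(hash_before, hash_after)
```

Because each command rewrites the trigger file while it runs, the file never matches the hash recorded for the previous build, so the targets that list it as an argument are rebuilt every time.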

jordansread commented 3 years ago

I think there are a few brittle spots here, but one of the simpler ones is the parser file list target in 7a_temp_coop_munge.yml:

  7a_temp_coop_munge/tmp/parser_files.yml:
    command: list_coop_files(target_name,
      dirpath = I('7a_temp_coop_munge/src/data_parsers'), dummy = I('2021-09-13'))

Anyone adding or modifying a parser needs to know to change the dummy input here so that the local yaml list of parser files gets rebuilt (like this).

I think in this case we'd want something like this:

target_default: all

sources:
  - example_utils.R

targets:
  all:
    depends:
      - 7a_temp_coop_munge/tmp/parser_files.yml
      - 7a_temp_coop_munge/out/all_coop_dat_linked.feather.ind

  7a_temp_coop_munge/tmp/parser_files.yml:
    command: list_coop_files(target_name,
      dirpath = I('7a_temp_coop_munge/src/data_parsers'),
      trigger_file = '7a_temp_coop_munge/tmp/always_stale_time.txt')

  coop_parsers:
    command: find_parser(coop_wants, '7a_temp_coop_munge/tmp/parser_files.yml',
      trigger_file = '7a_temp_coop_munge/tmp/always_stale_time.txt')

with a function like make_stale() from the earlier example called on the trigger_file each time it is used:

make_stale <- function(file){
  cat(file = file, paste0(format(Sys.time(), '%m/%d/%y %H:%M:%S '), sample(10000, 1)))
}
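
A minimal sketch of how a listing function might accept the proposed trigger_file argument. `list_parser_files_sketch()` is a hypothetical stand-in written in base R; the body of the real `list_coop_files()` is not shown in this thread and may differ (for example, in the format it writes). All paths below are throwaway temporary ones.

```r
make_stale <- function(file){
  cat(file = file, paste0(format(Sys.time(), '%m/%d/%y %H:%M:%S '), sample(10000, 1)))
}

# Hypothetical stand-in for list_coop_files(), not the repository's implementation
list_parser_files_sketch <- function(fileout, dirpath, trigger_file){
  # Rewrite the trigger first so the next build sees a changed dependency
  make_stale(trigger_file)

  # Record the parser files currently on disk, one per line
  parser_files <- sort(list.files(dirpath))
  writeLines(parser_files, fileout)
  invisible(fileout)
}

# Usage with a temporary directory standing in for the data_parsers folder
parser_dir <- file.path(tempdir(), 'data_parsers_demo')
dir.create(parser_dir, showWarnings = FALSE)
file.create(file.path(parser_dir, c('parse_a.R', 'parse_b.R')))

listing <- tempfile(fileext = '.txt')
trigger <- tempfile(fileext = '.txt')
list_parser_files_sketch(listing, parser_dir, trigger)
readLines(listing)
```

With this shape, nobody has to remember to bump a dummy date: the listing is regenerated on every build, and downstream targets only need to rerun if the listing's contents actually change.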