USGS-R / nawqa-pesticide-nsp

Workflow Updates for the NAWQA Pesticide National Synthesis Project

Determine why `tar_files_input` results in an error for `p1_pest_bin_csv` #6

Open padilla410 opened 2 years ago

padilla410 commented 2 years ago

A reproducible example of this issue is available on the $debug-targets-error branch.

In the $main branch, I am using tarchetypes::tar_files() instead of tar_target() to create targets because I want to track all the data input files that are being used to generate the pesticide maps. Using tar_files() allows me to track all of the input files and then branch over them to load the data (doc reference here).

The problem with using tar_files() is that it results in downstream targets that appear to be outdated when they are not. Here is a zoomed-in example: image

This is a known issue with tar_files. Here is a snippet from the tarchetypes doc:

tar_files_input() is like tar_files() but more convenient when the files in question already exist and are known in advance. Whereas tar_files() always appears outdated (e.g. with tar_outdated()) because it always needs to check which files it needs to branch over, tar_files_input() will appear up to date if the files have not changed since last tar_make(). In addition, tar_files_input() automatically groups input files into batches to reduce overhead and increase the efficiency of parallel processing.
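To make the difference in the doc snippet concrete, here is a minimal sketch of the two call forms side by side. The target names (example_files_dynamic, example_files_static) and the paths under data/ are hypothetical and are not taken from this repo.

```r
library(targets)
library(tarchetypes)

# Hypothetical input paths, for illustration only (not from this repo).
input_files <- c("data/a.csv", "data/b.csv")

pipeline <- list(
  # tar_files(): the second argument is a command. It is not run here;
  # it runs during tar_make(), which is why these targets always look outdated.
  tar_files(example_files_dynamic, list.files("data", full.names = TRUE)),

  # tar_files_input(): the second argument is a character vector of paths
  # that are already known when the pipeline is defined.
  tar_files_input(example_files_static, input_files)
)

# In a _targets.R file, the list of targets is the last expression.
pipeline
```

One thing this sketch suggests, as an unverified guess rather than a diagnosis: because tar_files_input() takes a vector of paths rather than a command, an argument that depends on an upstream target would have to be resolved while _targets.R is being sourced, before any target has been built.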

The problem is that when I make the switch from tar_files() to tar_files_input(), it works for some targets (e.g., p1_pest_hi_dbf and p1_pest_lo_dbf) but not others (e.g., p1_pest_bin_csv and p1_pest_label_csv). Here is the error that I bump into after making the switch:

Error in purrr::map_chr(., ~grep(paste("(\\.", .x, "\\.)", sep = ""),  : 
  object 'p1_pest_of_interest' not found
Error in `tar_throw_run()`:
! callr subprocess failed: object 'p1_pest_of_interest' not found
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Run `rlang::last_error()` to see where the error occurred.

AND, weirdly, when I debug using tar_option_set(debug = "p1_pest_bin_csv") and tar_make(callr_function = NULL), and then "step into the current function call" using the debugger, it works without an issue (successfully cycling through the 69 pesticides of interest listed in p1_pest_of_interest).

A few other required mods in $debug-targets-error

In addition to converting tar_files() calls to tar_files_input() in 1_fetch.R, I also had to make the following changes to the repo to get it to run: