MilesMcBain / tflow

An opinionated lightweight template for smooth targets flows.

conflicted doesn't work well with `tar_make_future` #6

Open MilesMcBain opened 3 years ago

MilesMcBain commented 3 years ago

Conflict preferences declared with conflicted in packages.R are not propagated to workers, so the workers hit errors on unresolved conflicts.

Will Landau says the preference data lives in a weird environment that is not convenient to copy over to the workers.

The suggested workaround is to store the conflicted::conflict_prefer() calls in a project .Rprofile file that will be automatically run by the workers.

I have reservations about project .Rprofiles, since they clobber the user .Rprofile.

It may be possible to store the settings in another file that the worker nodes could be configured to use, e.g.:

tar_make_future(
  workers = 2,
  callr_arguments = list(
    user_profile = TRUE,
    env = c(R_PROFILE_USER = "./.conflicts")
  )
)

This does not currently seem possible, since callr doesn't respect an R_PROFILE_USER supplied via env. See https://github.com/r-lib/callr/issues/193

noamross commented 3 years ago

As long as I'm here, I'll say that I put my conflicted statements in a project-level .Rprofile, but I also source the user .Rprofile when interactive. Here's the .Rprofile from my evolving project template:

if (file.exists(".env")) {
  try(readRenviron(".env"))
} else {
  message("No .env file")
}

if (file.exists("renv/activate.R")) {
  source("renv/activate.R")
} else {
  message("No renv/activate.R file. Is renv set up?")
}

# Use the local user's .Rprofile when interactive.
# Good for keeping local preferences, but not always reproducible.
user_rprof <- Sys.getenv("R_PROFILE_USER", normalizePath("~/.Rprofile", mustWork = FALSE))
if(interactive() && file.exists(user_rprof)) {
  source(user_rprof)
}

options(
  renv.config.auto.snapshot = TRUE, ## Attempt to keep renv.lock updated automatically
  renv.config.rspm.enabled = TRUE, ## Use RStudio Package manager for pre-built package binaries
  renv.config.install.shortcuts = TRUE, ## Use the existing local library to fetch copies of packages for renv
  renv.config.cache.enabled = TRUE,   ## Use the renv build cache to speed up install times
  renv.config.cache.symlinks = FALSE  ## Keep full copies of packages locally rather than symlinks, to make the project portable in/out of containers
)

# If project packages have conflicts define them here
if(requireNamespace("conflicted", quietly = TRUE)) {
  conflicted::conflict_prefer("filter", "dplyr", quiet = TRUE)
  conflicted::conflict_prefer("count", "dplyr", quiet = TRUE)
  conflicted::conflict_prefer("geom_rug", "ggplot2", quiet = TRUE)
  conflicted::conflict_prefer("set_names", "magrittr", quiet = TRUE)
  conflicted::conflict_prefer("View", "utils", quiet = TRUE)
}

Aariq commented 2 years ago

You can also use `tar_hook_before(hook = conflicted::conflict_prefer("filter", "dplyr"), names = <target names or tidyselector>)` from tarchetypes, which works with HPC and doesn't require a .Rprofile.
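
For anyone landing here later, a minimal sketch of how that hook might look in a `_targets.R` file. The target names (`raw_cars`, `efficient_cars`) and the `pipeline` variable are hypothetical, and this assumes tarchetypes is installed. Note that `tar_hook_before()` also takes the list of targets as its `targets` argument, which the one-liner above leaves out.

```r
# _targets.R (sketch only; target names below are made up for illustration)
library(targets)
library(tarchetypes)

# Packages attached for each target, including on workers.
tar_option_set(packages = c("conflicted", "dplyr"))

# Hypothetical pipeline: the bare filter() call would be an unresolved
# conflict between dplyr and stats once conflicted is attached.
pipeline <- list(
  tar_target(raw_cars, mtcars),
  tar_target(efficient_cars, filter(raw_cars, mpg > 25))
)

# Prepend the preference to each target's command so that it runs in
# whichever process builds the target.
tar_hook_before(
  targets = pipeline,
  hook = conflicted::conflict_prefer("filter", "dplyr", quiet = TRUE),
  names = NULL # NULL applies the hook to every target in `pipeline`
)
```

Because the hook becomes part of each target's command, the preference is set inside the worker itself rather than relying on anything copied over from the main session. The trade-off is that changing the hook changes the commands, so the affected targets will be invalidated and rerun.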