FredHutch / VISCtemplates

Tools for writing reproducible reports at VISC

universal install_git for network drives #184

Open mayerbry opened 2 months ago

mayerbry commented 2 months ago

Two related issues:

  1. install_git doesn't work with network drives on Macs, because of the parsing of /Volumes/ as the front end of the file path
  2. Operating systems map network drives to different file paths, so reports end up with multiple paths written into install_git depending on who is working on them
    • Ex. remotes::install_git("N:/cavd/studies/cvd785/pdata/Ho785.git") vs. remotes::install_git("/Volumes/cavd/studies/cvd785/pdata/Ho785.git")

Code I had started


# example would be successful for `install_git_pdata("cavd/studies/cvd785/pdata/Ho785.git")`

install_git_pdata = function(pdata_path = "", try_prefix = NULL, ...){

  sys_os = .get_os()

  network_path = .make_network_path(pdata_path, sys_os, try_prefix)

  if(sys_os == "osx") .install_git_mac(network_path, ...) else remotes::install_git(network_path, ...)

  invisible()
}

.get_os <- function(){
  sysinf <- Sys.info()
  if (!is.null(sysinf)){
    os <- sysinf['sysname']
    if (os == 'Darwin')
      os <- "osx"
  } else { ## mystery machine
    os <- .Platform$OS.type
    if (grepl("^darwin", R.version$os))
      os <- "osx"
    if (grepl("linux-gnu", R.version$os))
      os <- "linux"
  }
  tolower(os)
}

.make_network_path = function(pdata_path, sys_os, try_prefix) {

  auto_prefix = .get_network_prefix(sys_os)

  all_prefix = c(auto_prefix, try_prefix)
  stopifnot(length(all_prefix) > 0)

  try_paths = unique(file.path(all_prefix, pdata_path))

  # dir.exists() is vectorized, so this checks every candidate path at once
  check_paths = dir.exists(try_paths)
  if(!any(check_paths)) stop("no directories found, try adding a new prefix to try_prefix")
  if(sum(check_paths) > 1) warning("multiple matching paths") # I'm not sure if this is a real problem

  # if multiple paths work, need to figure that out
  try_paths[check_paths]

}

.get_network_prefix = function(sys_os){
  switch(sys_os, "osx" = "/Volumes", "windows" = "H:", "")
}

.install_git_mac = function(pdata_path, ...){
  stopifnot(file.exists(pdata_path))
  current_dir = getwd()
  # restore the working directory even if the install fails
  on.exit(setwd(current_dir), add = TRUE)
  setwd(dirname(pdata_path))
  remotes::install_git(basename(pdata_path), ...)
  invisible()
}

slager commented 1 month ago

We're basically doing 2 things here:

  1. cloning a package repo
  2. installing the data package

remotes::install_git() is nice when it works because it wraps these into a single step, but I've seen it choke on directory permissions, GitHub credentials, and now the macOS network drive issue. I've had problems myself with this, and I've also helped different SRAs through these problems.

An approach that doesn't require any additional development is just to isolate the cloning from the installing.

  1. git clone the repo using whatever method you normally use
  2. if you don't want main/master, git checkout the desired branch
  3. git pull
  4. open the datapackage .Rproj file
  5. call devtools::install() from the console

This doesn't require remotes::install_git() to correctly figure out your git permissions, directory permissions, or network drives, and doesn't require additional development to work around OS-specific differences. If you can clone it, you can install it.
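
For anyone who would rather script the five steps above than run them by hand, here is a rough sketch. The helper names `git_steps()` and `clone_and_install()` are made up for illustration and are not part of VISCtemplates; the sketch assumes `git` is on the PATH, that `devtools` is installed, and that `dest` does not exist yet (otherwise `git clone` fails). Calling `devtools::install(dest)` stands in for opening the .Rproj and installing from the console.

```r
# Hypothetical helper: build the git argument vectors for steps 1-3.
git_steps <- function(repo, dest, branch = NULL) {
  steps <- list(c("clone", repo, dest))                    # 1. git clone
  if (!is.null(branch)) {
    steps <- c(steps, list(c("-C", dest, "checkout", branch)))  # 2. git checkout
  }
  c(steps, list(c("-C", dest, "pull")))                    # 3. git pull
}

# Hypothetical helper: run the git steps, then install from the local clone.
clone_and_install <- function(repo, dest, branch = NULL, dry_run = FALSE) {
  dest <- path.expand(dest)  # expand "~" ourselves, since the args are quoted
  steps <- git_steps(repo, dest, branch)
  if (dry_run) return(invisible(steps))  # only report what would be run

  for (args in steps) {
    status <- system2("git", args = shQuote(args))
    if (status != 0) {
      stop("`git ", paste(args, collapse = " "), "` failed", call. = FALSE)
    }
  }

  devtools::install(dest)                                  # 4-5. install the clone
  invisible(dest)
}
```

Something like `clone_and_install("/Volumes/cavd/studies/cvd785/pdata/Ho785.git", file.path(tempdir(), "Ho785"))` would then clone and install in one call, and `dry_run = TRUE` returns the planned git commands without running anything. The repo path still has to be the one that works on your machine, so this does not remove the OS-specific prefix problem.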

Note: If there's a need to install a network drive datapackage programmatically from inside a report and across different OS, I guess this doesn't solve that. We could also consider just using visc_load_pdata() inside a report and doing the actual data package installation manually. If the pdata hash is wrong or the datapackage isn't installed, visc_load_pdata() throws an informative error.