prioritizr / benchmark

Benchmark performance of exact algorithm solvers for conservation planning
GNU General Public License v3.0
Attempt benchmarks #4

Closed jeffreyhanson closed 3 years ago

jeffreyhanson commented 3 years ago

I've managed to get the benchmark analysis running on a server, so I think it's all working now. @ricschuster, when you get a chance, could you please try running it on your system and see if it works? The Makefile is currently configured to run a small, pared-down version of the analysis to help identify errors/issues quickly. So running the system command make clean all should be a good test. Once we've verified that it works correctly, I'll update the parameters in the Makefile to run the full analysis. How does that sound?

ricschuster commented 3 years ago

I think packrat is mostly there, but I do get an error with the command make install:

R CMD BATCH --no-restore --no-save '--args --bootstrap-packrat' packrat/init.R
make: *** [Makefile:53: install] Error 1

When I start R, it looks like it's related to the gurobi package:

Error in library(x, character.only = TRUE, quietly = TRUE) : 
  there is no package called ‘gurobi’
In addition: Warning messages:
1: In file.symlink(from, to) :
  cannot symlink '/usr/lib/R/library/base' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-R/x86_64-pc-linux-gnu/4.0.4/base', reason 'Function not implemented'
2: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/gurobi' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/gurobi', reason 'Function not implemented'
3: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/slam' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/slam', reason 'Function not implemented'
4: In symlinkExternalPackages(project = project) :
  The following external packages could not be linked into the packrat private library:
- 'gurobi', 'slam'

Both /usr/local/lib/R/site-library/gurobi and /usr/local/lib/R/site-library/slam do exist.

Do you know what the issue could be?

jeffreyhanson commented 3 years ago

Hmm, there should be an init.Rout file somewhere in the repository, either in the benchmark directory or in the benchmark/data/intermediate directory. Can you find it? If so, can you open it, scroll to the end, and see if it threw any warnings?
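As a rough sketch of that search, the shell function below looks for init.Rout files under a directory and prints the last few lines of each. The function name show_rout_tail and the default of 20 lines are illustrative choices, not part of the benchmark repository.

```shell
# Sketch: locate init.Rout files under a directory and print the tail of each.
# show_rout_tail is a hypothetical helper name; 20 lines is an arbitrary default.
show_rout_tail() {
  dir="${1:-.}"      # directory to search (defaults to the current one)
  lines="${2:-20}"   # number of trailing lines to print per file
  find "$dir" -type f -name 'init.Rout' | while IFS= read -r f; do
    printf '==> %s <==\n' "$f"
    tail -n "$lines" "$f"
  done
}
```

For example, running show_rout_tail . 30 from the repository root would print the last 30 lines of every init.Rout it finds, which is where R CMD BATCH usually records the error and any warnings.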

jeffreyhanson commented 3 years ago

Packrat is (currently) configured to use symlinks to avoid copying the gurobi and slam folders. It seems like this might not be working on your system. Could you please try creating a symlink on your computer and see if it works? E.g. something like this in the terminal:

# create folder
mkdir test-dir1
# make file
echo "this is a test" > test-dir1/test.txt
# create symlink folder test-dir2
ln -s test-dir1 test-dir2
# see if we can find the test file
cat test-dir2/test.txt

After running these commands, you should see this printed to screen:

this is a test

ricschuster commented 3 years ago

Hmm, there should be an init.Rout file somewhere in the repository, either in the benchmark directory or in the benchmark/data/intermediate directory. Can you find it? If so, can you open it, scroll to the end, and see if it threw any warnings?

Here is the log:


R version 4.0.4 (2021-02-15) -- "Lost Library Book"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Error in library(x, character.only = TRUE, quietly = TRUE) : 
  there is no package called ‘gurobi’
Calls: source ... setPackratModeOn -> afterPackratModeOn -> lapply -> FUN -> library
In addition: Warning messages:
1: In file.symlink(from, to) :
  cannot symlink '/usr/lib/R/library/base' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-R/x86_64-pc-linux-gnu/4.0.4/base', reason 'Function not implemented'
2: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/gurobi' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/gurobi', reason 'Function not implemented'
3: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/slam' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/slam', reason 'Function not implemented'
4: In symlinkExternalPackages(project = project) :
  The following external packages could not be linked into the packrat private library:
- 'gurobi', 'slam'
No traceback available

ricschuster commented 3 years ago

Packrat is (currently) configured to use symlinks to avoid copying the gurobi and slam folders. It seems like this might not be working on your system. Could you please try creating a symlink on your computer and see if it works? E.g. something like this in the terminal:

# create folder
mkdir test-dir1
# make file
echo "this is a test" > test-dir1/test.txt
# create symlink folder test-dir2
ln -s test-dir1 test-dir2
# see if we can find the test file
cat test-dir2/test.txt

After running these commands, you should see this printed to screen:

this is a test

That worked. That means symlinks should work, right?

jeffreyhanson commented 3 years ago

Yeah, I would have thought so. Just to be sure, I'll update the packrat config to disable symlinks to verify that this isn't the issue. I'll let you know when I've pushed an update.
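One caveat: whether a symlink can be created depends on the filesystem that holds the directory where the link is made, so a test in one directory does not necessarily cover another. The warnings above refer to a destination under /media/richard/DATA, which may be on a different filesystem from wherever the terminal test was run; that is an assumption, not something the logs confirm. A minimal sketch of a check that targets a specific directory (the function name can_symlink and the temporary file names are hypothetical):

```shell
# Sketch: test whether symlinks can be created inside a given directory.
# Symlink support is a property of the filesystem holding that directory,
# so point this at the project directory rather than an arbitrary location.
can_symlink() {
  dir="${1:-.}"
  name=".symlink-test-target.$$"            # scratch file name, unique per shell
  link="$dir/.symlink-test-link.$$"
  : > "$dir/$name" || return 2              # cannot even write to the directory
  if ln -s "$name" "$link" 2>/dev/null; then
    rm -f "$link" "$dir/$name"
    return 0                                # symlinks can be created here
  fi
  rm -f "$dir/$name"
  return 1                                  # symlink creation failed
}
```

For example, can_symlink /media/richard/DATA/Work/R/benchmark-master && echo "symlinks OK" || echo "symlinks not supported here". On Linux, df -T followed by the same path reports the filesystem type, which can help explain a failure.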

ricschuster commented 3 years ago

Thanks Jeff!

jeffreyhanson commented 3 years ago

Ok, I've pushed an update to disable symlinks. Unfortunately, I do not know how to restart the packrat setup after it fails. So, could you please delete your local copy of the benchmark repo, clone the repo from GitHub again, and use the command make install to try installing the dependencies via packrat?

ricschuster commented 3 years ago

Now it's complaining about cplexAPI.

I think it's because I have it installed here: /home/richard/R/x86_64-pc-linux-gnu-library/4.0/ and it's looking for it here: '/usr/lib/R/bin/R'

Is that the case from looking at the log?

Packrat is not installed in the local library -- attempting to bootstrap an installation...
> Installing packrat into project private library:
- 'packrat/lib/x86_64-pc-linux-gnu/4.0.4'
* installing *source* package ‘packrat’ ...
** package ‘packrat’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (packrat)
> Attaching packrat
> Restoring library
Installing BH (1.75.0-0) ... 
    OK (built source)
Installing DBI (1.1.1) ... 
    OK (built source)
Installing KernSmooth (2.23-18) ... 
    OK (built source)
Installing MASS (7.3-53) ... 
    OK (built source)
Installing R.methodsS3 (1.8.1) ... 
    OK (built source)
Installing R6 (2.5.0) ... 
    OK (built source)
Installing Rcpp (1.0.6) ... 
    OK (built source)
Installing Rsymphony (0.1-29) ... 
    OK (built source)
Installing assertthat (0.2.1) ... 
    OK (built source)
Installing base64enc (0.1-3) ... 
    OK (built source)
Installing bitops (1.0-6) ... 
    OK (built source)
Installing clipr (0.7.1) ... 
    OK (built source)
Installing clisymbols (1.2.0) ... 
    OK (built source)
Installing cplexAPI (1.4.0) ... 
[1] "Command failed (1)\n\nFailed to run system command:\n\n\t'/usr/lib/R/bin/R' --vanilla CMD INSTALL '/tmp/RtmpAMtz8N/cplexAPI' --library='/media/richard/DATA/Work/R/benchmark-master/packrat/lib/x86_64-pc-linux-gnu/4.0.4' --install-tests --no-docs --no-multiarch --no-demo \n\nThe command failed with output:\n* installing *source* package 'cplexAPI' ...\n** package 'cplexAPI' successfully unpacked and MD5 sums checked\n** using staged installation\nchecking for gcc... gcc -std=gnu99\nchecking whether the C compiler works... yes\nchecking for C compiler default output file name... a.out\nchecking for suffix of executables... \nchecking whether we are cross compiling... no\nchecking for suffix of object files... o\nchecking whether we are using the GNU C compiler... yes\nchecking whether gcc -std=gnu99 accepts -g... yes\nchecking for gcc -std=gnu99 option to accept ISO C89... none needed\nchecking how to run the C preprocessor... gcc -std=gnu99 -E\nchecking for cplex... NONE\nconfigure: NOTICE NONE ermittelt\nconfigure: error: CPLEX interactive optimizer not found\nERROR: configuration failed for package 'cplexAPI'\n* removing '/media/richard/DATA/Work/R/benchmark-master/packrat/lib/x86_64-pc-linux-gnu/4.0.4/cplexAPI'"
Error: Command failed (1)

Failed to run system command:

    '/usr/lib/R/bin/R' --vanilla CMD INSTALL '/tmp/RtmpAMtz8N/cplexAPI' --library='/media/richard/DATA/Work/R/benchmark-master/packrat/lib/x86_64-pc-linux-gnu/4.0.4' --install-tests --no-docs --no-multiarch --no-demo 

The command failed with output:
* installing *source* package 'cplexAPI' ...
** package 'cplexAPI' successfully unpacked and MD5 sums checked
** using staged installation
checking for gcc... gcc -std=gnu99
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for cplex... NONE
configure: NOTICE NONE ermitt
In addition: There were 29 warnings (use warnings() to see them)
No traceback available

jeffreyhanson commented 3 years ago

Ok excellent - looks like we resolved the symlink issue. Hmm, the line "checking for cplex... NONE" would suggest that it can't find CPLEX on your system. Have you set the CPLEX_BIN environment variable in your .bashrc file? You should have a line in it that looks something like this:

export CPLEX_BIN="/opt/ibm/ILOG/CPLEX_Studio128/cplex/bin/x86-64_linux/cplex"

(Note: your version number, and therefore the path, might be different.)
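A quick way to confirm the variable is visible to the shell that runs make install is sketched below. The function name check_cplex_bin is hypothetical, and the check only verifies that CPLEX_BIN is set and points to an executable file; it does not verify that the file is actually the CPLEX interactive optimizer.

```shell
# Sketch: sanity-check the CPLEX_BIN variable in the current shell.
# check_cplex_bin is a hypothetical helper, not part of cplexAPI or the repo.
check_cplex_bin() {
  if [ -z "${CPLEX_BIN:-}" ]; then
    echo "CPLEX_BIN is not set"
    return 1
  fi
  # Must be an executable regular file, not a directory.
  if [ ! -x "$CPLEX_BIN" ] || [ -d "$CPLEX_BIN" ]; then
    echo "CPLEX_BIN does not point to an executable file: $CPLEX_BIN"
    return 1
  fi
  echo "CPLEX_BIN looks OK: $CPLEX_BIN"
}
```

Keep in mind that edits to .bashrc only take effect in new shells, or in the current one after running source ~/.bashrc, so run check_cplex_bin in the same terminal session that will run make install.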

ricschuster commented 3 years ago

Fixed the export issue and cplexAPI installs now, but the symlink issue with gurobi persists:


R version 4.0.4 (2021-02-15) -- "Lost Library Book"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Packrat is not installed in the local library -- attempting to bootstrap an installation...
> Installing packrat into project private library:
- 'packrat/lib/x86_64-pc-linux-gnu/4.0.4'
* installing *source* package ‘packrat’ ...
** package ‘packrat’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (packrat)
> Attaching packrat
> Restoring library
Installing BH (1.75.0-0) ... 
    OK (built source)
Installing DBI (1.1.1) ... 
    OK (built source)
Installing KernSmooth (2.23-18) ... 
    OK (built source)
Installing MASS (7.3-53) ... 
    OK (built source)
Installing R.methodsS3 (1.8.1) ... 
    OK (built source)
Installing R6 (2.5.0) ... 
    OK (built source)
Installing Rcpp (1.0.6) ... 
    OK (built source)
Installing Rsymphony (0.1-29) ... 
    OK (built source)
Installing assertthat (0.2.1) ... 
    OK (built source)
Installing base64enc (0.1-3) ... 
    OK (built source)
Installing bitops (1.0-6) ... 
    OK (built source)
Installing clipr (0.7.1) ... 
    OK (built source)
Installing clisymbols (1.2.0) ... 
    OK (built source)
Installing cplexAPI (1.4.0) ... 
    OK (built source)
Installing cpp11 (0.2.6) ... 
    OK (built source)
Installing crayon (1.4.1) ... 
    OK (built source)
Installing curl (4.3) ... 
    OK (built source)
Installing data.table (1.14.0) ... 
    OK (built source)
Installing digest (0.6.27) ... 
    OK (built source)
Installing evaluate (0.14) ... 
    OK (built source)
Installing fansi (0.4.2) ... 
    OK (built source)
Installing fastmap (1.1.0) ... 
    OK (built source)
Installing fs (1.5.0) ... 
    OK (built source)
Installing generics (0.1.0) ... 
    OK (built source)
Installing git2r (0.28.0) ... 
    OK (built source)
Installing gitcreds (0.1.1) ... 
    OK (built source)
Installing glue (1.4.2) ... 
    OK (built source)
Installing highr (0.8) ... 
    OK (built source)
Installing ini (0.3.1) ... 
    OK (built source)
Installing iterators (1.0.13) ... 
    OK (built source)
Installing jsonlite (1.7.2) ... 
    OK (built source)
Installing lpsymphony (1.17.0) ... 
    OK (built source)
Installing magrittr (2.0.1) ... 
    OK (built source)
Installing mime (0.9) ... 
    OK (built source)
Installing pkgconfig (2.0.3) ... 
    OK (built source)
Installing proto (1.0.0) ... 
    OK (built source)
Installing rappdirs (0.3.3) ... 
    OK (built source)
Installing remotes (2.2.0) ... 
    OK (built source)
Installing rlang (0.4.10) ... 
    OK (built source)
Installing rprojroot (2.0.2) ... 
    OK (built source)
Installing rstudioapi (0.13) ... 
    OK (built source)
Installing session (1.0.3) ... 
    OK (built source)
Installing sp (1.4-5) ... 
    OK (built source)
Installing stringi (1.5.3) ... 
    OK (built source)
Installing sys (3.4) ... 
    OK (built source)
Installing utf8 (1.1.4) ... 
    OK (built source)
Installing uuid (0.1-4) ... 
    OK (built source)
Installing whisker (0.4) ... 
    OK (built source)
Installing withr (2.4.1) ... 
    OK (built source)
Installing xfun (0.19) ... 
    OK (built source)
Installing yaml (2.2.1) ... 
    OK (built source)
Installing zip (2.1.1) ... 
    OK (built source)
Installing class (7.3-17) ... 
    OK (built source)
Installing R.oo (1.24.0) ... 
    OK (built source)
Installing RcppArmadillo (0.10.2.1.0) ... 
    OK (built source)
Installing RcppTOML (0.1.7) ... 
    OK (built source)
Installing ape (5.4-1) ... 
    OK (built source)
Installing plyr (1.8.6) ... 
    OK (built source)
Installing units (0.6-7) ... 
    OK (built source)
Installing rcbc (0.1.0.9001) ... 
    OK (built source)
Installing caTools (1.18.0) ... 
    OK (built source)
Installing lubridate (1.7.9.2) ... 
    OK (built source)
Installing cli (2.3.1) ... 
    OK (built source)
Installing foreach (1.5.1) ... 
    OK (built source)
Installing igraph (1.2.6) ... 
    OK (built source)
Installing cachem (1.0.4) ... 
    OK (built source)
Installing ellipsis (0.3.1) ... 
    OK (built source)
Installing htmltools (0.5.1.1) ... 
    OK (built source)
Installing lifecycle (1.0.0) ... 
    OK (built source)
Installing purrr (0.3.4) ... 
    OK (built source)
Installing desc (1.2.0) ... 
    OK (built source)
Installing raster (3.4-5) ... 
    OK (built source)
Installing rgdal (1.5-23) ... 
    OK (built source)
Installing rgeos (0.5-5) ... 
    OK (built source)
Installing stringr (1.4.0) ... 
    OK (built source)
Installing askpass (1.1) ... 
    OK (built source)
Installing markdown (1.1) ... 
    OK (built source)
Installing e1071 (1.7-4) ... 
    OK (built source)
Installing R.utils (2.10.1) ... 
    OK (built source)
Installing doParallel (1.0.16) ... 
    OK (built source)
Installing memoise (2.0.0) ... 
    OK (built source)
Installing vctrs (0.3.6) ... 
    OK (built source)
Installing fasterize (1.0.3) ... 
    OK (built source)
Installing openssl (1.4.3) ... 
    OK (built source)
Installing knitr (1.30) ... 
    OK (built source)
Installing classInt (0.4-3) ... 
    OK (built source)
Installing gdalUtils (2.0.3.2) ... 
    OK (built source)
Installing hms (1.0.0) ... 
    OK (built source)
Installing pillar (1.5.0) ... 
    OK (built source)
Installing tidyselect (1.1.0) ... 
    OK (built source)
Installing credentials (1.3.0) ... 
    OK (built source)
Installing httr (1.4.2) ... 
    OK (built source)
Installing rmarkdown (1.2) ... 
    OK (built source)
Installing sf (0.9-7) ... 
    OK (built source)
Installing tibble (3.0.6) ... 
    OK (built source)
Installing gert (1.2.0) ... 
    OK (built source)
Installing gh (1.2.0) ... 
    OK (built source)
Installing exactextractr (0.5.1) ... 
    OK (built source)
Installing dplyr (1.0.2) ... 
    OK (built source)
Installing readr (1.4.0) ... 
    OK (built source)
Installing usethis (2.0.1) ... 
    OK (built source)
Installing prioritizr (6.0.0.2) ... 
    OK (built source)
Installing piggyback (0.0.11) ... 
    OK (built source)
There were 50 or more warnings (use warnings() to see the first 50)
Warning messages:
1: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/gurobi' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/gurobi', reason 'Function not implemented'
2: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/slam' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/slam', reason 'Function not implemented'
3: In symlinkExternalPackages(project = project) :
  The following external packages could not be linked into the packrat private library:
- 'gurobi', 'slam'
> local({
+ 
+   ## Helper function to get the path to the library directory for a
+   ## given packrat project.
+   getPackratLibDir <- function(projDir = NULL) {
+     path <- file.path("packrat", "lib", R.version$platform, getRversion())
+ 
+     if (!is.null(projDir)) {
+ 
+       ## Strip trailing slashes if necessary
+       projDir <- sub("/+$", "", projDir)
+ 
+       ## Only prepend path if different from current working dir
+       if (!identical(normalizePath(projDir), normalizePath(getwd())))
+         path <- file.path(projDir, path)
+     }
+ 
+     path
+   }
+ 
+   ## Ensure that we set the packrat library directory relative to the
+   ## project directory. Normally, this should be the working directory,
+   ## but we also use '.rs.getProjectDirectory()' if necessary (e.g. we're
+   ## rebuilding a project while within a separate directory)
+   libDir <- if (exists(".rs.getProjectDirectory"))
+     getPackratLibDir(.rs.getProjectDirectory())
+   else
+     getPackratLibDir()
+ 
+   ## Unload packrat in case it's loaded -- this ensures packrat _must_ be
+   ## loaded from the private library. Note that `requireNamespace` will
+   ## succeed if the package is already loaded, regardless of lib.loc!
+   if ("packrat" %in% loadedNamespaces())
+     try(unloadNamespace("packrat"), silent = TRUE)
+ 
+   if (suppressWarnings(requireNamespace("packrat", quietly = TRUE, lib.loc = libDir))) {
+ 
+     # Check 'print.banner.on.startup' -- when NA and RStudio, don't print
+     print.banner <- packrat::get_opts("print.banner.on.startup")
+     if (print.banner == "auto" && is.na(Sys.getenv("RSTUDIO", unset = NA))) {
+       print.banner <- TRUE
+     } else {
+       print.banner <- FALSE
+     }
+     return(packrat::on(print.banner = print.banner))
+   }
+ 
+   ## Escape hatch to allow RStudio to handle bootstrapping. This
+   ## enables RStudio to provide print output when automagically
+   ## restoring a project from a bundle on load.
+   if (!is.na(Sys.getenv("RSTUDIO", unset = NA)) &&
+       is.na(Sys.getenv("RSTUDIO_PACKRAT_BOOTSTRAP", unset = NA))) {
+     Sys.setenv("RSTUDIO_PACKRAT_BOOTSTRAP" = "1")
+     setHook("rstudio.sessionInit", function(...) {
+       # Ensure that, on sourcing 'packrat/init.R', we are
+       # within the project root directory
+       if (exists(".rs.getProjectDirectory")) {
+         owd <- getwd()
+         setwd(.rs.getProjectDirectory())
+         on.exit(setwd(owd), add = TRUE)
+       }
+       source("packrat/init.R")
+     })
+     return(invisible(NULL))
+   }
+ 
+   ## Bootstrapping -- only performed in interactive contexts,
+   ## or when explicitly asked for on the command line
+   if (interactive() || "--bootstrap-packrat" %in% commandArgs(TRUE)) {
+ 
+     needsRestore <- "--bootstrap-packrat" %in% commandArgs(TRUE)
+ 
+     message("Packrat is not installed in the local library -- ",
+             "attempting to bootstrap an installation...")
+ 
+     ## We need utils for the following to succeed -- there are calls to functions
+     ## in 'restore' that are contained within utils. utils gets loaded at the
+     ## end of start-up anyhow, so this should be fine
+     library("utils", character.only = TRUE)
+ 
+     ## Install packrat into local project library
+     packratSrcPath <- list.files(full.names = TRUE,
+                                  file.path("packrat", "src", "packrat")
+     )
+ 
+     ## No packrat tarballs available locally -- try some other means of installation
+     if (!length(packratSrcPath)) {
+ 
+       message("> No source tarball of packrat available locally")
+ 
+       ## There are no packrat sources available -- try using a version of
+       ## packrat installed in the user library to bootstrap
+       if (requireNamespace("packrat", quietly = TRUE) && packageVersion("packrat") >= "0.2.0.99") {
+         message("> Using user-library packrat (",
+                 packageVersion("packrat"),
+                 ") to bootstrap this project")
+       }
+ 
+       ## Couldn't find a user-local packrat -- try finding and using devtools
+       ## to install
+       else if (requireNamespace("devtools", quietly = TRUE)) {
+         message("> Attempting to use devtools::install_github to install ",
+                 "a temporary version of packrat")
+         library(stats) ## for setNames
+         devtools::install_github("rstudio/packrat")
+       }
+ 
+       ## Try downloading packrat from CRAN if available
+       else if ("packrat" %in% rownames(available.packages())) {
+         message("> Installing packrat from CRAN")
+         install.packages("packrat")
+       }
+ 
+       ## Fail -- couldn't find an appropriate means of installing packrat
+       else {
+         stop("Could not automatically bootstrap packrat -- try running ",
+              "\"'install.packages('devtools'); devtools::install_github('rstudio/packrat')\"",
+              "and restarting R to bootstrap packrat.")
+       }
+ 
+       # Restore the project, unload the temporary packrat, and load the private packrat
+       if (needsRestore)
+         packrat::restore(prompt = FALSE, restart = TRUE)
+ 
+       ## This code path only reached if we didn't restart earlier
+       unloadNamespace("packrat")
+       requireNamespace("packrat", lib.loc = libDir, quietly = TRUE)
+       return(packrat::on())
+ 
+     }
+ 
+     ## Multiple packrat tarballs available locally -- try to choose one
+     ## TODO: read lock file and infer most appropriate from there; low priority because
+     ## after bootstrapping packrat a restore should do the right thing
+     if (length(packratSrcPath) > 1) {
+       warning("Multiple versions of packrat available in the source directory;",
+               "using packrat source:\n- ", shQuote(packratSrcPath))
+       packratSrcPath <- packratSrcPath[[1]]
+     }
+ 
+ 
+     lib <- file.path("packrat", "lib", R.version$platform, getRversion())
+     if (!file.exists(lib)) {
+       dir.create(lib, recursive = TRUE)
+     }
+ 
+     message("> Installing packrat into project private library:")
+     message("- ", shQuote(lib))
+ 
+     surround <- function(x, with) {
+       if (!length(x)) return(character())
+       paste0(with, x, with)
+     }
+ 
+ 
+     ## Invoke install.packages() in clean R session
+     peq <- function(x, y) paste(x, y, sep = " = ")
+     installArgs <- c(
+       peq("pkgs", surround(packratSrcPath, with = "'")),
+       peq("lib", surround(lib, with = "'")),
+       peq("repos", "NULL"),
+       peq("type", surround("source", with = "'"))
+     )
+ 
+     fmt <- "utils::install.packages(%s)"
+     installCmd <- sprintf(fmt, paste(installArgs, collapse = ", "))
+ 
+     ## Write script to file (avoid issues with command line quoting
+     ## on R 3.4.3)
+     installFile <- tempfile("packrat-bootstrap", fileext = ".R")
+     writeLines(installCmd, con = installFile)
+     on.exit(unlink(installFile), add = TRUE)
+ 
+     fullCmd <- paste(
+       surround(file.path(R.home("bin"), "R"), with = "\""),
+       "--vanilla",
+       "--slave",
+       "-f",
+       surround(installFile, with = "\"")
+     )
+     system(fullCmd)
+ 
+     ## Tag the installed packrat so we know it's managed by packrat
+     ## TODO: should this be taking information from the lockfile? this is a bit awkward
+     ## because we're taking an un-annotated packrat source tarball and simply assuming it's now
+     ## an 'installed from source' version
+ 
+     ## -- InstallAgent -- ##
+     installAgent <- "InstallAgent: packrat 0.5.0"
+ 
+     ## -- InstallSource -- ##
+     installSource <- "InstallSource: source"
+ 
+     packratDescPath <- file.path(lib, "packrat", "DESCRIPTION")
+     DESCRIPTION <- readLines(packratDescPath)
+     DESCRIPTION <- c(DESCRIPTION, installAgent, installSource)
+     cat(DESCRIPTION, file = packratDescPath, sep = "\n")
+ 
+     # Otherwise, continue on as normal
+     message("> Attaching packrat")
+     library("packrat", character.only = TRUE, lib.loc = lib)
+ 
+     message("> Restoring library")
+     if (needsRestore)
+       packrat::restore(prompt = FALSE, restart = FALSE)
+ 
+     # If the environment allows us to restart, do so with a call to restore
+     restart <- getOption("restart")
+     if (!is.null(restart)) {
+       message("> Packrat bootstrap successfully completed. ",
+               "Restarting R and entering packrat mode...")
+       return(restart())
+     }
+ 
+     # Callers (source-erers) can define this hidden variable to make sure we don't enter packrat mode
+     # Primarily useful for testing
+     if (!exists(".__DONT_ENTER_PACKRAT_MODE__.") && interactive()) {
+       message("> Packrat bootstrap successfully completed. Entering packrat mode...")
+       packrat::on()
+     }
+ 
+     Sys.unsetenv("RSTUDIO_PACKRAT_BOOTSTRAP")
+ 
+   }
+ 
+ })
Error in library(x, character.only = TRUE, quietly = TRUE) : 
  there is no package called ‘gurobi’
Calls: local ... setPackratModeOn -> afterPackratModeOn -> lapply -> FUN -> library
In addition: Warning messages:
1: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/gurobi' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/gurobi', reason 'Function not implemented'
2: In file.symlink(from, to) :
  cannot symlink '/usr/local/lib/R/site-library/slam' to '/media/richard/DATA/Work/R/benchmark-master/packrat/lib-ext/x86_64-pc-linux-gnu/4.0.4/slam', reason 'Function not implemented'
3: In symlinkExternalPackages(project = project) :
  The following external packages could not be linked into the packrat private library:
- 'gurobi', 'slam'
No traceback available

ricschuster commented 3 years ago

I just ran into an issue with pu_size_factor of 5.

02-prepare-data.R throws an error on line 53 assertthat::assert_that(all(!is.na(r2[idx])))

Error: Elements 850105, 850895, 851686, 852473, 853262, ... of !is.na(r2[idx]) are not true

Could you have a look, @jeffreyhanson?

jeffreyhanson commented 3 years ago

Ah - thanks for letting me know. Yeah, I'll start looking into it now.

jeffreyhanson commented 3 years ago

I've just pushed a commit which should fix this issue. Could you please see if it works now?

jeffreyhanson commented 3 years ago

Whoops, somehow got a merge conflict. Give me a minute.

jeffreyhanson commented 3 years ago

Ok - could you please try it now?

jeffreyhanson commented 3 years ago

Just spotted another typo and fixed it, so it should be good now.

ricschuster commented 3 years ago

Thanks Jeff!

I'm still getting Elements 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 of !is.na(r2[idx]) are not true for both debug and release. Maybe I'm doing something wrong?

Does make data/intermediate/02-*.rda complete for you?

jeffreyhanson commented 3 years ago

Ah - sorry, I didn't realize you'd updated the resolutions. It worked for a resolution of 500 for me. I'll pull the latest version and take a look.

jeffreyhanson commented 3 years ago

Just to check, did you run make clean after pulling the latest commit?

ricschuster commented 3 years ago

I did. For me 500 did not work. 100 and 200 worked before. 500 and higher did not. Not sure why. Maybe it's something on my end?

jeffreyhanson commented 3 years ago

Hmm, might have something to do with GDAL versions. What GDAL version are you using? E.g.

gdalwarp --version
# GDAL 2.2.2, released 2017/09/15
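To compare versions between two machines without eyeballing the output, the version number can be pulled out of that line. The sketch below assumes the output format shown above ("GDAL x.y.z, released ..."); the function names are hypothetical. Note also that the command-line gdalwarp and the GDAL library that R packages link against may come from different installs, so it can be worth checking both.

```shell
# Sketch: extract the version number from a line such as
# "GDAL 2.2.2, released 2017/09/15". Prints nothing if the line does not match.
gdal_version_from_line() {
  printf '%s\n' "$1" | sed -n 's/^GDAL \([0-9][0-9.]*\).*/\1/p'
}

# Major version only, e.g. "2" for "2.2.2".
gdal_major() {
  gdal_version_from_line "$1" | cut -d. -f1
}
```

For example, gdal_major "$(gdalwarp --version)" prints the major version of the command-line tools on the current machine.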

jeffreyhanson commented 3 years ago

Can you send me the 02-prepare-data.Rout log file so I can compare against mine?

jeffreyhanson commented 3 years ago

Specifically, that log file after running with the debug parameters.

jeffreyhanson commented 3 years ago

I'll push a commit to print some information about the resolutions so we can see exactly where it stops.

ricschuster commented 3 years ago

Here is the log:


R version 4.0.4 (2021-02-15) -- "Lost Library Book"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # restore session
> session::restore.session(session_path("01"))
Loading all data...
Loading packages...
Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Restoring search path...
Done.
> 
> # set raster processing options
> raster::rasterOptions(
+   maxmemory = general_parameters$raster_maxmemory,
+   chunksize = general_parameters$raster_chunksize)
> 
> # import parameters
> data_parameters <-
+   RcppTOML::parseTOML("code/parameters/data.toml")[[MODE]]
> 
> # import data
> full_pu_data <- readRDS(full_pu_data_path)
> full_pu_raster_data <-
+   full_pu_raster_data_path %>%
+   raster::raster()
> 
> # generate different sized planning units for benchmark analysis
> ## calculate aggregation factors
> pu_size_factors <-
+   (data_parameters$planning_unit_size) /
+   raster::xres(full_pu_raster_data)
> pu_size_factors <- round(pu_size_factors) # avoid floating point issues
> 
> ## validate aggregation factors
> assertthat::assert_that(all(pu_size_factors >= 1))
[1] TRUE
> 
> ## find indices of planning units in full dataset
> idx <- raster::Which(!is.na(full_pu_raster_data), cells = TRUE)
> 
> ## set non-planning unit indices to -1 so that aggregating/disaggregating
> ## data works
> full_pu_raster_data[raster::Which(is.na(full_pu_raster_data))] <- -1
> 
> ## create a list with the planning unit data at different resolutions
> ## each list contains a list of containing the planning unit data
> ## in raster format and also data.frame format
> pu_output <-
+   pu_size_factors %>%
+   plyr::llply(.progress = "text", function(x) {
+     ## create aggregated dataset
+     r <- gdal_aggregate_raster(full_pu_raster_data, fact = x, "max")
+     ## assign new planning unit ids
+     ids <- raster::Which(r > 0, cells = TRUE)
+     r[ids] <- ids
+     ## disaggregate raster to match resolution original raster
+     r2 <- gdal_disaggregate_raster(r, fact = x, "max")
+     ## crop raster to match original raster
+     r2 <- raster::crop(r2, raster::extent(full_pu_raster_data))
+     r2[raster::Which(r2 < 0)] <- NA_real_
+     raster::compareRaster(r2, full_pu_raster_data)
+     assertthat::assert_that(all(!is.na(r2[idx])))
+     ## create planning unit data for given resolution, including spp data
+     d <-
+       full_pu_data %>%
+       dplyr::mutate(pu = c(r2[idx])) %>%
+       dplyr::group_by(pu) %>%
+       dplyr::summarize_all(sum) %>%
+       dplyr::ungroup()
+     ## validate result
+     assertthat::assert_that(
+       nrow(d) == length(ids),
+       msg = "failed to create dataset with different resolution")
+     ## create raster to store planning units
+     r <- raster::setValues(r, NA_real_)
+     r[ids] <- 1
+     ## return result
+     list(data = d, raster = r)
+   })

  |                                                                            
  |                                                                      |   0%Checking gdal_installation...
Scanning for GDAL installations...
Checking Sys.which...
GDAL version 3.2.1
GDAL command being used: "/usr/bin/gdalwarp" -overwrite  -te 360550.809907823 4991826.04905745 603050.809907823 5540826.04905745 -te_srs "+proj=utm +zone=10 +datum=NAD27 +units=m +no_defs" -tr 500 500 -s_srs "+proj=utm +zone=10 +datum=NAD27 +units=m +no_defs" -t_srs "+proj=utm +zone=10 +datum=NAD27 +units=m +no_defs" -r "max" -srcnodata "-9999" -dstnodata "-9999" -of "GTiff" -co "COMPRESS=LZW" "/tmp/RtmpkfsxIw/file101e1d65a6fd0.tif" "/tmp/RtmpkfsxIw/file101e1d89cd96d.tif"
Checking gdal_installation...
GDAL version 3.2.1
GDAL command being used: "/usr/bin/gdalwarp" -overwrite  -te 360550.809907823 4991826.04905745 603050.809907823 5540826.04905745 -te_srs "+proj=utm +zone=10 +datum=NAD27 +units=m +no_defs" -tr 100 100 -s_srs "+proj=utm +zone=10 +datum=NAD27 +units=m +no_defs" -t_srs "+proj=utm +zone=10 +datum=NAD27 +units=m +no_defs" -r "max" -srcnodata "-9999" -dstnodata "-9999" -of "GTiff" -co "COMPRESS=LZW" "/tmp/RtmpkfsxIw/file101e1d68afd1bc.tif" "/tmp/RtmpkfsxIw/file101e1d828e466.tif"
Error: Elements 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 of !is.na(r2[idx]) are not true
No traceback available

ricschuster commented 3 years ago

When I step through the script, this is where the error message is generated:

assertthat::assert_that(all(!is.na(r2[idx])))
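For readers following along, here is a minimal base R sketch of what that assertion is checking. It is not the project's code: it uses a small hypothetical matrix in place of the raster, and plain matrix operations stand in for gdal_aggregate_raster and gdal_disaggregate_raster. The idea is that after aggregating, assigning new ids, and disaggregating again, every cell that was a planning unit in the original grid should carry a non-NA id.

```r
# Toy 4 x 6 grid (hypothetical): 1 = planning unit, NA = outside the study area
m <- matrix(1, nrow = 4, ncol = 6)
m[1:2, 1:2] <- NA

# cells that must end up with a planning unit id
idx <- which(!is.na(m))

# same trick as the script: mark non-planning units with -1
m[is.na(m)] <- -1

# map every cell to a fact x fact block (block ids run column-wise)
fact <- 2L
row_block <- (seq_len(nrow(m)) - 1L) %/% fact + 1L
col_block <- (seq_len(ncol(m)) - 1L) %/% fact + 1L
block_id <- outer(row_block, col_block,
                  function(i, j) (j - 1L) * max(row_block) + i)

# "aggregate": maximum value within each block
agg <- as.vector(tapply(as.vector(m), as.vector(block_id), max))

# assign new ids to blocks that contain at least one planning unit
ids <- which(agg > 0)
agg[ids] <- ids

# "disaggregate": copy each block value back to its cells
r2 <- matrix(agg[as.vector(block_id)], nrow = nrow(m))
r2[r2 < 0] <- NA_real_

# the check that fails in the log: every original planning unit needs an id
stopifnot(all(!is.na(r2[idx])))

# if the round-tripped grid comes back entirely NA, the same check is FALSE
r2_bad <- matrix(NA_real_, nrow = nrow(m), ncol = ncol(m))
all(!is.na(r2_bad[idx]))
```

In the toy case the check passes because only the block that lies fully outside the study area ends up NA. The error in the log corresponds to the second situation, where the disaggregated raster has NA in cells that were planning units in the original.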

jeffreyhanson commented 3 years ago

Hmm, it seems like you are using a more recent version of GDAL than me. Strangely, it seems like all the planning units are assigned NA values on your system. One of the Carleton servers has GDAL 3.0.4 (not the same as yours, but closer), so I'll see if I can reproduce the issue there.
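Since the comparison hinges on which GDAL build each machine calls, a small helper along these lines could make the versions easy to report. This is only a sketch: gdal_cli_version is a hypothetical name, and it assumes the GDAL command-line tools (specifically gdalinfo) are on the PATH, which is where the log shows gdalwarp being picked up from.

```r
# Return the version string reported by the GDAL command-line tools,
# or NA if gdalinfo cannot be found on the PATH
gdal_cli_version <- function() {
  exe <- unname(Sys.which("gdalinfo"))
  if (!nzchar(exe)) {
    return(NA_character_)
  }
  system2(exe, "--version", stdout = TRUE)[1]
}

# report the GDAL and R versions side by side
print(gdal_cli_version())
print(R.version.string)
```

Pasting both lines into the thread from each system would show at a glance whether the two setups differ.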

jeffreyhanson commented 3 years ago

We could just use raster::aggregate and raster::disaggregate, but that will (probably) be a lot slower (given the size of the dataset).

ricschuster commented 3 years ago

Do you have a sense of why 100 and 200 work, but nothing larger? What GDAL version are you using?

jeffreyhanson commented 3 years ago

Nah, I don't really know why. When I took a look at this earlier, it was because the code for gdal_aggregate was incorrectly estimating the extent for the output raster. I think I've fixed that though, so it shouldn't be an issue anymore. My computer has GDAL 2.2.2.

jeffreyhanson commented 3 years ago

Can we verify that it definitely has loaded the correct gdal_aggregate_raster function definition? Could you open an interactive R session (make open), run this code (session::restore.session(session_path("01"))), and then enter gdal_aggregate_raster to print it? Can you see this text in the function definition:

  n_res <- raster::res(x) * fact
  n_ext <- list(
    xmin = raster::xmin(x),
    ymin = raster::ymin(x))
  n_ext$xmax <-
    n_ext$xmin + (ceiling(ncol(x) / fact) * n_res[[1]])
  n_ext$ymax <-
    n_ext$ymin + (ceiling(nrow(x) / fact) * n_res[[2]])

ricschuster commented 3 years ago

Yes. Just to make sure, here's the function definition I see:

function (x, fact, method = "average") 
{
    assertthat::assert_that(inherits(x, "Raster"), assertthat::is.count(fact), 
        assertthat::is.count(fact))
    f1 <- tempfile(fileext = ".tif")
    f2 <- tempfile(fileext = ".tif")
    n_res <- raster::res(x) * fact
    n_ext <- list(xmin = raster::xmin(x), ymin = raster::ymin(x))
    n_ext$xmax <- n_ext$xmin + (ceiling(ncol(x)/fact) * n_res[[1]])
    n_ext$ymax <- n_ext$ymin + (ceiling(nrow(x)/fact) * n_res[[2]])
    raster::writeRaster(x, f1, NAflag = -9999, overwrite = TRUE)
    out <- raster::readAll(gdalUtils::gdalwarp(srcfile = f1, 
        dstfile = f2, s_srs = x@crs@projargs, t_srs = x@crs@projargs, 
        te = c(n_ext$xmin, n_ext$ymin, n_ext$xmax, n_ext$ymax), 
        te_srs = x@crs@projargs, tr = n_res, r = method, srcnodata = -9999, 
        dstnodata = -9999, of = "GTiff", co = "COMPRESS=LZW", 
        overwrite = TRUE, output_Raster = TRUE, verbose = TRUE))
    unlink(f1, force = TRUE)
    unlink(f2, force = TRUE)
    out
}
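
The extent arithmetic in this definition can be checked without GDAL. The sketch below repeats the same calculation in base R; aggregated_extent is a hypothetical helper and the grid dimensions are made up for illustration rather than taken from the benchmark data. It shows that, when the number of columns or rows is not a multiple of fact, the target extent is rounded up so that it covers the whole input.

```r
# Same calculation as n_res / n_ext above, written as a standalone helper
aggregated_extent <- function(xmin, ymin, res, n_col, n_row, fact) {
  n_res <- res * fact
  list(
    xmin = xmin,
    ymin = ymin,
    xmax = xmin + ceiling(n_col / fact) * n_res[[1]],
    ymax = ymin + ceiling(n_row / fact) * n_res[[2]],
    res = n_res)
}

# Hypothetical grid: 2425 x 5490 cells at 100 m, aggregated by a factor of 20
e <- aggregated_extent(
  xmin = 0, ymin = 0, res = c(100, 100),
  n_col = 2425L, n_row = 5490L, fact = 20L)

# original extent is 242500 x 549000; the aggregated one is slightly larger
print(c(e$xmax, e$ymax))
```

Here 2425 / 20 = 121.25 is rounded up to 122 columns of 2000 m, and 5490 / 20 = 274.5 is rounded up to 275 rows, so the aggregated extent overshoots the original on the right and top edges.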
jeffreyhanson commented 3 years ago

Hmm, ok, well I don't know why it's not working. I'm still waiting for packrat to install all the libs on the Carleton server, so it'll be a while before I can test it there.

ricschuster commented 3 years ago

Sorry about this, Jeff. Maybe I should try with a clean version from GitHub. I can easily see how it could be related to GDAL.

jeffreyhanson commented 3 years ago

No worries - I suppose this is to be expected since we're not using something like Docker to handle every possible dependency.

jeffreyhanson commented 3 years ago

If you open an interactive session, how long does it take to aggregate and disaggregate the raster? E.g. something like:

# restore session
session::restore.session(session_path("01"))

# set raster processing options
raster::rasterOptions(
  maxmemory = general_parameters$raster_maxmemory,
  chunksize = general_parameters$raster_chunksize)

# load data
full_pu_raster_data <-
  full_pu_raster_data_path %>%
  raster::raster()

# set NAs to -1
full_pu_raster_data[raster::Which(is.na(full_pu_raster_data))] <- -1

# try aggregating data
system.time({
a = raster::aggregate(full_pu_raster_data, fact = 20, fun = min)
})

# try disaggregating data
system.time({
b = raster::disaggregate(a, fact = 20, fun = min)
})

ricschuster commented 3 years ago

About 6 seconds to aggregate and 1.5 seconds to disaggregate.

That's with the debug parameters.

ricschuster commented 3 years ago

Would Docker work for the setup, with Gurobi added once the Docker image is loaded? (I don't know much about Docker, to be honest.)

jeffreyhanson commented 3 years ago

Ok - that's not too bad. Could you try this again with release settings? E.g. run make clean, make data/intermediate/01*.rda, and then run that code in an interactive session?

jeffreyhanson commented 3 years ago

Yeah, Docker could be used to standardize system dependencies between our two systems (thus we would have the same version of GDAL and I could reproduce the issue locally). However, it doesn't work with Gurobi, so we can't run the analysis inside a Docker container.

jeffreyhanson commented 3 years ago

I suppose I could try finding a Docker image with the same GDAL version as on your computer, and that might help me reproduce the issue? I'll have a quick look.

ricschuster commented 3 years ago

> Yeah, Docker could be used to standardize system dependencies between our two systems (thus we would have the same version of GDAL and I could reproduce the issue locally). However, it doesn't work with Gurobi, so we can't run the analysis inside a Docker container.

Too bad.

I am switching to release now.

ricschuster commented 3 years ago

> I suppose I could try finding a Docker image with the same GDAL version as on your computer, and that might help me reproduce the issue? I'll have a quick look.

Any potential that Python could be the issue? I've been having issues with Python lately and might have messed up something.

jeffreyhanson commented 3 years ago

Maybe - I think that GDAL has some integration with Python. I don't know enough to say whether that's likely to be the issue or not.

ricschuster commented 3 years ago

Times with the release data: 8.4 seconds to aggregate and 0.1 seconds to disaggregate (which is a bit suspicious).

ricschuster commented 3 years ago

Maybe not too suspicious, given the release raster options.

ricschuster commented 3 years ago

Those times indicate that raster::aggregate might work for us. What do you think?

jeffreyhanson commented 3 years ago

Yeah I agree. I'll replace the GDAL stuff with raster::aggregate and raster::disaggregate.

jeffreyhanson commented 3 years ago

I've made the change locally, but I'll just verify that it actually works before pushing it.