PredictiveEcology / Require

Simple R package to install and load packages conducive to a reproducible workflow
https://require.predictiveecology.org/
21 stars 6 forks source link

pkgSnapshot writes empty file #93

Closed jfiksel closed 3 months ago

jfiksel commented 1 year ago

In the vignette, it states that the pkgSnapshot() function is supposed to act exactly like the renv::snapshot() function. However, when I call pkgSnapshot(), it writes out a file with the headers "Package","LibPath","Version", but no content underneath. When I call pkgSnapshot2(), it returns a vector with every single package in my library, regardless of whether I'm using it specifically in my current project. This doesn't seem like the behavior I would expect, or if there's something wrong with the functions. Thanks!

eliotmcintire commented 1 year ago

Hi @jfiksel . Not sure I can do anything about this. pkgSnapshot should do what you expect. You maybe need to specify your .libPaths() or libPath argument with standAlone = TRUE?

These issues are hard to reproduce, I am aware, because they are very dependent on your system configuration.

achubaty commented 1 year ago

I can confirm this does not work as expected with latest development version of Require (0.3.1.9007):

library(Require)

td = tempdir2("myProjLib", tempdir())

.libPaths(td, include.site = FALSE)

install.packages("data.table")

nrow(installed.packages()) ## reference point

# reprex 1

tf = tempfile(fileext = ".txt")
pkgSnapshot(tf)
pkgs = read.csv(tf)

tf2 = tempfile(fileext = ".txt")
pkgs2 = pkgSnapshot2(tf2)

identical(nrow(pkgs), length(pkgs2)) ## FALSE (!)

# reprex 2

tf3 = tempfile(fileext = ".txt")
pkgSnapshot(tf3, libPaths = .libPaths())
pkgs3 = read.csv(tf3)
identical(nrow(pkgs3), 1L) ## FALSE (!)

The above examples also fail with standAlone = TRUE.

achubaty commented 1 year ago

the issue appears to be with checkLibPaths(), which is appending the R version to.libPaths() instead of using those library paths as is:

> Require::pkgSnapshot("packages_latest.txt")
debugging in: Require::pkgSnapshot("packages_latest.txt")
debug at /tmp/RtmpY3XoUF/R.INSTALL3ebcb57c29104c/PredictiveEcology-Require-55ec169/R/pkgSnapshot.R#84: {
    libPaths <- checkLibPaths(libPaths = libPaths)
    libPaths <- doLibPaths(libPaths, standAlone)
    ip <- doInstalledPackages(libPaths, purge, includeBase)
    fwrite(ip, file = packageVersionFile, row.names = FALSE, 
        na = NA)
    messageVerbose("package version file saved in ", packageVersionFile, 
        verbose = verbose, verboseLevel = 1)
    return(invisible(ip))
}
Browse[2]> 
debug at /tmp/RtmpY3XoUF/R.INSTALL3ebcb57c29104c/PredictiveEcology-Require-55ec169/R/pkgSnapshot.R#91: libPaths <- checkLibPaths(libPaths = libPaths)
Browse[2]> libPaths
[1] "/home/achubaty/myProject/renv/library/R-4.3/x86_64-pc-linux-gnu"
[2] "/home/achubaty/.cache/R/renv/sandbox/R-4.3/x86_64-pc-linux-gnu/9a444a72"               
Browse[2]>     libPaths <- checkLibPaths(libPaths = libPaths)
Browse[2]> libPaths
[1] "/home/achubaty/myProject/renv/library/R-4.3/x86_64-pc-linux-gnu/4.3"
[2] "/home/achubaty/.cache/R/renv/sandbox/R-4.3/x86_64-pc-linux-gnu/9a444a72/4.3" 
eliotmcintire commented 1 year ago

It is unclear why appending the R version is the wrong thing to do. We need the R version to be part of the .libPaths() or it will conflict with other .libPaths() on the same machine that use different R. And it will also potentially install the wrong build of a package for the given R version. How would removing this information from the .libPaths() be the correct option?

achubaty commented 1 year ago

At a project level and for Require-managed lib paths, yes, Require should set up the library path to include the version and architecture. But in this case, where any arbitrary path is being passed by the user, we shouldn't modify what the user provides.

eliotmcintire commented 1 year ago

We actually used to do this and determined it was the wrong behaviour. It must have the R version number or numerous things in R will fail... they are expecting version number. There are several complications that this will create. I believe this requires a different solution. I can't look at a different solution for several weeks.

achubaty commented 1 year ago

For Require-managed project setup, it makes sense to include the R version in the path. But not for a user-called function - the user knows the path they specified, so we can use it.

The PR only affects pkgSnapshot().

Where that function is not called directly by the user, the versioned path can still be passed and subsequently used downstream.

eliotmcintire commented 1 year ago

The reprex 1 above does not fail for me. It works as expected. I am using Ubuntu 22.04,

> getRversion()
[1] ‘4.3.0’
> packageVersion("Require")
[1] ‘0.3.1.9007’

> NROW(pkgs)
[1] 177
> NROW(pkgs2)
[1] 177

reprex 2 doesn't work because you can't install data.table while Require is loaded. If I pick a different package to install, it works as expected.

# reprex 2a
library(Require)
td = tempdir2("myProjLib", tempdir())

.libPaths(td, include.site = FALSE)
install.packages("testit")

tf3 = tempfile(fileext = ".txt")
library(Require)
pkgSnapshot(tf3, libPaths = .libPaths())
pkgs3 = read.csv(tf3)
identical(nrow(pkgs3), 1) ## FALSE (!)

I see that the pkgSnapshot is appending the R version number to the libPaths arg. I do not recall which base R functions automatically do this same thing, but Require package is emulating that/those. So, we need to re-find the base-R functions that make this appending and we need to understand it better to understand when/where we can deviate.

We previously had the function not appending the R version number, but it will break base R functionality elsewhere if we do not. This is not about "Require-managed". Require emulates base R.

achubaty commented 1 year ago

Base R appends version etc. for the default user library path and sets the default .libPaths() accordingly on startup, but does not modify the path passed to e.g., install.packages() nor will it modify a user-provided path when changing .libPaths().

The reason reprex 1 worked for you is because you were using the default library path (which is versioned). Sorry I didn't include an explicit .libPaths() call there -- I've edited above.

Installing data.table to a different lib path does work when Require is loaded from somewhere else, which it is in the example above. Regardless of which package you try installing, note that neither .libPaths() nor install.packages() modify the user-provided path.

eliotmcintire commented 1 year ago

Installing data.table to a different lib path does work when Require is loaded from somewhere else, which it is in the example above.

This is not correct on my system, where Rstudio will not allow the install.

achubaty commented 1 year ago
> library(Require)
Require version: 0.3.1.9007
  Using cache directory: /home/achubaty/.cache/R/Require/packages/4.3; clear with clearRequirePackageCache().
  See ?RequireOptions for this and other settings.
> td = file.path(tempdir(), "myLib")
> dir.create(td, recursive = TRUE)
> install.packages("data.table", lib = td)
trying URL 'https://packagemanager.posit.co/all/__linux__/focal/latest/src/contrib/data.table_1.14.8.tar.gz'
Content type 'binary/octet-stream' length 5275174 bytes (5.0 MB)
==================================================
downloaded 5.0 MB

* installing *source* package ‘data.table’ ...
** package ‘data.table’ successfully unpacked and MD5 sums checked
...<snip>...
** testing if installed package keeps a record of temporary installation path
* DONE (data.table)

The downloaded source packages are in
    ‘/tmp/Rtmp6ozGEr/downloaded_packages’
> 

Even using Rstudio this is permitted (it prompts you to restart, but you can select 'No' and installation proceeds).

Regardless, which package is installed in the example above is not important to the issue at hand.

achubaty commented 1 year ago

instead of removing the checkLibPaths() call altogether as in PR #95, perhaps checkLibPaths(libPaths = libPaths, exact = TRUE) could be used.

eliotmcintire commented 1 year ago

I like that better.

eliotmcintire commented 1 year ago

Have you decided not to do this @achubaty?

achubaty commented 1 year ago

I've been waiting for you to approve/merge the PR

eliotmcintire commented 3 months ago

Was completed when #95 was accepted