Closed pascalgulikers closed 2 months ago
The goal of the current process is reproducibility. A lockfile_create() followed by lockfile_install() should give you the same packages on every machine, irrespective of what was already installed.
lockfile_install() checks that the packages that are already installed satisfy the requirements specified in the lockfile. E.g. if the lockfile says to install foobar version x.y.z from CRAN, and foobar version x.y.z is already installed from CRAN, then it will not reinstall it.
If we just accepted whatever version of the package is already installed, then different computers would give you different versions of packages for the same DESCRIPTION file.
Understood, but if you use:
lockfile_create(lib = .libPaths()[1])
lockfile_install(lib = .libPaths()[1])
or just lockfile_install(), since the default value of lib is already .libPaths()[1], the source (e.g. CRAN) is not being checked. So those two functions are inconsistent.
If you use lockfile_create(lib = .libPaths()[1]) and package foobar is required, and it is already installed (a version that satisfies the requirements coming from DESCRIPTION), then that'll go into the lockfile, so the expected source is not CRAN, but an installed package.
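For anyone following along, here is a minimal sketch of the "same lib for both calls" variant discussed above. lock_and_install() is a hypothetical helper name, not part of pak; it only passes the lib and lockfile arguments that both functions already accept, and it assumes a DESCRIPTION file in the working directory.

```r
# Hypothetical helper (not part of pak): run both steps against one
# explicit library so that the two calls agree on where to look.
lock_and_install <- function(lib = .libPaths()[1], lockfile = "pkg.lock") {
  if (!requireNamespace("pak", quietly = TRUE)) {
    stop("pak is not installed")
  }
  # With a non-NULL lib, packages already installed there are taken into
  # account when the lockfile is created.
  pak::lockfile_create(lockfile = lockfile, lib = lib)
  # Install from the lockfile into the same library.
  pak::lockfile_install(lockfile = lockfile, lib = lib)
  invisible(lockfile)
}
```

Calling lock_and_install() with no arguments uses .libPaths()[1] for both steps. Whether that is desirable is exactly the reproducibility trade-off described above.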
Thinking about this more, I don't mind exposing this argument on GHA, so I can reopen https://github.com/r-lib/actions/issues/814
But I am quite confident that the current defaults for lockfile_create() make sense, so I am going to close this issue.
https://github.com/r-lib/pak/blob/67e6347f29619ed631a69d5adda6370d22f120dc/R/lockfile.R#L24
Since lockfile_create() detects already installed packages in lib and in .Library (base and recommended packages), IMO it should have the same default value for lib as lockfile_install() for consistency.
The default value for lockfile_create() is NULL and the default value for lockfile_install() is .libPaths()[1], which creates unexpected results, i.e.:
lockfile_create() without a lib parameter value (because of the NULL default) won't detect already installed site packages (in .libPaths()[1]), resulting in a lockfile (pkg.lock) with upstream sources instead of the installed property.
lockfile_install(), which has a default lib parameter value of .libPaths()[1], is also supposed to detect already installed packages, this time in the default .libPaths()[1] (usually this is the same as Sys.getenv("R_LIBS_SITE")). But if those installed packages have different source metadata, they will be checked against pkgcache metadata (in ~/.cache). If found in the cache (e.g. ✔ Cached copy of yaml 2.3.8 (source) is the latest build) they will not be reinstalled. If not found there (this could be due to another user with another home directory (~) but the same site library), they will be reinstalled nevertheless, even if the installed and remote package have the same version and architecture.
PROPOSED SOLUTION: Having the lib parameter value of lockfile_create() the same as for the lockfile_install() function, e.g. .libPaths()[1], will solve this, as they will be marked as 'already installed' in the generated lockfile (same as is the case now for base and recommended packages). Thus an early detection of installed available packages.
What's the purpose of a lib = NULL value for lockfile_create() anyway? Site-library installed packages should always be considered for a dependency plan (imho). And why would one not want to detect them in lockfile_create() but try to detect them in lockfile_install() with a fallback to pkgcache? It's not consistent and produces unexpected results when packages were installed before from a different source or under a different user (in pulled base images, for example).
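To check what source metadata an installed package actually carries in a given library, something like the following base R sketch can help. installed_info() is a hypothetical helper name, and the choice of the Repository and RemoteType fields is an assumption about which DESCRIPTION fields are worth looking at; it is not a statement of which fields pak compares.

```r
# Hypothetical helper: read version and source-related fields from the
# DESCRIPTION of a package installed in one specific library.
installed_info <- function(pkg, lib = .libPaths()[1]) {
  desc <- file.path(lib, pkg, "DESCRIPTION")
  if (!file.exists(desc)) {
    return(NULL)  # not installed in this library
  }
  # Fields missing from the file come back as NA.
  fields <- read.dcf(
    desc,
    fields = c("Package", "Version", "Repository", "RemoteType")
  )
  as.list(fields[1, ])
}

# Base packages live in .Library, so this lookup should succeed.
str(installed_info("stats", lib = .Library))
```

Running it for the same package in a pulled base image and on a fresh install makes it easier to see whether the two copies differ in recorded source rather than in version.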