Closed MatthieuStigler closed 1 year ago
It's a good and fair question. For R, the order of .libPaths() wins. And, for example, the Debian and Ubuntu default, even without a path below $HOME, is to have /usr/local before the path that r2u installs to:
> .libPaths()
[1] "/usr/local/lib/R/site-library"
[2] "/usr/lib/R/site-library" # r2u installation
[3] "/usr/lib/R/library"
>
I have this situation on my laptop where a number of packages in /usr/local/
shadow the (potentially newer) ones from r2u
, and I have been meaning to at least write an extended version of available.packages()
to flag and warn about shadowed packages.
Ultimately, it is a local sysadmin question. You (for your machine) and I (for my laptop) decided to use a large number of (now system!!) packages with r2u
so maybe on that machine we need to override the order in .libPaths()
in Rprofile.site
and/or our user ~/.Rprofile
. It is a pretty new problem to have thanks to r2u
.
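As a minimal sketch of the override mentioned above, the snippet below moves one library directory to the front of a path vector. The helper name `prefer_lib` is hypothetical and not part of R, r2u, or bspm; the paths are the ones from the `.libPaths()` output shown earlier. Note that `.libPaths()` drops directories that do not exist, so the commented-out line only has an effect on a machine that has them.

```r
# Hypothetical helper: put `preferred` first, keep the remaining order as is.
prefer_lib <- function(paths, preferred) {
  unique(c(intersect(preferred, paths), paths))
}

# Paths copied from the .libPaths() output earlier in the thread.
current <- c("/usr/local/lib/R/site-library",
             "/usr/lib/R/site-library",
             "/usr/lib/R/library")

reordered <- prefer_lib(current, "/usr/lib/R/site-library")
print(reordered)

# In Rprofile.site or ~/.Rprofile one could then apply it to the live path:
# .libPaths(prefer_lib(.libPaths(), "/usr/lib/R/site-library"))
```

Whether the r2u directory should come first is a local policy decision; the sketch only shows the mechanics.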
ok, this is actually more complicated than I thought! I just installed on our server, and then users started having error messages like:
library(tidyverse)
Error: package or namespace load failed for 'tidyverse' in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 namespace 'rlang' 1.0.2 is already loaded, but >= 1.0.6 is required
which comes because of the shadowing issue:
subset(installed.packages()|> as.data.frame(), Package=="rlang")[,1:5]
        Package                       LibPath Version Priority      Depends
rlang     rlang /usr/local/lib/R/site-library   1.0.2     <NA> R (>= 3.4.0)
rlang.1   rlang       /usr/lib/R/site-library   1.0.6     <NA> R (>= 3.4.0)
I tried doing echo "bspm::disable()" | sudo tee -a /etc/R/Rprofile.site
but users still had conflicts.
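The shadowing behind this error can be made explicit with a small check: R loads the copy found first in the library search order, even when a newer copy sits further down. The sketch below is illustrative only. The helper name `loaded_copy` is hypothetical, and the data frame is typed in from the output above rather than read from a live system.

```r
# Data copied from the output above; in a live session it would come from
# as.data.frame(installed.packages())[, c("Package", "LibPath", "Version")].
ip <- data.frame(
  Package = c("rlang", "rlang"),
  LibPath = c("/usr/local/lib/R/site-library", "/usr/lib/R/site-library"),
  Version = c("1.0.2", "1.0.6"),
  stringsAsFactors = FALSE
)
lib_order <- c("/usr/local/lib/R/site-library",
               "/usr/lib/R/site-library",
               "/usr/lib/R/library")

# Hypothetical helper: for one package, report the copy found first in the
# search order and whether a newer copy exists elsewhere.
loaded_copy <- function(ip, pkg, lib_order) {
  rows <- ip[ip$Package == pkg, , drop = FALSE]
  rows <- rows[order(match(rows$LibPath, lib_order)), , drop = FALSE]
  versions <- package_version(rows$Version)
  data.frame(
    Package       = pkg,
    LoadedFrom    = rows$LibPath[1],
    LoadedVersion = as.character(versions[1]),
    NewestVersion = as.character(max(versions)),
    Outdated      = versions[1] < max(versions),
    stringsAsFactors = FALSE
  )
}

res <- loaded_copy(ip, "rlang", lib_order)
print(res)
```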
So at that point, I can think of two approaches:
.libPaths()
order, and remove the two first. This means though users cannot have their own GitHub-installed packages?
Do you have a sense of which one makes more sense?
Also, two side questions if I may:
/usr/local/lib/R/site-library
? remove.packages()
? Strangely enough, one user was not able to remove a pkg from there, doing:
> remove.packages("rlang", lib = "/usr/local/lib/R/site-library")
> subset(installed.packages()|> as.data.frame(), Package=="rlang")[,1:5]
Package LibPath Version Priority Depends
rlang rlang /usr/local/lib/R/site-library 1.0.2 <NA> R (>= 3.4.0)
rlang.1 rlang /usr/lib/R/site-library 1.0.6 <NA> R (>= 3.4.0)
That is a somewhat different local issue, or rather the same issue: we now have an easier way to end up with current packages later in the path.
But hey if you have many users with a potpourri of R versions and installations and you cannot or do not want to deal with .libPaths()
re-sorting, then maybe r2u is not for you. Or not until "we all" figure out how to sort, or to maybe add 'pinning' to library, or ... It's a new topic. It's good to brainstorm.
As for removal, consider relying just on apt for system packages. I need to check with @enchufa2 what the best policy was, he had a good point on that too but I don't want to quote him without checking.
By a (decades long !!) convention on Linux systems with package managers, /usr/local/ is outside of apt or dpkg (and, I believe, other distros do the same). So you only get to /usr/local/lib/R/site-library
by calling R directly. Which is what I have done for 25+ years with R on Linux. I personally also do NOT use a library below ~ so for me that is the default. But it can now be behind where r2u
installs so we will work on some tooling. Maybe diagnostics first.
But recall that everything actually works as documented and expected. It is "merely" creating a new situation for us.
This means though users cannot have their own github installed pacakges?
False. Users still do whatever they want however they want. But by not paying attention they can also keep a local (possibly outdated!) package ahead of a system-installed newer one.
But that is no different than ~/R/*/*
shadowing /usr/local
. We always had this problem as soon as we had several directories in .libPaths()
. In practice it was less of an issue because we had fewer reasons to install (many!!) packages later in the path. r2u
changes that. That is still a good thing but we will need to work out best practices.
Could you please share your advice on using simultaneously r2u and standard
install.packages()
commands without bspm
?
TL;DR, do not do that. :) Could you please share what is your goal here? Depending on this, I bet there are better ways to achieve it.
bspm
was designed to leave it always enabled. You want a GitHub package not available on CRAN? Call remotes
, and you'll get it in your user folder, no issue. You may have your user path (~/R/x86_64-pc-linux-gnu-library/4.2
in this case) full of non-CRAN packages, and everything else from r2u via bspm
, and everything just works. I've been using this (well, the Fedora version of this) in my own computer and several servers I manage for... 3 years now with no issues.
Issues arise when you start mixing old and new versions. But this is going to happen with or without r2u/bspm, because this is how R/CRAN works. And r2u/bspm is just a new better source of package installations, not a new way to manage packages. If you really need specific versions of packages for certain things, then maybe install.packages
is not for you, and you should resort to locking versions with something like renv
(which, BTW, changes .libPaths()
to operate properly).
Now, what if you don't really need old versions of packages? What if you are just trying to introduce r2u in a system that already has users with their user dirs full of packages? I'm of course guessing here. But if this is the case, then you need to 1) collect all the packages that your users have, 2) install them from r2u, 3) remove them from their user dirs, and 4) enjoy!
Does this mean I should look for all packages in either "~/R/x86_64-pc-linux-gnu-library/4.2" and "/usr/local/lib/R/site-library", check those installed by install.packages (i.e. exclude the github ones), remove them, then install again?
Exactly, this is what I was saying. You can just collect all the packages, call bspm::install_sys(pkgs)
, and whatever this function returns, they're non-CRAN packages (so do not remove them, remove the others).
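A minimal sketch of that workflow follows. The helper `plan_migration`, the stand-in `fake_installer`, and the package name `myGithubPkg` are hypothetical and exist only for illustration. The sketch relies on the behaviour described above, namely that `bspm::install_sys()` returns the packages it could not install.

```r
# Hypothetical helper: split packages into those to remove from the user
# library (now provided system-wide) and those to keep (not available).
# `installer` must return the names it could NOT install, which is how the
# thread describes the return value of bspm::install_sys().
plan_migration <- function(pkgs, installer) {
  not_available <- installer(pkgs)
  list(remove = setdiff(pkgs, not_available),
       keep   = intersect(pkgs, not_available))
}

# Stand-in installer for demonstration only: pretends that just these two
# packages exist as system binaries.
fake_installer <- function(pkgs) setdiff(pkgs, c("rlang", "tsDyn"))

plan <- plan_migration(c("rlang", "tsDyn", "myGithubPkg"), fake_installer)
print(plan)

# On a real r2u/bspm system (this installs packages, so run it deliberately):
# user_lib <- "~/R/x86_64-pc-linux-gnu-library/4.2"
# pkgs <- rownames(installed.packages(lib.loc = user_lib))
# plan <- plan_migration(pkgs, bspm::install_sys)
# remove.packages(plan$remove, lib = user_lib)
```

Passing the installer as an argument keeps the planning step testable without touching the system.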
Is
bspm
also affecting remove.packages()
?
No, see https://github.com/Enchufa2/bspm/issues/43#issuecomment-1177939260
thanks @Enchufa2 !
Ok I see what you mean, if I can scan all previous packages installed by all users on the server, then I can install them myself into /usr/lib/R/site-library
, then remove the packages in the user directory. I'll have to think about this, not sure yet how to remove on users' folders, might need to ask them all to do it or do it myself as sudo. 🤔
@MatthieuStigler I would start by writing a script that does this for the current user. Then you could ask your users to execute it, or you could sudo su - <user>
for every user and run it yourself.
I would be interested in providing a function and a deployment script in bspm to facilitate this task, so I've opened the issue above in the bspm repo. It would help a lot if you could share your experience/code/issues with this task over there.
Here is a (very, five-minute) first pass at a sketch to find 'shadowed' packages. It is more general than r2u or bspm -- it really applies everywhere where length(.libPaths()) > 1
is true.
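shadowedPackages <- function() {
    # data.table is needed for the grouped operations below
    if (!requireNamespace("data.table", quietly=TRUE)) {
        message("Please install data.table")
        return(invisible())
    }
    require(data.table)
    ip <- installed.packages()
    d <- data.table(ip[,1:3])                  # Package, LibPath, Version
    d[, Version:=as.package_version(Version)]
    d[,n:=.N,keyby=Package]                    # number of installed copies per package
    # among packages installed more than once, flag the copies at the newest version
    d[n>1, good:=Version==max(Version), by=Package][n>1,]
}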
On my laptop, which lives off r2u, I find a few packages shadowing the binaries, mostly ones I have worked on myself.
> shadowedPackages()
Key: <Package>
Package LibPath Version n good
<char> <char> <package_version> <int> <lgcl>
1: Rcpp /usr/local/lib/R/site-library 1.0.9.1 2 FALSE
2: Rcpp /usr/lib/R/site-library 1.0.10 2 TRUE
3: RcppAPT /usr/local/lib/R/site-library 0.0.9 2 TRUE
4: RcppAPT /usr/lib/R/site-library 0.0.9 2 TRUE
5: bspm /usr/local/lib/R/site-library 0.4.0.1 2 FALSE
6: bspm /usr/lib/R/site-library 0.4.2 2 TRUE
7: dang /usr/local/lib/R/site-library 0.0.15 2 TRUE
8: dang /usr/lib/R/site-library 0.0.15 2 TRUE
9: data.table /usr/local/lib/R/site-library 1.14.7 2 TRUE
10: data.table /usr/lib/R/site-library 1.14.6 2 FALSE
11: littler /usr/local/lib/R/site-library 0.3.15.2 2 FALSE
12: littler /usr/lib/R/site-library 0.3.17 2 TRUE
13: tiledb /usr/local/lib/R/site-library 0.16.0.2 2 FALSE
14: tiledb /usr/lib/R/site-library 0.18.0 2 TRUE
>
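For readers without data.table, the same idea can be sketched in base R. This is not the implementation in dang; the function name `shadowed_base` is hypothetical, and the demo matrix reuses a few rows from the output above (the single tiledb row is there only to show that packages installed once drop out). Versions are compared as `package_version` objects, which matters here: as plain strings, "1.0.9.1" would sort after "1.0.10".

```r
# Hypothetical base R variant: list packages installed in more than one
# library and flag the copies that carry the newest version.
shadowed_base <- function(ip = installed.packages()) {
  d <- as.data.frame(ip[, c("Package", "LibPath", "Version"), drop = FALSE],
                     stringsAsFactors = FALSE)
  rownames(d) <- NULL
  d <- d[d$Package %in% d$Package[duplicated(d$Package)], , drop = FALSE]
  d <- d[order(d$Package), , drop = FALSE]
  d$good <- rep(NA, nrow(d))
  for (idx in split(seq_len(nrow(d)), d$Package)) {
    v <- package_version(d$Version[idx])
    d$good[idx] <- v == max(v)
  }
  rownames(d) <- NULL
  d
}

# Demo input shaped like installed.packages(), values from the output above.
ip <- cbind(
  Package = c("Rcpp", "Rcpp", "data.table", "data.table", "tiledb"),
  LibPath = c("/usr/local/lib/R/site-library", "/usr/lib/R/site-library",
              "/usr/local/lib/R/site-library", "/usr/lib/R/site-library",
              "/usr/lib/R/site-library"),
  Version = c("1.0.9.1", "1.0.10", "1.14.7", "1.14.6", "0.18.0")
)
res <- shadowed_base(ip)
print(res)
```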
It's prettier as a screenshot as I am such a fan of both colorout and the theme I use :)
The shadowPackages()
function is now in the GitHub repo of CRAN package dang. As the underlying issue was always more of generic R problem of how to align multiple directories with a .libPaths()
and therefore not all that specific to this repo, I am going to close it.
Big thank you for raising the issue though -- as it is now addressed in both bspm
(for real) and dang
(very lightly as shown above) with helper code.
Thanks a lot Dirk, this is very much appreciated!
Two quick points:
shadowedPackages()
right? You wrote above shadowPackages ;-)
Also, for some reason, doing it the dplyr
way, I get one more shadowed package, not sure why?
library(dplyr, warn.conflicts = FALSE)
shd <- dang::shadowedPackages() %>% as_tibble()
head(shd)
#> # A tibble: 6 × 4
#> Package LibPath Version Latest
#> <chr> <chr> <pckg_vrs> <lgl>
#> 1 bspm /home/mstigler/R/x86_64-pc-linux-gnu-library/4.2 0.4.2.1 TRUE
#> 2 bspm /usr/local/lib/R/site-library 0.4.2 FALSE
#> 3 dang /home/mstigler/R/x86_64-pc-linux-gnu-library/4.2 0.0.15 TRUE
#> 4 dang /usr/lib/R/site-library 0.0.15 TRUE
#> 5 effects /home/mstigler/R/x86_64-pc-linux-gnu-library/4.2 4.2.3 TRUE
#> 6 effects /usr/lib/R/site-library 4.2.2 FALSE
ins <- installed.packages()|> as.data.frame() %>% as_tibble()
ins %>%
add_count(Package) %>%
filter(n>1) %>%
arrange(Package) %>%
select(Package, LibPath, Version)
#> # A tibble: 8 × 3
#> Package LibPath Version
#> <chr> <chr> <chr>
#> 1 bspm /home/mstigler/R/x86_64-pc-linux-gnu-library/4.2 0.4.2.1
#> 2 bspm /usr/local/lib/R/site-library 0.4.2
#> 3 dang /home/mstigler/R/x86_64-pc-linux-gnu-library/4.2 0.0.15
#> 4 dang /usr/lib/R/site-library 0.0.15
#> 5 effects /home/mstigler/R/x86_64-pc-linux-gnu-library/4.2 4.2-3
#> 6 effects /usr/lib/R/site-library 4.2-2
#> 7 tsDyn /usr/local/lib/R/site-library 11.0.2
#> 8 tsDyn /usr/lib/R/site-library 11.0.4
Created on 2023-02-13 with reprex v2.0.2
Also, is there any chance that you increment the GitHub package version? I was using the function together with bspm::moveto_sys, and the following lines will fail as bspm::moveto_sys is going to remove it:
dang::shadowedPackages()
bspm::moveto_sys() # will remove dang as has same version as CRAN
dang::shadowedPackages()
Thanks!
Yes I generally roll the minor version (and should) and yes I meant shadowedPackages()
.
I would need to see your installed.packages()
three columns to see about the missing package. Also, see inside the short function and maybe for kicks flip what is commented out with what is still there so try the data.table
variant. The base R one was a Sunday afternoon 'Code Golf' exercise with @vincentarelbundock. Lastly, your dplyr
variant needs a mutate to add which package is the 'max' version package.
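One way the suggested mutate could look is sketched below, assuming dplyr is installed. The rows are typed in from the reprex output above, and the column name `Latest` follows that output. This is an illustration, not the code used in dang.

```r
library(dplyr, warn.conflicts = FALSE)

# Illustrative rows taken from the reprex output earlier in the thread.
ins <- tibble(
  Package = c("bspm", "bspm", "effects", "effects", "tsDyn", "tsDyn"),
  LibPath = c("/home/mstigler/R/x86_64-pc-linux-gnu-library/4.2",
              "/usr/local/lib/R/site-library",
              "/home/mstigler/R/x86_64-pc-linux-gnu-library/4.2",
              "/usr/lib/R/site-library",
              "/usr/local/lib/R/site-library",
              "/usr/lib/R/site-library"),
  Version = c("0.4.2.1", "0.4.2", "4.2-3", "4.2-2", "11.0.2", "11.0.4")
)

shd <- ins %>%
  add_count(Package) %>%
  filter(n > 1) %>%
  group_by(Package) %>%
  # compare as package_version objects, not as strings
  mutate(Latest = package_version(Version) == max(package_version(Version))) %>%
  ungroup() %>%
  arrange(Package) %>%
  select(Package, LibPath, Version, Latest)
print(shd)
```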
@MatthieuStigler :
I would need to see your installed.packages() three columns to see about the missing package.
I never heard back from you. Anyway, shadowedPackages() is back to data.table, and I rolled the minor version as usual. Feedback still welcome.
@eddelbuettel sorry about that! I actually re-ran the function, and the package then appeared! So wasn't an issue after all, the function seems to be working well!
To summarize the post and for future discoverability, would it be fair to say that there are at least two potential solutions to shadowed packages:
/usr/lib/R/site-library
, using bspm::moveto_sys()
Thanks!
Sigh. You could have told me...
I am not so sure there is a generic or general solution to your "problem" we can or should "prescribe". .libPaths()
has several entries, both install.packages()
and library()
allow you to set directories (that is likely how renv
and packrat
and groundhog
and whatnot work) -- so all of this is an R feature. You as local sys admin should devise a policy.
shadowedPackages()
allows you to identify packages that are shadowed. bspm
added several tools to help with and automate migration for users or system wide. How to deploy them will likely depend on your circumstances.
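To illustrate the point about setting directories explicitly, here is a small sketch using only base R. It looks up a package in one specific library via the `lib.loc` argument; the stats package and `.Library` are used so that it runs on any R installation. The commented lines show the same idea with paths taken from this thread.

```r
# .Library is the directory holding R's base and recommended packages.
base_lib <- .Library

# Locate and version-check a package in one specific library only.
where <- find.package("stats", lib.loc = base_lib)
ver <- packageVersion("stats", lib.loc = base_lib)
print(where)
print(ver)

# Same idea for the scenario discussed here (paths from the thread):
# library(rlang, lib.loc = "/usr/lib/R/site-library")
# install.packages("rlang", lib = "/usr/lib/R/site-library")
```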
good, thanks for the summary!
And sorry again for not letting you know about that. To redeem myself, I did some checks, and actually it seems there is an issue with data table with the latest version?
devtools::install_github("eddelbuettel/dang")
#> Skipping install of 'dang' from a github remote, the SHA1 (d391ca48) has not changed since last install.
#> Use `force = TRUE` to force installation
packageVersion("dang")
#> [1] '0.0.15.1'
library(dang)
shd <- dang::shadowedPackages()
#> Loading required package: data.table
#>
#> Attaching package: 'data.table'
#> The following objects are masked from 'package:dang':
#>
#> as.data.table, wday
#> Error in `:=`(Version, as.package_version(Version)): Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").
head(shd)
#> Error in head(shd): object 'shd' not found
Created on 2023-02-14 with reprex v2.0.2
Please try now. As I am now forcing data.table in, it needed a .datatable.aware <- TRUE to ensure [ dispatches right.
it's working now!
Dirk thanks for this great package, works very well!
Could you please share your advice on using simultaneously r2u and standard
install.packages()
commands without bspm? Currently I have both, and as "~/R/x86_64-pc-linux-gnu-library/4.2" is first in .libPaths(), R might use older versions in "~/R/x86_64-pc-linux-gnu-library/4.2" rather than newer ones installed by r2u.
So my questions are:
install.packages() without bspm and r2u, do you recommend changing .libPaths()?
bspm, I guess one will need to remove all packages in "~/R/x86_64-pc-linux-gnu-library/4.2" and even in "/usr/local/lib/R/site-library", as older versions there would take precedence over r2u packages? Is my understanding correct that bspm will have install.packages() in /usr/lib/R/site-library, but remove.packages("xxx") under bspm is left unaffected and will remove from the first repo? Does this mean I should look for all packages in either "~/R/x86_64-pc-linux-gnu-library/4.2" and "/usr/local/lib/R/site-library", check those installed by install.packages (i.e. exclude the github ones), remove them, then install again?
Thanks!!