r-lib / pak

A fresh approach to package installation
https://pak.r-lib.org

offline use of pak ? #127

Open cderv opened 5 years ago

cderv commented 5 years ago

Is pak supposed to work only in an online environment?

Our RStudio Server Pro cluster is in an offline environment, and packages can be installed because we are connected to an internal CRAN mirror.

However, pak seems to use another source of information for metadata, and it throws an error:

> pak::pkg_install("glue")
ℹ Checking for package metadata updates
✖ Metadata download failed
Error: Failed to connect to 2606:4700:30::681b:bf5c: Network is unreachable

After some searching in the source code, I believe this is because pkgcache uses bioc = TRUE by default and tries a Bioconductor URL that I cannot reach.

I know this is because Bioconductor repos are supported out of the box by pak, but I did not find a way to deactivate the use of Bioconductor repos.

So at the end,

Otherwise, pak cannot be used in offline environments, which are common in some companies.

For now, I'll try to see whether creating pkgcache metadata without bioc before using pak is possible and does the trick.

Thanks.

cderv commented 4 years ago

As we tested the product, I wonder whether there are any plans to support offline use of {pak} with an RStudio Package Manager repository? We are now using the product in our offline environment and publishing Bioconductor packages in a CRAN-like repo, as RSPM does not provide a Bioconductor-like URL to configure pkgcache with. Currently, it does not seem to work, and I wonder whether it will.

I did not manage to easily isolate where exactly it hangs. I still believe that pkgcache syncs metadata at least from the default CRAN repos (which is fine offline) but also from Bioconductor by default. I did not find a way to deactivate the use of Bioc for pkgcache (maybe it cannot work without it?).

I think pak is interesting for R admins who manage shared libraries on servers. Also, users can't use pak on our RStudio cluster for now.

jimhester commented 4 years ago

Can you set the R_BIOC_MIRROR envvar or the BioC_mirror option to get around this?
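For reference, a minimal sketch of applying both settings from an R session before calling pak. The mirror URL is a placeholder assumption, not a real endpoint, and whether pkgcache honours these settings everywhere is exactly what is in question in this thread.

```r
# Placeholder URL (assumption): replace with a mirror reachable from the offline network
mirror <- "https://bioc-mirror.example.internal"

# Environment variable suggested above
Sys.setenv(R_BIOC_MIRROR = mirror)

# R option suggested above
options(BioC_mirror = mirror)
```

Both calls only affect the current session; putting them in a startup file such as .Rprofile would make them persistent.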

cderv commented 4 years ago

I have seen those, but which values should I set them to? Do I need a Bioc mirror, or does a CRAN-like repo from RSPM with Bioc packages in it seem OK to you? I will try the latter to see if it works, but I understood that Bioconductor repositories have special behavior.

cderv commented 4 years ago

No, it does not work.

One observation from this test: R_BIOC_MIRROR is not used everywhere. When bioc = TRUE (the default) in pkgcache, it tries to guess R_BIOC_VERSION by reading a config file, but the URL of that config file does not seem to take R_BIOC_MIRROR into account: config_url is hardcoded to https://bioconductor.org/config.yaml and then used in get_config_yml.

If I apply a workaround by using set_config_yaml and manually providing the content of https://bioconductor.org/config.yaml, it helps create the different Bioc repo URLs: BioCsoft, BioCann, BioCexp and BioCworkflows. Those endpoints are specific to Bioc repos, so I don't think this will work, as RSPM does not support mirroring Bioc repos:

Does RStudio Package Manager support BioConductor? Today, RStudio Package Manager does not have direct support for BioConductor. However, BioConductor package tar files can be added manually or admins can configure RStudio Package Manager to track specific BioConductor Git endpoints.

source: RSPM admin guide

That is why we use a CRAN-like repo to manually upload the required Bioc packages.

One way for pak to work, I think, would be to offer a way to deactivate bioc in pkgcache. Offline, this works for me:

library(pkgcache)
res <- cranlike_metadata_cache$new(bioc = FALSE)
res$list()

So I guess that if this is the only thing that prevents pak from being used offline, passing bioc = FALSE through an option or env var to pkgdepends, which uses pkgcache, could work. The default is TRUE in pkgcache, and pkgdepends uses that default.

jimhester commented 4 years ago

Having it controlled in pkgdepends with an envvar or via the config makes sense to me
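A minimal sketch of what such an env var switch could look like, in base R only. The helper `use_bioc()` and the variable name `EXAMPLE_USE_BIOC` are made up for illustration and are not part of pkgcache, pkgdepends or pak.

```r
# Hypothetical helper: not part of pkgcache, pkgdepends or pak.
# EXAMPLE_USE_BIOC is a made-up env var name used only for illustration.
use_bioc <- function(default = TRUE) {
  val <- tolower(Sys.getenv("EXAMPLE_USE_BIOC", unset = ""))
  # Unset or empty: keep the default (TRUE, matching pkgcache's current default)
  if (!nzchar(val)) return(default)
  val %in% c("true", "yes", "1", "on")
}

# An admin on an offline cluster would switch Bioconductor off like this
Sys.setenv(EXAMPLE_USE_BIOC = "false")
bioc <- use_bioc()

# With pkgcache installed, the value could then be forwarded to the
# constructor shown earlier in the thread:
# cranlike_metadata_cache$new(bioc = bioc)
```

Keeping TRUE as the fallback means nothing changes for online users, while offline sites only need to set one variable.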