AntoineSoetewey / statsandr

A blog on statistics and R aiming at helping academics and professionals working with data to grasp important concepts in statistics and to apply them in R. See www.statsandr.com
http://statsandr.com/

blog/an-efficient-way-to-install-and-load-r-packages/ #20

Closed utterances-bot closed 3 years ago

utterances-bot commented 3 years ago

An efficient way to install and load R packages - Stats and R

What are R packages and how to use them? Discover also a more efficient way to install and load R packages in R thanks to the pacman and librarian packages

https://statsandr.com/blog/an-efficient-way-to-install-and-load-r-packages/

AntoineSoetewey commented 3 years ago

Comment written by Gillberke on February 01, 2020 10:04:23:

It is not possible to use %>% invisible() before loading packages I guess?

Comment written by Antoine Soetewey on February 01, 2020 15:50:28:

Thanks for your comment Gillberke.

Yes, it is possible to put the invisible() function before, like this:

invisible(lapply(packages, library, character.only = TRUE))

And I actually believe it is even better to put it that way, because the pipe %>% comes from the {magrittr} package (re-exported by {dplyr}), which is not loaded yet at that point.

I edited the article by including your comment. Thanks again!
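For readers who want the whole pattern in one place, here is a minimal sketch of the install-if-missing, then load-quietly approach discussed above. The package list is hypothetical; base packages are used so that nothing is downloaded when the snippet is run as-is.

```r
# Hypothetical package list (base packages, so no download is triggered)
packages <- c("stats", "utils", "tools")

# Install only the packages that are not installed yet
missing_packages <- packages[!packages %in% rownames(installed.packages())]
if (length(missing_packages) > 0) {
  install.packages(missing_packages)
}

# Load all packages; invisible() suppresses the list that lapply() would print
invisible(lapply(packages, library, character.only = TRUE))
```

Because invisible() wraps the whole call, no pipe (and therefore no extra package) is needed before the packages are loaded.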

AntoineSoetewey commented 3 years ago

Comment written by jameshunterbr on February 02, 2020 21:42:18:

Like pacman, the librarian package will try to install packages that are not yet installed through its shelf() function (the equivalent of library()). It also searches Bioconductor and GitHub for the package to load.

Comment written by Antoine Soetewey on February 03, 2020 08:07:13:

Thanks for your comment James, I have edited the article according to it!

AntoineSoetewey commented 3 years ago

Comment written by Philip Gundy on February 04, 2020 19:39:38:

I have actually been using something very similar. However, instead of invisible() I use the code below, and I also reverse the order of the package list so that packages mask each other's functions in my preferred order. I find the way this prints to the console quite helpful.

as.data.frame(lapply(pkgs, require, character.only=TRUE))
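A possible variant of this idea, as a sketch only: vapply() returns a named logical vector, so the console shows which package loaded successfully. The pkgs vector is hypothetical, and rev() reflects the point about ordering, since packages attached later mask those attached earlier.

```r
# Hypothetical package list, written in order of masking priority
pkgs <- c("stats", "utils", "tools")

# Attach in reverse order so that the first package listed is attached last;
# require() returns TRUE or FALSE instead of stopping with an error
status <- vapply(rev(pkgs), require, logical(1), character.only = TRUE)

# Named logical vector: one TRUE/FALSE per package
print(status)
```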

Comment written by Antoine Soetewey on February 05, 2020 09:08:51:

Thanks for your comment Philip! Many users will find your code useful too, depending on their preferences :)

AntoineSoetewey commented 3 years ago

Comment written by Alan Haynes on February 05, 2020 14:06:37:

You could also check out renv (it's essentially a replacement for pacman, as far as I understand it). It'll sift through all of your code looking for the various ways of loading/using a package and make a lockfile containing details of package versions and, where necessary, a local package library.

It makes it easy to revert changes if you update a package and break something... (it doesn't, however, play nicely with your "more efficient" way of loading packages; it only recognizes methods such as library(x), x::, x:::, etc.). You then don't need to include install.packages() in the script: just do it manually once, and renv will pull the package in if it's actually needed for the analysis.

Comment written by Antoine Soetewey on February 06, 2020 15:22:05:

Thanks Alan for your comment, I'll definitely check it out!

AntoineSoetewey commented 3 years ago

Comment written by Matt.0 on February 11, 2020 21:35:26:

{renv} is a good suggestion for managing environments (a replacement for {packrat}) put out by the RStudio team (https://github.com/rstudio/renv), but it was already mentioned by Alan Haynes.

If you're just looking for a more efficient way to install packages, then the R-lib team has a package called {pak} which (I believe) is a replacement for {pacman}. It installs R packages from multiple sources, CRAN/Bioconductor/GitHub (https://github.com/r-lib/pak), which is convenient, and it claims to be faster and safer.

Comment written by Antoine Soetewey on February 11, 2020 22:48:20:

Thanks Matt for your input. As you are the second person to recommend the {renv} package I'll make sure to check it out as soon as I find some time!

AntoineSoetewey commented 3 years ago

Comment written by Aaron D. Cherniak on October 11, 2020 14:31:41:

How do I know that librarian was set? Nothing appears in the console when I run the line.

Comment written by Antoine Soetewey on October 11, 2020 16:22:37:

Dear Aaron,

The best way to check that everything is set is by simply running a function from the package you wanted to install and load. So for example, if you wanted to install and load the {ggplot2} package, you can run:

install.packages("librarian")
librarian::shelf(ggplot2)

And then try to create a plot with the ggplot() function. If everything works fine, you are all set. Otherwise, it means something is missing/broken. In that case, try to reinstall the {librarian} package and/or try with another package in the shelf() function.

Hope this helps.

Regards,
Antoine
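Another quick check, which does not require calling one of the package's functions, is to look at the search path. This is a minimal sketch: is_attached() is a hypothetical helper, not part of {librarian}, and {stats} is used only because it ships with R.

```r
# Hypothetical helper: TRUE if the package is attached to the search path
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

# {stats} is attached by default in a standard R session
is_attached("stats")

# A package that is not attached (or does not exist) gives FALSE
is_attached("notARealPackage123")
```

After running librarian::shelf(ggplot2), is_attached("ggplot2") should return TRUE if the package was installed and loaded correctly.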

skanskan commented 3 years ago

Hello.

How can I load the package only if it's not already loaded?

This can be useful for example when running script files, changing something and running it again... In order to save some time you can load all needed packages before, and later just skip them because they are already loaded.

AntoineSoetewey commented 3 years ago

Hello,

Both the code in this section and the p_load() function from the {pacman} package check whether a package is installed; if it is not, they attempt to install it, and then they load it.

Unfortunately, I am not aware of any function or code which also checks that a package is already loaded before loading it.

However, from my personal experience (which may be different than yours depending on the number of packages you use), I see that installing packages takes time, but loading them (even if they have already been loaded) does not take much time.

Does that take a long time on your side? If yes, perhaps a workaround would be to edit my code so that it only loads the packages that are not already loaded. But I don't have any quick/easy solution.

Hope this helps.

Regards, Antoine

skanskan commented 3 years ago

Maybe the way is to use setdiff() with the list of needed packages and the output of (.packages()). Would that be a proper way to do it?

AntoineSoetewey commented 3 years ago

Good question.

I'm not an expert in package management, so I cannot answer your question.

If you feel like it, give it a try and let me know if it works!

Regards, Antoine
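For reference, the setdiff() idea suggested above could look like the following minimal sketch. The needed vector is hypothetical and uses base packages so the snippet runs without installing anything. Note that, as far as I know, library() does not re-attach a package that is already attached, so the time saved is likely to be small.

```r
# Hypothetical list of packages the script needs
needed <- c("stats", "utils", "tools")

# .packages() returns the names of the currently attached packages,
# so setdiff() keeps only the needed packages that are not attached yet
not_attached <- setdiff(needed, .packages())

# Attach only those; nothing happens if not_attached is empty
invisible(lapply(not_attached, library, character.only = TRUE))
```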

Ashish-Soni08 commented 3 years ago

Most efficient way {pacman} package

After this article was published, a reader informed me about the {packman} package <- ### minor edit here: packman -> pacman

AntoineSoetewey commented 3 years ago

Corrected.

Thank you Ashish!

skanskan commented 3 years ago

In fact I think it was me :)

wiesehahn commented 3 years ago

Thanks for this post and the comments. I wanted to set up a header template as an RStudio snippet and implement a good way to (install and) load packages. I will for sure include renv; with renv::restore(), packages are installed but not loaded. I would very much appreciate an article update on best practices with renv.

AntoineSoetewey commented 3 years ago

Dear Jens,

Thanks for your comment.

Unfortunately I am not familiar with restore() from the renv package, so I am not the right person to give you the best practices regarding this function.

Good luck in your research!

Regards, Antoine