maurolepore / pkg6w

R Packages: What, why, who, where, when and how?
GNU General Public License v3.0
0 stars 0 forks source link

The Six W's About R packages

Many R users struggle with R packages. The problem is largely because the useful information is scattered or burried in text written mainly for developers, the fairly experienced fellows who create R packages (e.g. the R Packages book by Hadley Wickham, and these slides by Jenny Bryan). This document helps R users to understand and use R packages by briefly answering the Six W's about R packages, targetting specifically an imaginary R user with no background.

What Is An R Package?

An R package is a kind of Inflatable Castle: A castle that inflates. An Inflatable Castle can be big and red, but it can't have a massive hole -- the hole won't let it inflate, so if your castle has a massive hole it is no longer an Inflatable Castle. (Sad, I know). Similarly, an R package is a very specific collection of files and folders. To meet the definition of "R package", the names and structure of those files and folders have to meet some expectations. If you doon't meet them, you have files and folders but not an R package.

Source Versus Built

Can you have a deflated Inflatable Castle? Yes you can. The deflated Inflatable Castle isn't good for jumping on, but it is good for other things, like cleaning it. Similarly, to clean or maintain an R package you need its "deflated" or source state. The source of a package has all you need to build it but you can't use it for its main purpose until you actually build it. This built, "inflated", state of an R package is what you use to have fun.

The Useful Stuff In An R Package

(What is each of these items? Link to examples)

Why Is An R Package Useful?

An R package is useful because it makes it easy to share and use the functions (computer programs) and/or the data that it contains, and the documents that explain how to use those functions (See What Is An R Package > The Useful Stuff In An R Package xxx_add_link). An R package makes all its components available from one place: The R console. Thus R packages help you to find what you need quickly. Thay way you can spend most of your time working on whatever you want to accomplish rather than on the mundane task of searching for auxiliary information scattered in your computer or online. Having everything you need structured as an R package is convenient.

Who Creates And Uses An R Package?

Anyone can create an R package. Simply put, you create a package if you put it together. Generally that means that you would also fix any issues and update the package. For a formal definition of the different roles you can take in relation an R package see the help file of the function person() (click th link or run this the R console: ?person).

Also anyone can use an R package; or more precisely, everyone must use R packages. Why? Because almost everything you do in R uses code that lives in an R package. That is true for even the simplest computations: For example, if you run in the R console 1 + 2 you are using the base package. Confirm this by running in the R console ?Arithmetic or ?+`` (or click here). That help file refers to the base package in two places: at the very top it says "Arithmetic {base}"; and at the very bottom it says "[Package base version ... Index]". The package base -- and a few others -- come with R. With those default packages you can do a lot of things but often you may want to use other packages to extend the basic functionality of R.

Where does an R package live?

An R package can live locally in your computer, or remotely in a number of websites such as CRAN [GitHub] and (https://github.com/). More specifically, an R package always lives inside a folder -- the root directory -- that contains all its components.

Notice that we commonly use the word "repo" to refer to a folder in an onilne repository (with or without code). So if someones says "go to the repo of package X" they are directing you to the website where you can see the folder containing the package X.

When Is An R Packages Created, Shared, Used, And Updated?

When Are R Packages Created

You create an R package in your computer whenever you combine its fundamental components (which are surprisingly few). Simply, that is when you run devtools::create("name-of-your-package"), or when you click these menues in RStudio: File > New Project ... > New Directory > R Package.

When Are R Packages Shared

You share an R package whenever you allow at least one more person to access it. For example:

Notice that you can share an R package in any state of development. How can your colleague know if the code is still in draft stage or it is polished and safe to use? The stable snapshots (versions) of your code you can flag with a label Release. When you publish on CRAN a snapshot of your code, that snapshot is associated to a version number that represents that particular release. But CRAN is not the only platform that allows that. You can also release snapshots of your code even if it lives only on more informal repositories. For example, here are the instructions for Creating Releases on GitHub.

When Are R Packages Used

When you start a new R session you may need to set things up by installing and loding (opening) some packages. The steps you follow to use an R package or any other software are alike. If you want to use Chrome your neet to first install Chrome, once, and then open Chrome, every time you want to use it. Similarly, if you want to use the R package pkg you need to first install pkg, once, and then load it every time you want to use it.

That is always the case except for a few packages that R loads automatically for you every time you start a session. You can see what those packages are with print(.packages()). You will notice that the package base is listed, so you can directly use its objects (functions and data) -- for example sum(). If these packages aren't enough you need to tell R that you want to use additional packages.

To tell R that you need objects from an additional package you can use two ways. Suppose you need the function fun() from the package pkg; these are your options:

  1. Tell R once by running library(pkg) and then use fun() directly.
  2. Tell R every time you need fun() that it belongs to the package pkg by using the sintax pkg::fun(). The first approach saves space and effort; the second one expresses provenance more clearly and avoids conflict with objects named identically but from other packages. Thus the second approach is safest but the first one is often enough. Your best choice depends on the situation.

When Are R Packages Updated

Alive software is dynamic and R packages are no exception. While users can access a stable -- released -- version of an R package, developers may continue to change a copy of it (branch) in preparation for the next release. There is no such a thing as a finished R package; the closest to that is a dead one.

Occationally you update an R package and realize that it changed in a way that you dislike. Are you stuck with the new version? Not at all. The packages devtools and remotes allow you to install older versions from CRAN, GitHub, and other repositories with install_version() and also with install_REPOSITORY() using the sintax install_REPOSITORY("USER-OR-ORGANIZATION/PACKAGE@VERSION"), for example: install_github("r-lib/httr@v0.4").

How To?

How To Find An R Package?

Finding good R packages is like finding good movies or books. These are some things you may try:

How To Judge An R Package?

xxx consider these links:

https://simplystatistics.org/2015/11/06/how-i-decide-when-to-trust-an-r-package/

http://www.masalmon.eu/2017/12/11/goodrpackages/

To predict how much pleasure or pain an R package will give you, consider these questions:

How To Install An R Package

xxx adapt blog post in fgeo about Installing Packages from GitHub. Extend it to include this subsection and also to cover uninstalling a package, checking what packages are installed and loaded, and how to use a package.

How To Uninstall An R Package

You can uninstall an package with remove.packages(); that is regardless of where you installed it from (CRAN, GitHub or other repository). If remove.packages() fails, you can apply brute force and remove the package manually by deleting the directory where R stores it. You can find that directory with .libPaths().

How To Explore Installed and Loaded Packages

You can see what packages you have installed with installed.packages(), and you can see what packages you have loaded ("opened") with .packages(). .packages() returns invisibly, meaning that to see the output on the console you need to explicitely print() it or assign it to an object and then print the object as you would normally do.

# Shows nothing because returns invisibly
.packages()

# Shows what you want
print(.packages())

# Same
x <- .packages()
x

You can also see the loaded packages with search(), but the output will not be exclusively packages (see ?search()).

search()

identical(search(), .packages())

How To Learn About An R Package

How To View The Functions Index

Once you have installed a new package you may want to know what you can do with it. A good place to start if by looking at the functions' index with help(package = "package-you-want-to-know-about"). You can also list all the objects of a package with the autocompletion feature of RStudio -- type package:: and press the tab key.

![](https://i.imgur.com/m1iWDhM.png)

How To Browse Vignettes

If an R package contains a vignette, you can find it with browseVignettes("tidyverse"). Also, vignette() lists vignettes of all packages with vignettes, and vignette(all = FALSE) does the same but only for the packages you have attached (those that come by default with R or that you "opened" with library(name-of-a-package)).

How To Find Help Files

Thesa are three ways by which you can access the help file of a specific function:

  1. With ?name-of-the-function, or, ?name-of-the-function(), or `help(name-of-the-function"). The package has to be loaded where the function comes from; for example:
library(tidyverse)
?tidyverse_update()
  1. From RStudio, step on the name of a function then press F1 on your keyboard.

  2. From your web browser, searh a function on Google or at https://www.rdocumentation.org/.

How To Find The Source Code Of An R Package

Occationally you may need the source code of an R package. That is the case, for example, when you want to better understand how a function works. Here are some ways to access the source code.

If you have already installed the package, you can run the bare name of specific functions on the R console. For example, here is the source code of the function enlist() from the package MASS.

library(MASS)
enlist

To explore online the source code of most packages you can use GitHub. All packages published on CRAN have a mirror at https://github.com/cran/. And if the package is hosted in a public repository on GitHub you can find its source at an address with the format https://github.com/user-or-organization-name/package-name. The functions always like in the directory R/ (consider using the powerful search options of GitHub).

How to contribute

xxx

How To Report Bugs

How To Ask For Help

Summary

Workflow

xxx Compose a paragraph with the key content of each of the six w's.

Resources