This is a refresher and/or crash course on the essentials of building packages
for the R language and programming environment,
complete with the documentation necessary for distributing these packages on
GitHub and CRAN.
Rather than aiming to build a perfect R package, this tutorial provides only
the minimal details necessary for building a functional one. The main goal is
to create a repository for custom functions, complete with the documentation
needed to make the package useful to others, and to publish the package on
CRAN. The result should be a package of custom functions that you can rely on
to save time and improve the reproducibility of your data analyses.
Why Package R Code?
Just like many other programming languages for scientific computing, R makes
use of a modular system for distributing code, which makes the creation and
maintenance of specialized R code both organized and manageable.
In R, a variety of especially useful tools have been developed to ease the
transition from writing custom functions to building packages consisting of
customized R code (and distributing the resulting packages).
Custom R code allows both the fundamental capabilities and the idiosyncratic
behavior of R to be modified, and a package makes this code more easily
accessible to you and to others who may find it useful.
Step I: Necessary Tools and Dependencies
In R, install the packages devtools and roxygen2, like so:
install.packages(c("devtools", "roxygen2"))
The package roxygen2 is necessary for generating proper documentation for
the package manual.
The package devtools provides numerous utilities that make building packages
significantly easier, including devtools::document(), devtools::build(),
and devtools::check(), to name but a few.
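As a minimal sketch of the installation step, the snippet below installs only the packages that are not already present. The helper name missing_packages() is hypothetical (it is not part of base R or devtools), and the install call is guarded by interactive() as an assumption about how you would want a script to behave.

```r
# Hypothetical helper: return the names of packages that are not yet installed.
# Note that requireNamespace() loads the namespace of any package it finds.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("devtools", "roxygen2"))

# Install only in an interactive session, so that sourcing this file from a
# script never triggers a download unexpectedly.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```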
Step II: Building the Package Repository
Navigate to the parent directory of the package you would like to create
(using cd DIR in a shell or setwd("DIR") in R).
Build the skeleton for the new package -- generating a new directory in the
process -- using the R command devtools::create("MYPKG").
The above will generate a directory with: (1) a subdirectory R/ for R
code, (2) a file NAMESPACE, which will (later) be populated with the names of
exported functions and imported dependencies, (3) a file DESCRIPTION for
required package metadata, and (4) an RStudio project file MYPKG.Rproj.
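The DESCRIPTION file is stored in the Debian control file (DCF) format, with one "Field: value" entry per field. As a rough illustration of what that metadata looks like, the sketch below writes and reads such a file with base R's write.dcf() and read.dcf(). Every field value is a placeholder assumption, and the file is written to a temporary directory rather than to a real package.

```r
# Placeholder metadata; every value here is an assumption for illustration.
fields <- data.frame(
  Package = "MYPKG",
  Title = "Custom Functions for Data Analysis",
  Version = "0.0.0.9000",
  Description = "A collection of custom helper functions.",
  License = "GPL-3",
  stringsAsFactors = FALSE
)

# Write to a temporary file so that no real package is touched.
desc_file <- file.path(tempdir(), "DESCRIPTION")
write.dcf(fields, file = desc_file)

# read.dcf() returns a character matrix with one column per field.
desc <- read.dcf(desc_file)
desc[1, "Package"]
desc[1, "Version"]
```

In a real package you would edit the DESCRIPTION generated by devtools::create() by hand rather than writing it from R.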
Next, navigate into the package directory and set it up as a Git repository
using git init (note: this is not strictly necessary but is a good
practice for any project).
Make sure to regularly use Git version control with the repository
contents (git add, git commit, git push) along with GitHub, as the
latter will provide public access to the package revision history via
GitHub's site.
Step III: Custom Functions and Documentation
Following one of several style guides, write custom functions in .R files in
the R/ subdirectory of the package; best practice is to group thematically
related functions in the same .R file.
In particular, I recommend following the stylistic advice in Hadley Wickham's
comprehensive book R packages (this book
also contains a wealth of other information and tips for writing R packages).
Note that documentation must be added immediately above each function
definition in the .R files, as comment lines starting with #' in the format
required by the roxygen2 package; this saves time in the long run by allowing
auto-generation of manual pages.
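As a minimal sketch of what such documentation looks like, here is a small function with roxygen2 comments above it. The function rescale01() is a hypothetical example, not something from this tutorial; the tags used (@param, @return, @examples, @export) are standard roxygen2 tags.

```r
#' Rescale a numeric vector to the unit interval
#'
#' Linearly maps the smallest value of a numeric vector to 0 and the largest
#' value to 1.
#'
#' @param x A numeric vector with at least two distinct values.
#' @param na.rm Logical; should missing values be ignored when finding the
#'   minimum and maximum?
#'
#' @return A numeric vector of the same length as \code{x}.
#'
#' @examples
#' rescale01(c(1, 2, 3))
#'
#' @export
rescale01 <- function(x, na.rm = FALSE) {
  stopifnot(is.numeric(x))
  rng <- range(x, na.rm = na.rm)
  (x - rng[1]) / (rng[2] - rng[1])
}
```

Saved as, say, R/rescale01.R, this would yield one manual page and one export entry in NAMESPACE when the documentation is generated.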
Step IV: Unit Testing with testthat + devtools
Formal automated testing of code is an important step in ensuring that work
is reproducible - specifically, unit testing helps ensure that code contains
fewer bugs, is more robust, and is structured more clearly.
To begin writing unit tests for a package, in the package directory, run
devtools::use_testthat() (in current releases of devtools this function has
moved to the usethis package, so call usethis::use_testthat() instead). This
will create a subdirectory tests/testthat to store individual unit tests for
each function, as well as a file tests/testthat.R to perform all tests when
running R CMD check.
Write individual test files for each function in the package, with multiple
test_that statements checking various use cases. For advice on
organizing/writing tests, see this helpful chapter by Hadley
Wickham.
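A test file along these lines might look like the sketch below. The function rescale01() is a hypothetical example defined inline so that the snippet is self-contained; in a package it would live in the R/ subdirectory, and only the test_that() calls would go in a file such as tests/testthat/test-rescale01.R. The requireNamespace() guard is only there so the snippet can be sourced on a machine without testthat.

```r
# Hypothetical function under test; in a package this lives in R/, not here.
rescale01 <- function(x, na.rm = FALSE) {
  stopifnot(is.numeric(x))
  rng <- range(x, na.rm = na.rm)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Contents that would go in tests/testthat/test-rescale01.R.
if (requireNamespace("testthat", quietly = TRUE)) {
  library(testthat)

  test_that("rescale01 maps the minimum to 0 and the maximum to 1", {
    expect_equal(rescale01(c(1, 2, 3)), c(0, 0.5, 1))
    expect_equal(range(rescale01(c(5, 10, 20))), c(0, 1))
  })

  test_that("rescale01 rejects non-numeric input", {
    expect_error(rescale01("a"))
  })
}
```

Each test_that() block groups the expectations for one behavior, so a failure message points directly at the use case that broke.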
After writing appropriate tests for each function in the tests/testthat
subdirectory, ensure that all tests are working by using devtools::test().
Repeat the above step as necessary to remove any problems brought to light in
the testing process. Once devtools::test() runs successfully without
catching any errors, move on to the final steps of building and releasing the
package.
Step V: Documentation and Building the Package
Once all desired custom functions, and proper comments for documentation,
have been set up in the .R files in the R/ subdirectory, use
devtools::document() to generate package documentation and manual.
The use of devtools::document() will generate (1) a subdirectory man/ for
manual pages (.Rd files), and (2) a number of .Rd files (one for each
function), all of which may be found in the man/ subdirectory.
After the documentation has been properly generated, the package can be
built and tested: in R, use devtools::build() while in the main package
directory; this will produce a compressed (.tar.gz) bundle of the package in
the parent directory (this can also be done from the command line by running
R CMD build MYPKG in the parent directory).
To ensure that the package is working appropriately, use either (1)
R CMD check MYPKG.tar.gz on the built version of the package (the file
produced by the build step also carries the package version in its name); or
(2) while in the package directory, from R, run devtools::check().
Ensure that the package is working as intended by resolving all issues marked
as WARNING or ERROR in the results produced by running the check.
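The document, build, and check sequence above can be sketched as a single helper. The name build_and_check() is hypothetical; the sketch requires devtools and assumes that the object returned by devtools::check() exposes errors and warnings components, as it does in recent devtools releases.

```r
# Hypothetical wrapper around the Step V workflow; requires devtools.
build_and_check <- function(pkg = ".") {
  devtools::document(pkg)          # regenerate man/*.Rd and NAMESPACE
  tarball <- devtools::build(pkg)  # returns the path to the .tar.gz bundle
  results <- devtools::check(pkg)  # runs R CMD check on the package

  # Assumption: the check result carries character vectors of problems.
  list(
    tarball = tarball,
    errors = results$errors,
    warnings = results$warnings
  )
}

# Usage, from the package directory:
# status <- build_and_check()
# length(status$errors) == 0 && length(status$warnings) == 0
```

Returning the problems as a list, rather than only printing them, makes it easy to stop a script when anything marked WARNING or ERROR remains.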
Step VI: Publish the Package to GitHub and CRAN
Assuming that the repository has been pushed to GitHub, the package will be
available from there, and may be installed using
devtools::install_github("USER/REPO") within R.
Submit the package to CRAN (this
can also be done with devtools::submit_cran() in R); after it is accepted,
the package will be available for download with install.packages("MYPKG").
Useful Commands for Building/Publishing R Packages
devtools::create("MYPKG") - generates a package skeleton as described above.
devtools::document() - generates package documentation using the roxygen2
style comments preceding each function in the various .R files.
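devtools::use_build_ignore("FILES") - adds named files to .Rbuildignore
with proper syntax, so that they are left out of the built package. This is
necessary for files that R CMD check would otherwise flag as non-standard.
In current releases, call usethis::use_build_ignore() instead.
devtools::use_testthat() - adds a subdirectory tests/testthat for writing
individual tests and a file tests/testthat.R to run all tests when
R CMD check is used. In current releases, call usethis::use_testthat()
instead.
devtools::use_travis() - adds a .travis.yml config file to the repository
to be used with Travis CI. This helper also moved to usethis
(usethis::use_travis()) and has since been deprecated there.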
devtools::test() - runs all of the available tests that are present in the
subdirectory tests/testthat to ensure that any functions with tests are
working as intended.
devtools::check() - builds the package and performs necessary checks to
ensure that everything is running smoothly (or points out errors). This is a
bit more thorough than running R CMD check by hand, since by default it also
regenerates the documentation first.
devtools::build() - generates the package manual and compiles other
necessary aspects, ultimately resulting in a zipped (.tar.gz) package file.
devtools::build_win() - builds and submits the package to CRAN win-builder
for checking, with a status report generated roughly 20 minutes later. This
conveniently checks against r-devel. In current releases of devtools, use
devtools::check_win_devel() instead.
devtools::release() - builds the package, performs R CMD check, asks
various questions, then uploads the bundle to CRAN. Preferable to
devtools::submit_cran() since this is more thorough.
devtools::submit_cran() - builds and submits the package to CRAN, avoiding
the (annoying) web submission form.
R CMD build MYPKG - (from the command line) builds the package when run in
the parent directory, generating a zipped (.tar.gz) package file.
R CMD check MYPKG.tar.gz - (from the command line) runs necessary checks on
a built package, pointing out any warnings and errors that need correction.
R CMD check --as-cran MYPKG.tar.gz - (from the command line) runs checks similar to
the above but with additional requirements specific to CRAN that are necessary
for successful submission.
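On the "proper syntax" that .Rbuildignore expects: each line of that file is a Perl-style regular expression matched against file paths, which is why a bare file name should be escaped and anchored. The sketch below shows one way to do that by hand; the helper name as_build_ignore_pattern() is hypothetical and is not part of devtools or usethis.

```r
# Hypothetical helper: escape literal dots and anchor each file name so that
# the pattern matches that file and nothing else.
as_build_ignore_pattern <- function(files) {
  paste0("^", gsub("\\.", "\\\\.", files), "$")
}

patterns <- as_build_ignore_pattern(c("MYPKG.Rproj", ".travis.yml"))
patterns

# The first pattern matches its own file name, but not a near miss in which
# the dot is replaced by another character.
grepl(patterns[1], "MYPKG.Rproj", perl = TRUE)
grepl(patterns[1], "MYPKGxRproj", perl = TRUE)
```

Without the escaping, the dot would match any character, and without the anchors the pattern could exclude files whose names merely contain the given name.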
Last Updated: 26 June 2016
Further reading/resources
Hadley Wickham's book R packages
Karl Broman's minimal "R Package Primer"
Hilary Parker's "Writing an R Package from Scratch"