nhejazi / talk_rpkgs_intro

:speech_balloon: Talk and Tutorial: "Developing R Packages: 'Good Enough' Practices for Applied Statistics"

R package digest #3

nhejazi commented 4 years ago

Last Updated: 26 June 2016

This is a refresher and/or crash course on the essentials of building packages for the R language and programming environment, complete with the documentation necessary for distributing these packages on GitHub and CRAN.

Rather than aiming to build a perfect R package, this tutorial aims to provide only the minimal details necessary for building a functional package. The main goal is to create a repository for custom functions, complete with necessary documentation to make the package useful to others, and to publish the package on CRAN. This tutorial should result in a package that is a collection of custom functions that can be relied on to save you time and improve the reproducibility of your data analytic endeavors.


Why Package R Code?


Step I: Necessary Tools and Dependencies


Step II: Building the Package Repository

  1. Navigate to the parent directory of the package you would like to create (using cd DIR in a shell or setwd("DIR") in R).

  2. Build the skeleton for the new package -- generating a new directory in the process -- using the R command devtools::create("MYPKG").

    • The above will generate a directory with: (1) A subdirectory R/ for R code, (2) A file NAMESPACE which will (later) be populated with function and requirement names, (3) A file DESCRIPTION for required package meta-data, and (4) An RStudio project file MYPKG.Rproj.
  3. Next, navigate into the package directory and set it up as a Git repository using git init (note: this is not strictly necessary but is a good practice for any project).

    • Make sure to regularly use Git version control with the repository contents (git add, git commit, git push) along with GitHub, as the latter will provide public access to the package revision history via GitHub's site.
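
The steps above can be sketched from within R roughly as follows. This is only an illustration under stated assumptions: devtools is installed, the git command-line client is on the PATH, "MYPKG" is a placeholder name, and a temporary directory stands in for your own parent directory.

```r
# Sketch only: assumes devtools is installed and git is on the PATH.
# "MYPKG" is a placeholder package name.
parent_dir <- tempdir()                      # stand-in for your own parent directory
pkg_path <- file.path(parent_dir, "MYPKG")

# Step 2: create the package skeleton (R/, DESCRIPTION, NAMESPACE)
devtools::create(pkg_path)

# Step 3: initialize a Git repository inside the new package directory
old_wd <- setwd(pkg_path)
system2("git", "init")
setwd(old_wd)

# Inspect what was generated (including hidden files such as .git)
list.files(pkg_path, all.files = TRUE, no.. = TRUE)
```

From here, the usual git add / git commit / git push cycle applies once a GitHub remote has been configured.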

Step III: Custom Functions and Documentation


Step IV: Unit Testing with testthat + devtools

  1. Formal automated testing of code is an important step in ensuring that work is reproducible. Specifically, unit testing helps catch bugs early, makes code more robust to later changes, and encourages a clearer structure built from small, testable functions.

  2. To begin writing unit tests for a package, in the package directory, run devtools::use_testthat() (in devtools 2.0 and later this function has moved to the usethis package, so the call is usethis::use_testthat()). This will create a subdirectory tests/testthat to store individual unit tests for each function, as well as a file tests/testthat.R to perform all tests when running R CMD check.

  3. Write individual test files for each function in the package, with multiple test_that statements checking various use cases. For advice on organizing/writing tests, see this helpful chapter by Hadley Wickham.

  4. After writing appropriate tests for each function in the tests/testthat subdirectory, ensure that all tests are working by using devtools::test().

  5. Repeat the above step as necessary to remove any problems brought to light in the testing process. Once devtools::test() runs successfully without catching any errors, move on to the final steps of building and releasing the package.
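
As a rough illustration of what one such test file might contain, here is a minimal sketch. The function rescale01() and the file name tests/testthat/test-rescale01.R are hypothetical and not part of the original tutorial; the function is defined inline only so that the example is self-contained, whereas in a real package it would live in the R/ subdirectory.

```r
# Sketch of a file such as tests/testthat/test-rescale01.R.
# rescale01() is a hypothetical function used only for illustration.
library(testthat)

rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

test_that("rescale01 maps values onto the unit interval", {
  out <- rescale01(c(2, 4, 6))
  expect_equal(out, c(0, 0.5, 1))
})

test_that("rescale01 keeps missing values missing", {
  out <- rescale01(c(1, NA, 3))
  expect_true(is.na(out[2]))
  expect_equal(out[c(1, 3)], c(0, 1))
})
```

Each test_that block groups the expectations for a single behavior, so a failure reported by devtools::test() points at one specific use case.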


Step V: Documentation and Building the Package

  1. Once all desired custom functions, and proper comments for documentation, have been set up in the .R files in the R/ subdirectory, use devtools::document() to generate package documentation and manual.

  2. The use of devtools::document() will generate (1) a subdirectory man/ for manual pages (.Rd files), and (2) a number of .Rd files (one for each documented function), all of which may be found in the man/ subdirectory.

  3. After the documentation has been properly generated, the package can now be built and tested: in R, use devtools::build() while in the main package directory; this will produce a compressed source tarball of the package (a .tar.gz file whose name includes the version number) in the parent directory (this can also be done from the command line with R CMD build MYPKG).

  4. To ensure that the package is working appropriately, use either (1) R CMD check MYPKG_<version>.tar.gz on the built version of the package, where <version> is the version number listed in DESCRIPTION; or (2) while in the package directory, from R, run devtools::check().

  5. Ensure that the package is working as intended by resolving all issues marked as WARNING or ERROR in the results produced by running the check.
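
The documentation and build steps above can be sketched as follows. This assumes devtools and roxygen2 are installed; the package name "MYPKGDOC" and the function rescale01() are hypothetical, and the package is created in a temporary directory purely for illustration.

```r
# Sketch only: assumes devtools and roxygen2 are installed.
pkg_path <- file.path(tempdir(), "MYPKGDOC")   # hypothetical throwaway package
devtools::create(pkg_path)

# A hypothetical function with roxygen2-style comments, written into R/
writeLines(c(
  "#' Rescale a numeric vector to the unit interval",
  "#'",
  "#' @param x A numeric vector.",
  "#' @return A numeric vector with values between 0 and 1.",
  "#' @export",
  "rescale01 <- function(x) {",
  "  rng <- range(x, na.rm = TRUE)",
  "  (x - rng[1]) / (rng[2] - rng[1])",
  "}"
), file.path(pkg_path, "R", "rescale01.R"))

# Generate man/rescale01.Rd and update NAMESPACE from the comments
devtools::document(pkg_path)

# Build the source tarball; build() returns the path to the .tar.gz file
tarball <- devtools::build(pkg_path)

# devtools::check(pkg_path)   # full R CMD check; slow, so left commented out
```

Running the commented-out devtools::check() call on a bare skeleton like this one will likely report issues (for example, about placeholder fields in DESCRIPTION), which is the kind of output step 5 asks you to resolve.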


Step VI: Publish the Package to GitHub and CRAN

  1. Assuming that Git was used with the repository and that it has been pushed to a public GitHub repository, the package will be available from GitHub, and may be installed using devtools::install_github("USER/REPO") within R.

  2. Submit the package to CRAN (this can also be done with devtools::submit_cran() in R); after it is accepted, the package will be available for download with install.packages("MYPKG").


Useful Commands for Building/Publishing R Packages


Further reading/resources