virtualstaticvoid / heroku-buildpack-r

Heroku buildpack for R - Makes deploying R on Heroku easy
MIT License

Heroku Buildpack: R

This is a Heroku Buildpack for applications which use R for statistical computing and CRAN for R packages.

The buildpack supports the heroku-18[^18support], heroku-20[^20support] and heroku-22[^22support] stacks.

It also includes support for the Packrat and renv package managers, and the Shiny and Plumber web application frameworks.

Usage

The buildpack's name is vsv/heroku-buildpack-r. Provide it when creating your application on Heroku as follows:

heroku create --buildpack vsv/heroku-buildpack-r

You can add it to an existing application using the buildpacks:add command, as follows:

heroku buildpacks:add vsv/heroku-buildpack-r

Alternatively, you can use the Git URL of this repository, together with the branch name.

https://github.com/virtualstaticvoid/heroku-buildpack-r.git#main

The buildpack detects that your application uses R if it has one or more recognised files in the project directory, such as init.R, packrat/init.R or renv/activate.R.

If the init.R file is provided, it is executed to install any R packages. If a packrat/init.R or renv/activate.R file is found, the respective package manager is bootstrapped and its packages are installed.

See the detect script for the full matching logic.

Installing R Packages

The init.R file is used to install R packages as required.

NOTE: Using either Packrat or renv is a better way to manage your package dependencies and their respective versions, so the init.R file isn't required if you use Packrat or renv.

The following example init.R file can be used. Provide the package names you want to install to the my_packages list variable:

# init.R
#
# Example R code to install packages if not already installed
#

# Add further package names to this vector as needed
my_packages <- c("package_name_1", "package_name_2")

install_if_missing <- function(p) {
  # Install only when the package is not already in the library
  if (!(p %in% rownames(installed.packages()))) {
    install.packages(p, clean = TRUE, quiet = TRUE)
  }
}

invisible(sapply(my_packages, install_if_missing))

R packages can also be installed by providing a .tar.gz package archive file, if a specific version is required, or it is not a publicly published package. See local-packages for an example.

# init.R
#
# Example R program to install a package from a local path
#

install.packages("PackageName-Version.tar.gz", repos=NULL, type="source")

NOTE: The path to the package archive should be relative to the project root directory, so that it works both locally in development and during deployment on Heroku.

R Package Installation Helper

For convenience, the buildpack includes an R helper function, helpers.installPackages, to make installing packages easier.

Thus the init.R file can be reduced to a single line of R code as shown. Provide the package names you want to install as arguments to the helper:

helpers.installPackages("package_name_1", "package_name_2", ...)

Installing Binary Dependencies

This version of the buildpack still supports the use of an Aptfile for installing additional system packages. However, this functionality will be deprecated in future, as it isn't a foolproof solution.

It is based on the same technique as used by the heroku-buildpack-apt buildpack to install Ubuntu packages using apt-get.

There are various technical and security reasons why it is no longer recommended, so your mileage may vary.

If any of your R packages depend on system libraries which aren't included by Heroku, such as libgmp, libgomp, libgdal, libgeos and libgsl, you should use the Heroku container stack together with heroku-docker-r instead.

R Applications

Heroku Console

You can run the R console application as follows:

$ heroku run R ...

Type q() to exit the console when you are finished.

You can also run the Rscript utility as follows:

$ heroku run Rscript ...

Note that the Heroku slug is read-only, so any changes you make during the session will be lost.

Shiny Applications

Shiny applications must provide a run.R file, and can also include an init.R in order to install additional R packages. The Shiny package does not need to be installed, as it is included in the buildpack already.

The run.R file should contain at least the following code, in order to run the web application.

Note that the PORT environment variable, provided by Heroku, is used to configure the port Shiny listens on, and that the host must be 0.0.0.0.

# run.R
library(shiny)

port <- Sys.getenv('PORT')

shiny::runApp(
  appDir = getwd(),
  host = '0.0.0.0',
  port = as.numeric(port)
)

See the virtualstaticvoid/heroku-shiny-app example application.
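When the same run.R is also started locally, PORT is usually not set. The following is a minimal sketch of a fallback, assuming a hypothetical helper named get_port and an arbitrary default port of 8080; neither is provided by the buildpack.

```r
# Hypothetical helper (not provided by the buildpack): read the port from the
# PORT environment variable and fall back to a default for local development.
get_port <- function(default = 8080L) {
  value <- Sys.getenv("PORT", unset = "")
  if (!nzchar(value)) {
    return(as.integer(default))
  }
  port <- suppressWarnings(as.integer(value))
  if (is.na(port)) {
    stop("PORT is not a valid integer: ", value)
  }
  port
}

# Usage inside run.R (requires the shiny package):
# shiny::runApp(appDir = getwd(), host = "0.0.0.0", port = get_port())
```

Failing early on a malformed value keeps a misconfiguration visible at start-up rather than letting the server bind to an unexpected port.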

Plumber Applications

Plumber applications must provide an app.R file, but can also include an init.R in order to install additional R packages. The Plumber package does not need to be installed, as it is included in the buildpack already.

The app.R file should contain at least the following code, in order to run the web application.

Note that the PORT environment variable, provided by Heroku, is used to configure the port Plumber listens on, and that the host must be 0.0.0.0.

# app.R
library(plumber)

port <- Sys.getenv('PORT')

server <- plumb("plumber.R")

server$run(
  host = '0.0.0.0',
  port = as.numeric(port)
)

See the virtualstaticvoid/heroku-plumber-app example application.

Recurring Jobs

You can use the Heroku scheduler to schedule a recurring R process.

An example command for the scheduler to run prog.R would be R --file=prog.R --gui-none --no-save.
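A scheduled script is easier to monitor if it reports failure through its exit status. The following is a rough sketch of how prog.R could be organised; run_job and the example task are hypothetical names, not part of the buildpack.

```r
# prog.R (illustrative): run a task and convert the outcome to an exit status.
# run_job() is a hypothetical helper, not part of the buildpack.
run_job <- function(task) {
  tryCatch(
    {
      task()
      0L  # success
    },
    error = function(e) {
      message("Job failed: ", conditionMessage(e))
      1L  # failure
    }
  )
}

status <- run_job(function() {
  message("Job started at ", format(Sys.time(), "%Y-%m-%d %H:%M:%S"))
  # ... the actual work goes here ...
})

# At the end of prog.R, pass the status on to the calling process:
# quit(save = "no", status = status)
```

The final quit() call is left commented out so the sketch can be sourced in an interactive session without ending it.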

Technical Details

R Versions

The default R version can be overridden by setting the R_VERSION environment variable.

heroku config:set R_VERSION=4.0.0
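To confirm from within R that the running version matches the requested one, a check along these lines could be used. It is a sketch that assumes the R_VERSION config var is visible to the R process as an environment variable; check_r_version is a hypothetical helper.

```r
# Compare the running R version with the R_VERSION variable, if it is set.
# check_r_version() is a hypothetical helper, not part of the buildpack.
check_r_version <- function(expected = Sys.getenv("R_VERSION", unset = "")) {
  if (!nzchar(expected)) {
    return(TRUE)  # nothing to compare against
  }
  getRversion() == package_version(expected)
}

check_r_version()
```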

The following table lists the available combinations of Heroku Stack and R version. They are built periodically as and when the Debian R packages are available.

R / Stack 18 20 22
3.6.3
4.0.0
4.0.5
4.1.2
4.1.3 -
4.2.0 -
4.2.1

Legend:

Slug Compilation vs Runtime use of chroot

This version of the buildpack still uses a fakechroot during slug compilation to compile R packages, which may include C or Fortran code. However, it no longer uses the chroot at runtime. This lets it work better alongside other language buildpacks, such as Python, Ruby or Java, and greatly reduces the slug size.

If you are migrating to this version of the buildpack, you no longer need to prefix commands to use fakechroot, fakeroot or chroot. Wrappers of these commands are included and they will output warning messages to alert you of their use.

Buildpack Binaries

The binaries used by the buildpack are hosted on AWS S3 at https://heroku-buildpack-r.s3.amazonaws.com.

See the heroku-buildpack-r-build2 repository for building the buildpack binaries yourself.

Process Types

The buildpack includes default web and console process types.

The R and Rscript executables are available like any other executable, via the heroku run command.

Procfile

You can include a Procfile in your project if you want to override the default process types and/or their command lines. This is typically required if you are using multiple buildpacks.

For example, the following Procfile defines the commands for the web and console processes.

web: R --file=myprogram.R --gui-none --no-save
console: R --no-save

heroku.yml

You can include the heroku.yml build manifest in your project if you want to override the default process types and/or their command lines, within the run section. This is typically required if you are using multiple buildpacks.

For example, the following heroku.yml file defines the commands for the web and console processes.

run:
  web: R --file=myprogram.R --gui-none --no-save
  console: R --no-save

Paths

Where possible, use relative paths for files. This makes your application more portable, so that it runs the same way locally in development and at runtime on Heroku.

The current directory on Heroku is always /app, and your application is installed to this directory, so relative paths should be relative to the root directory of your project.

If you need to use absolute paths, consider using getwd() together with file.path() to build up the path instead of hardcoding it.
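For instance, an absolute path can be assembled from the working directory as sketched below. The directory and file names are placeholders.

```r
# Build an absolute path from the working directory instead of hardcoding /app.
# "data" and "input.csv" are placeholder names.
data_path <- file.path(getwd(), "data", "input.csv")

print(data_path)
```

On Heroku getwd() returns /app, while locally it returns your project directory, so the same code works in both places provided R is started from the project root.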

.Rprofile

You can include an .Rprofile in your application's root directory and it will be executed at the start of any R process.

It can be used as a convenient way to bootstrap your application, sourcing common utilities or performing configuration tasks.

Please do not use it to install R packages, since it may cause problems during deployment (slug compilation) and will fail at runtime.
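A minimal sketch of such an .Rprofile is shown here. It is limited to configuration and sourcing; the option chosen and the R/utils.R path are illustrative assumptions, not conventions of the buildpack.

```r
# .Rprofile (illustrative): configuration only, no package installation.

# Print warnings as they occur, which tends to be easier to follow in logs.
options(warn = 1)

# Source shared helpers if present. "R/utils.R" is a hypothetical path,
# relative to the project root.
utils_file <- file.path("R", "utils.R")
if (file.exists(utils_file)) {
  source(utils_file)
}
```

The file.exists() guard keeps start-up from failing when the helper file is absent.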

CRAN Mirror Override

It is possible to override the default CRAN mirror used, by providing the URL via the CRAN_MIRROR environment variable.

E.g. Override the URL by setting the variable as follows.

heroku config:set CRAN_MIRROR=https://cloud.r-project.org/

Check the CRAN mirror status page to ensure the mirror is available.
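To use the same override when running R code locally, the variable can be read and applied to the repos option. This is a sketch only: resolve_cran_mirror is a hypothetical helper, the fallback URL is an assumption, and the buildpack's own handling of CRAN_MIRROR may differ.

```r
# Resolve the CRAN mirror from the CRAN_MIRROR environment variable,
# falling back to a default. resolve_cran_mirror() is a hypothetical helper.
resolve_cran_mirror <- function(default = "https://cloud.r-project.org/") {
  mirror <- Sys.getenv("CRAN_MIRROR", unset = "")
  if (nzchar(mirror)) mirror else default
}

# Apply the mirror for subsequent install.packages() calls in this session.
options(repos = c(CRAN = resolve_cran_mirror()))
getOption("repos")[["CRAN"]]
```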

Caching

To reduce deployment time, the buildpack caches the R binaries and installed R packages.

If you need to purge the cache, use the heroku repo:purge_cache command provided by the heroku-repo CLI plugin.

See the purge-cache documentation for more information.

Build Output Verbosity

In previous versions of the buildpack, the full output of install.packages() was emitted during slug compilation, which led to very verbose output and made it hard to spot issues in some instances. Packages are now installed with the quiet=TRUE option set by default.

To restore the previous behaviour, set the PACKAGE_INSTALL_VERBOSE environment variable to a value of 1 before deploying your application:

heroku config:set PACKAGE_INSTALL_VERBOSE=1

To revert the setting use the config:unset command:

heroku config:unset PACKAGE_INSTALL_VERBOSE
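If your init.R calls install.packages() directly, the same flag could be honoured there. The sketch below assumes that PACKAGE_INSTALL_VERBOSE is visible to init.R as an environment variable during the build, which is not confirmed here; install_quietly is a hypothetical helper and not the buildpack's own code.

```r
# Derive the quiet argument from PACKAGE_INSTALL_VERBOSE.
# install_quietly() is a hypothetical helper, not part of the buildpack.
install_quietly <- function() {
  !identical(Sys.getenv("PACKAGE_INSTALL_VERBOSE", unset = ""), "1")
}

# Usage in init.R:
# install.packages("package_name_1", quiet = install_quietly())
```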

Hacking

To enable debug outputs during deployment, set the BUILDPACK_DEBUG environment variable.

heroku config:set BUILDPACK_DEBUG=1

Credits

License

MIT License. Copyright (c) 2020 Chris Stefano. See LICENSE for details.

[^18support]: The Heroku-18 stack is deprecated and will reach end-of-life on April 30th, 2023.
[^20support]: Heroku-20 is based on Ubuntu 20.04. It will be supported through April 2025.
[^22support]: Heroku-22 is based on Ubuntu 22.04. It will be supported through April 2027.