Similar to rstantools for rstan, the instantiate package builds
pre-compiled CmdStan models into CRAN-ready statistical modeling R
packages. The models compile once during installation, the executables
live inside the file systems of their respective packages, and users
have the full power and convenience of CmdStanR without any additional
compilation after package installation. This approach saves time and
helps R package developers migrate from rstan to the more modern
CmdStanR.

The website at https://wlandau.github.io/instantiate/ includes a function reference and other documentation.
The instantiate package depends on the R package CmdStanR and the
command line tool CmdStan, so it is important to follow these stages in
order:

1. Install CmdStanR. CmdStanR is not on CRAN, so the recommended way to
   install it is
   install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos"))).
2. Optional: set the environment variables CMDSTAN_INSTALL and/or
   CMDSTAN to manage the CmdStan installation. See the “Administering
   CmdStan” section below for details.
3. Install instantiate using one of the R commands below.

Type | Source | Command |
---|---|---|
Release | CRAN | install.packages("instantiate") |
Development | GitHub | remotes::install_github("wlandau/instantiate") |
Development | R-universe | install.packages("instantiate", repos = "https://wlandau.r-universe.dev") |
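The stages above can be checked from R before going further. The following is a minimal sketch rather than an official procedure: it only reports which pieces appear to be present and installs nothing. The status vector and its element names are illustrative.

```r
# Report which installation stages appear to be complete.
# Nothing is installed or modified by this script.
has_cmdstanr <- requireNamespace("cmdstanr", quietly = TRUE)
has_instantiate <- requireNamespace("instantiate", quietly = TRUE)

# stan_cmdstan_exists() is only called when both packages are available.
has_cmdstan <- has_cmdstanr && has_instantiate &&
  instantiate::stan_cmdstan_exists()

status <- c(
  cmdstanr = has_cmdstanr,
  instantiate = has_instantiate,
  cmdstan = has_cmdstan
)
print(status)
```

If any element is FALSE, revisit the corresponding stage in the list above before installing packages that depend on instantiate.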
Packages that use instantiate may be published on CRAN. CRAN does not
have CmdStan, so the models are not pre-compiled in the macOS and
Windows binaries. If you install such a package from CRAN, please
install it from source. For example:
install.packages("hdbayes", type = "source")
The instantiate package uses environment variables to manage the
installation of CmdStan. An environment variable is an operating system
setting with a name and a value (both text strings). In R, there are two
ways to set environment variables:

1. Sys.setenv(), which sets environment variables temporarily for the
   current R session.
2. The .Renviron text file in your home directory, which passes
   environment variables to all new R sessions. The edit_r_environ()
   function from the usethis package helps.

By default, instantiate looks for the copy of CmdStan located at
cmdstanr::cmdstan_path(). If you upgrade CmdStan, then the path returned
by cmdstanr::cmdstan_path() will change, which may not be desirable in
some cases. To permanently lock the path that instantiate uses, follow
these steps:

1. Set the CMDSTAN environment variable to the desired path to CmdStan.
2. Set the CMDSTAN_INSTALL environment variable to "fixed".
3. Install instantiate.

Henceforth, instantiate will automatically use the CmdStan path from
(1), regardless of the value of CMDSTAN after (3). To prefer
cmdstanr::cmdstan_path() instead, you could do one of the following:

1. Reinstall instantiate with CMDSTAN_INSTALL not equal to "fixed", or
2. Set CMDSTAN_INSTALL to "implicit" at runtime, or
3. Set the cmdstan_install argument to "implicit" for the current
   instantiate package function you are using.

The following section explains how to create an R package with
pre-compiled Stan models. This stage of the development workflow is
considered “runtime” for the purposes of administering CmdStan as
described previously.
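Setting CMDSTAN_INSTALL at runtime with Sys.setenv() can be scoped to a single piece of code. Here is a minimal sketch that uses only base R. with_cmdstan_install() is a hypothetical helper, not an instantiate function: it sets the variable, evaluates an expression, and restores the previous value afterwards.

```r
# Hypothetical helper (not part of instantiate): evaluate `code` with
# CMDSTAN_INSTALL temporarily set to `value`, then restore the old value.
with_cmdstan_install <- function(value, code) {
  old <- Sys.getenv("CMDSTAN_INSTALL", unset = NA)
  on.exit({
    if (is.na(old)) {
      Sys.unsetenv("CMDSTAN_INSTALL")
    } else {
      Sys.setenv(CMDSTAN_INSTALL = old)
    }
  }, add = TRUE)
  Sys.setenv(CMDSTAN_INSTALL = value)
  force(code)  # `code` is evaluated lazily, so it sees the new value
}

before <- Sys.getenv("CMDSTAN_INSTALL", unset = NA)
seen <- with_cmdstan_install("implicit", Sys.getenv("CMDSTAN_INSTALL"))
after <- Sys.getenv("CMDSTAN_INSTALL", unset = NA)
print(seen)
```

In practice the second argument would be a call into instantiate, for example instantiate::stan_cmdstan_exists(). A persistent setting belongs in .Renviron instead, as a line of the form CMDSTAN_INSTALL=fixed.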
Begin with an R package with one or more Stan model files inside the
src/stan/ directory. stan_package_create() is a convenient way to start.
stan_package_create(path = "package_folder")
#> Example package named "example" created at "package_folder". Run stan_package_configure(path = "package_folder") so that the built-in Stan model will compile when the package installs.
At minimum, the package file structure should look something like this:
fs::dir_tree("package_folder")
#> package_folder
#> ├── DESCRIPTION
#> └── src
#>     └── stan
#>         └── bernoulli.stan
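For illustration, the same minimal layout can be produced by hand with base R. This is a sketch under assumptions: the temporary directory is an arbitrary location, the DESCRIPTION fields are a bare placeholder rather than a complete file, and the Stan program is the Bernoulli example used in this document. stan_package_create() remains the convenient route.

```r
# Build the minimal layout in a temporary directory (assumed location).
pkg <- file.path(tempfile("demo_"), "package_folder")
dir.create(file.path(pkg, "src", "stan"), recursive = TRUE)

# Placeholder DESCRIPTION: a real package needs more fields than these.
writeLines(
  c("Package: example", "Version: 0.0.0.9000", "Title: Example Package"),
  file.path(pkg, "DESCRIPTION")
)

# The Bernoulli model used in this document.
writeLines(
  c(
    "data {",
    "  int<lower=0> N;",
    "  array[N] int<lower=0,upper=1> y;",
    "}",
    "parameters {",
    "  real<lower=0,upper=1> theta;",
    "}",
    "model {",
    "  theta ~ beta(1,1);",
    "  y ~ bernoulli(theta);",
    "}"
  ),
  file.path(pkg, "src", "stan", "bernoulli.stan")
)

# List every file relative to the package root.
files <- list.files(pkg, recursive = TRUE)
print(files)
```

The listing should contain exactly the two files shown in the tree above: DESCRIPTION and src/stan/bernoulli.stan.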
Configure the package so the Stan models compile during installation.
stan_package_configure() writes the required scripts cleanup,
cleanup.win, src/Makevars, src/Makevars.win, and src/install.libs.R.
Inside src/install.libs.R is a call to
instantiate::stan_package_compile(), which you can manually edit to
control how your models are compiled. For example, different calls to
stan_package_compile() can compile different groups of models using
different C++ compiler flags.
fs::dir_tree("package_folder")
#> package_folder
#> ├── DESCRIPTION
#> ├── cleanup
#> ├── cleanup.win
#> └── src
#>     ├── Makevars
#>     ├── Makevars.win
#>     ├── install.libs.R
#>     └── stan
#>         └── bernoulli.stan
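To make the idea of compiling groups of models with different flags concrete, here is a hedged sketch. It assumes that stan_package_compile() accepts a models argument (paths to Stan files) and a cpp_options list that is passed on to CmdStanR; check the function reference before relying on either. The helper names and the "threaded_" file name prefix are hypothetical conventions invented for this example.

```r
# Hypothetical helper: split Stan files into two groups by file name.
split_models <- function(stan_dir) {
  files <- list.files(stan_dir, pattern = "\\.stan$", full.names = TRUE)
  threaded <- grepl("^threaded_", basename(files))
  list(threaded = files[threaded], standard = files[!threaded])
}

# Hypothetical helper: one stan_package_compile() call per group.
# Assumes the `models` and `cpp_options` arguments described above.
compile_in_groups <- function(stan_dir) {
  groups <- split_models(stan_dir)
  if (length(groups$standard) > 0) {
    instantiate::stan_package_compile(models = groups$standard)
  }
  if (length(groups$threaded) > 0) {
    instantiate::stan_package_compile(
      models = groups$threaded,
      cpp_options = list(stan_threads = TRUE)
    )
  }
  invisible(groups)
}

# Demonstrate the grouping step only, on empty placeholder files.
demo_dir <- tempfile("stan_")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("bernoulli.stan", "threaded_big.stan")))
groups <- split_models(demo_dir)
print(lapply(groups, basename))
```

Only the grouping step runs here. In src/install.libs.R, a call such as compile_in_groups() on the directory that holds the Stan files at that point in the script would take the place of the single stan_package_compile() call.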
Install the package just like you would any other R package. To install
it from your local copy of package_folder, open R and run:
install.packages(pkgs = "package_folder", type = "source", repos = NULL)
A user can now run a model from the package without any additional
compilation. See the documentation of CmdStanR to learn how to use
CmdStanR model objects.
library(example)
model <- stan_package_model(name = "bernoulli", package = "example")
print(model) # CmdStanR model object
#> data {
#>   int<lower=0> N;
#>   array[N] int<lower=0,upper=1> y;
#> }
#> parameters {
#>   real<lower=0,upper=1> theta;
#> }
#> model {
#>   theta ~ beta(1,1);  // uniform prior on interval 0,1
#>   y ~ bernoulli(theta);
#> }
fit <- model$sample(
  data = list(N = 10, y = c(1, 0, 1, 0, 1, 0, 0, 0, 0, 0)),
  refresh = 0,
  iter_warmup = 2000,
  iter_sampling = 4000
)
#> Running MCMC with 4 sequential chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.6 seconds.
fit$summary()
#> # A tibble: 2 × 10
#>   variable   mean median    sd   mad     q5    q95  rhat ess_bulk ess_tail
#>   <chr>     <num>  <num> <num> <num>  <num>  <num> <num>    <num>    <num>
#> 1 lp__     -8.15  -7.87  0.725 0.317 -9.60  -7.64   1.00    7365.    8498.
#> 2 theta     0.333  0.324 0.130 0.134  0.137  0.563  1.00    6229.    7560.
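As a sanity check that is not part of the original example: with a beta(1,1) prior and a Bernoulli likelihood, the posterior of theta is Beta(1 + sum(y), 1 + N - sum(y)) by conjugacy. The MCMC summaries can therefore be compared against closed-form values computed in base R. The variable names below are illustrative.

```r
# Same data as in the sampling call above.
y <- c(1, 0, 1, 0, 1, 0, 0, 0, 0, 0)

a <- 1 + sum(y)              # posterior shape1: prior 1 + successes
b <- 1 + length(y) - sum(y)  # posterior shape2: prior 1 + failures

# Closed-form posterior summaries of theta.
exact <- c(
  mean = a / (a + b),
  sd = sqrt(a * b / ((a + b)^2 * (a + b + 1))),
  median = qbeta(0.5, a, b),
  q5 = qbeta(0.05, a, b),
  q95 = qbeta(0.95, a, b)
)
print(round(exact, 3))
```

The posterior here is Beta(4, 8), whose mean of 1/3 and standard deviation of about 0.131 agree with the theta row of the summary up to Monte Carlo error.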
You can write an exported user-side function in your R package to access
the model. For example, you might store this code in an R/model.R file
in the package:
#' @title Fit the Bernoulli model.
#' @export
#' @family models
#' @description Fit the Bernoulli Stan model and return posterior summaries.
#' @return A data frame of posterior summaries.
#' @param y Numeric vector of Bernoulli observations (zeroes and ones).
#' @param `...` Named arguments to the `sample()` method of CmdStan model
#'   objects: <https://mc-stan.org/cmdstanr/reference/model-method-sample.html>
#' @examples
#' if (instantiate::stan_cmdstan_exists()) {
#'   run_bernoulli_model(y = c(1, 0, 1, 0, 1, 0, 0, 0, 0, 0))
#' }
run_bernoulli_model <- function(y, ...) {
  stopifnot(is.numeric(y) && all(y >= 0 & y <= 1))
  model <- stan_package_model(name = "bernoulli", package = "mypackage")
  fit <- model$sample(data = list(N = length(y), y = y), ...)
  fit$summary()
}

- In the DESCRIPTION file, list https://mc-stan.org/r-packages/ in the
  Additional_repositories: field (example in brms). This step is only
  necessary while cmdstanr is not yet on CRAN.

  Additional_repositories:
    https://mc-stan.org/r-packages/

- In the DESCRIPTION and NAMESPACE files, import the instantiate package
  and the function stan_package_model().
- CmdStan is too big for CRAN, so instantiate will not be able to access
  it there. If you plan to submit your package to CRAN, please skip the
  appropriate code in your examples, vignettes, and tests when
  instantiate::stan_cmdstan_exists() is FALSE. Explicit if() statements
  like the one in the roxygen2 @examples above work for examples and
  vignettes. For tests, it is convenient to use testthat::skip_if_not(),
  e.g. skip_if_not(stan_cmdstan_exists()).
- pkgload::load_all() might not compile your models. If you use pkgload
  or devtools to load and develop your package, you may need to call
  instantiate::stan_package_compile() from the root directory of your
  package to compile your models manually.
- To avoid committing compiled model executables, add the following to
  the .gitignore file at the root of your package:

  src/stan/**
  !src/stan/**/*.*
  src/stan/**/*.exe
  src/stan/**/*.EXE

- For continuous integration, use the cmdstanr-based installation as
  explained above, and tweak your workflow YAML files as explained in
  that section.

Please note that the instantiate project is released with a Contributor
Code of Conduct. By contributing to this project, you agree to abide by
its terms.

To cite package ‘instantiate’ in publications use:

  Landau WM (2023). _instantiate: A Minimal CmdStan Client for R Packages_.
  https://wlandau.github.io/instantiate/, https://github.com/wlandau/instantiate.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {instantiate: A Minimal CmdStan Client for R Packages},
    author = {William Michael Landau},
    year = {2023},
    note = {https://wlandau.github.io/instantiate/,
      https://github.com/wlandau/instantiate},
  }