andrjohns / StanEstimators

Estimate Parameters for Arbitrary R Functions using 'Stan'
Other
20 stars 1 forks source link

Request for compatibility with TMB objects #3

Closed Cole-Monnahan-NOAA closed 6 months ago

Cole-Monnahan-NOAA commented 6 months ago

Hi @andrjohns this package looks amazing. I think it would pair extremely well with the newish package RTMB. This package allows the user to write statistical models which return a negative log-likelihood value in R only and then efficiently (using C++) calculates the objective function and gradients using autodiff using the TMB backend.

TMB has a suite of tools that would play nicely with StanEstimators and give the user base an expanded toolset: It can apply the Laplace approximation to arbitrary subsets of parameters, automatically detect and use sparseness (e.g., SPDE models), calculate approximate standard errors for functions of parameters, etc. Most importantly is that the user would not have to specify analytical gradients b/c autodiff is applied to the R function.

Compatibility with RTMB would also help the TMB user base because their models can currently only access Stan functionality (NUTS) through rstan and tmbstan. This would open up a new optimizer option, but also ADVI and pre-specified inverse metrics (metric_file) used by CmdStan but not rstan. I'm particularly intrigued by the last point because TMB can return a "joint precision" matrix (fixed and random effects) for mixed models obtained after optimization and which would make a rational starting place for a dense metric for highly correlated posterior surfaces (see details of the help file here).

It would seem straightforward to pair the two, since RTMB builds an object in R with a function to return the negative log-likelihood and it's gradient with respect to the parameter vector. I would assume one could plug those into the StanEstimator functions but I got an error for both MCMC and optimization. Below is a short reprex based on your simple example built in RTMB.

Could you take a look and see why this might be erroring out? One major difference is that the obj is built using the data in the global workspace. @kaskr do you have any insight why the TMB-derived R functions may not behave the same as those written strictly in R?

It would be great to get these two packages working. Let me know if I can help test or dig further.

Thanks!

## install.packages('RTMB')
## remotes::install_github("andrjohns/StanEstimators")
library(RTMB)
library(StanEstimators)
## simulate data
set.seed(121)
y <- rnorm(500, 10, 2)
inits <- c(0, 5)
## log-likelihood and analytical gradient functions
loglik_fun <- function(v, x) {
  sum(dnorm(x, v[1], v[2], log = TRUE))
}
grad <- function(v, x) {
  inv_sigma <- 1 / v[2]
  y_scaled = (x - v[1]) * inv_sigma
  scaled_diff = inv_sigma * y_scaled
  c(sum(scaled_diff),
    sum(inv_sigma * (y_scaled*y_scaled) - inv_sigma)
  )
}

## equivalent in RTMB
## Try it with RTMB object which takes a list for parameters
pars <- list(v=inits)
negloglik_fun <- function(pars) #-loglik_fun(pars$v,x=y)
  -sum(dnorm(y, pars$v[1], pars$v[2], log = TRUE))
## Make TMB object
obj <- MakeADFun(negloglik_fun, parameters=pars, silent=TRUE)
## objective and gradient at initial values match except sign,
## calculated using C++ and uses autodiff
f <- function(v) -obj$fn(v)
g <- function(v) -as.numeric(obj$gr(v))
## Test the same (yes!)
f(inits)
loglik_fun(inits,y)
g(inits)
grad(inits,y)

## Compare MCMC, optimization, and ADVI
fit_fd <- stan_sample(loglik_fun, inits, additional_args = list(y),
                      lower = c(-Inf, 0), # Enforce a positivity constraint for SD
                      num_chains = 1, seed = 1234)
fit_grad <- stan_sample(loglik_fun, inits, additional_args = list(y),
                        grad_fun=grad,
                        lower = c(-Inf, 0), # Enforce a positivity constraint for SD
                        num_chains = 1, seed = 1234)
fit_tmb <- stan_sample(f, inits,
                       grad_fun=g,
                       lower = c(-Inf, 0), # Enforce a positivity constraint for SD
                       num_chains = 1, seed = 1234)
andrjohns commented 6 months ago

Great to hear that you like the package!

Oh that's a great idea, thanks for the suggestion! I believe the error is caused by the package running each chain in a separate background R process (since multiple chains can't access the same host R session for function evaluation in parallel), and there are functions/objects that need to be exported.

It also looks like the error reporting/handling needs some work, so that a more informative error is thrown when failures happen.

I'll have a look into some different ways of handling this and let you know, thanks again!

Cole-Monnahan-NOAA commented 6 months ago

Yeah we probably would need to build the TMB object inside the separate R processes. I'm not sure how we'd structure the interface in StanEstimators but something like an extra argument to stan_sample called tmblist which is passed to build_stan_call and added to the args output in slot args$tmblist which does nothing when NULL, but when a list does the following inside call_stan_impl at teh beginning

if(!is.null(options_vector$tmblist){
  library(RTMB)
  tmblist <- list(func=negloglik_fun, parameters=pars, datlist=list(y=y))
  attach(tmblist$datlist)
  obj <- do.call(MakeADFun, tmblist)
  ll_fun <- function(v) -obj$fn(v)
  grad_fun <- function(v) -as.numeric(obj$gr(v))
}

These functions would then be built locally and get passed downstream to the Rcpp interface. I don't know what would happen at this call though

internal::ll_fun = Rcpp::Function(ll_fun);

It's probably worth giving @kaskr a chance to think/respond to this. The Rcpp stuff is way over my head unfortunately. I can help implement and test if needed.

andrjohns commented 6 months ago

I've now integrated the future package's methods for identifying and exporting global variables and package namespaces, so this will be handled automatically for most functions.

Unfortunately, the future package does not seem to be able to detect all of the captured variables when the RTMB package is used, so you'll need to specify these explicitly:

library(RTMB)
library(StanEstimators)

set.seed(1234)
y <- rnorm(500, 10, 2)
inits <- c(0, 5)

pars <- list(v=inits)
negloglik_fun <- function(pars) {
  -sum(dnorm(y, pars$v[1], pars$v[2], log = TRUE))
}
obj <- MakeADFun(negloglik_fun, parameters=pars, silent=TRUE)

f <- function(v) { -obj$fn(v) }
g <- function(v) -as.numeric(obj$gr(v))

samp_fd <- stan_sample(f, inits, lower = c(-Inf, 0), grad_fun = g,
                       num_chains = 4, parallel_chains = 4,
                       seed = 1234,
                       globals = list(y = y, obj = obj),
                       packages = "RTMB")

Alternatively, I've also added an eval_standalone argument to all methods, which controls whether to launch a separate R session for the evaluation. If you set this to FALSE, the function will be evaluated in the current R session and no exporting/setup is needed.

The only drawback is that you can't run multiple chains in parallel.

samp_fd <- stan_sample(f, inits, lower = c(-Inf, 0),
                       grad_fun = g,
                       num_chains = 4, parallel_chains = 1,
                       seed = 1234,
                       eval_standalone = FALSE)

Let me know how it all works for you!

andrjohns commented 6 months ago

I've updated the behaviour of all methods to now default to eval_standalone = FALSE, so any RTMB functions will work automatically for optimize/variational/pathfinder/laplace.

For sample, it will default to eval_standalone = FALSE when parallel_chains = 1 (the default)

I'm going to close this issue now since all seems to be working, but thanks for letting me know about RTMB! I'll be (eventually) adding it as an option for an autodiff backend (#4)

Cole-Monnahan-NOAA commented 6 months ago

@andrjohns this is great thank you for putting the time in. It runs locally for me with one chain, but when I run your example with multiple parallel chains (as written above) it runs fine, but errors out at the end. I get this output

Chain 1:  Elapsed Time: 0.342 seconds (Warm-up)
Chain 1:                0.324 seconds (Sampling)
Chain 1:                0.666 seconds (Total)
Error: Warning message:

with the following traceback:

2: stop(paste0(errs, collapse = "\n"), call. = FALSE)
1: stan_sample(f, inits, lower = c(-Inf, 0), grad_fun = g, num_chains = 4, 
       parallel_chains = 4, seed = 1234, globals = list(y = y, obj = obj), 
       packages = "RTMB")

This is using a fresh install of the main branch, which is where it appears you have made these changes. Am I doing something wrong?

andrjohns commented 6 months ago

I can't replicate that locally, but I've just pushed a fix which ignores blank lines returned by the callr stderr. Can you try the current main?

Also, can you try with num_chains = 4, parallel_chains = 1 and see if any warnings/errors are printed when running in the current session?

Cole-Monnahan-NOAA commented 6 months ago

It works with multiple serial changes, but still fails in the same manner with parallel_chains=4. I debugged and found this

debug: stop(paste0(errs, collapse = "\n"), call. = FALSE)
Browse[2]> errs
[1] "Warning message:"                                "package 'RTMB' was built under R version 4.3.2 "

So looks like the warning is converted to an error when loading in the parallel process

andrjohns commented 6 months ago

Ah yes that would do it. Both warnings and errors are sent to stderr, so callr returns both as errors via $read_error_lines().

I've changed this to just use message() rather than stop(), so it doesn't unnecessarily break things

Cole-Monnahan-NOAA commented 6 months ago

OK it now works. I also confirmed that stan_variational ,stan_laplace and stan_optimize work as expected and match the specified model. Thank you so much for the quick turnaround!