epiforecasts / EpiNow2

Estimate Realtime Case Counts and Time-varying Epidemiological Parameters
https://epiforecasts.io/EpiNow2/dev/

Move Stan options to `stan_args` and process pre-fit model(s) #131

Closed hsbadr closed 3 years ago

hsbadr commented 3 years ago

It would be cleaner to move all Stan options (including method and model) to the list of stan_args:

stan_args = list(
  backend = "rstan", # Supported options: "rstan" (default) or "cmdstan" (PR #128)
  method  = "NUTS",  # Supported options: "NUTS"  (default) or "ADVI" (Variational Inference)
  model   =  NULL,   # Supported options:  NULL   (default) or a compiled Stan model
  fit     =  NULL,   # Supported options:  NULL   (default) or a `stanfit` object or a list of `stanfit` objects
  ...
)

The fit option can be used to replace the time-consuming internal model fitting with a pre-fit stanfit object or a list of stanfit objects (e.g., running multiple chains separately, each with a specific chain_id, to be merged internally using rstan::sflist2stanfit()). This would allow reusing fit(s) from EpiNow2 outputs, for example to increase the number of chains with the same configuration.
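A minimal sketch of how such a fit option might be resolved internally. The helper name resolve_fit() is hypothetical and not part of EpiNow2; the only rstan call used is rstan::sflist2stanfit(), which is reached only when a list of stanfit objects is supplied.

```r
# Hypothetical helper (not part of EpiNow2): resolve a user-supplied `fit`.
resolve_fit <- function(fit = NULL) {
  if (is.null(fit)) {
    return(NULL) # nothing supplied: the caller fits the model as usual
  }
  if (inherits(fit, "stanfit")) {
    return(fit) # a single pre-fit object is used as-is
  }
  is_fit_list <- is.list(fit) && length(fit) > 0 &&
    all(vapply(fit, inherits, logical(1), what = "stanfit"))
  if (!is_fit_list) {
    stop("`fit` must be NULL, a stanfit object, or a non-empty list of stanfit objects")
  }
  # Merge the separately run chains into a single stanfit object
  rstan::sflist2stanfit(fit)
}
```

A NULL result would signal that the model still needs to be fitted; anything else could be passed straight to the post-processing step.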

seabbs commented 3 years ago

Agree. Thought about this and now very much on board. I am strongly in favour of passing in a fit or a list of fit objects as well. At some point we are interested in initialising a new run from an old run etc., and this could potentially be a good approach.

The only caveat is that we need to make sure this is all well documented, as it's important that users find these options.

hsbadr commented 3 years ago

I'd use algorithm = c("sampling", "meanfield", "fullrank") instead of method, where sampling refers to the NUTS/MCMC algorithm, and meanfield and fullrank are the two ADVI algorithms: meanfield is variational inference with independent distributions (a fully factorized approximation), and fullrank is variational inference with a multivariate distribution (an approximation with a full-rank covariance matrix). So, the stan_args list would be:

stan_args   = list(
  algorithm = "sampling",  # Supported options: "sampling"  (default) for NUTS/MCMC, or "meanfield" / "fullrank" for Variational Inference
  backend   = "rstan",     # Supported options: "rstan"     (default) or "cmdstan" (PR #128)
  fit       =  NULL,       # Supported options:  NULL       (default) or a `stanfit` object or a list of `stanfit` objects
  model     =  NULL,       # Supported options:  NULL       (default) or a compiled Stan model
  ...
)
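As a rough sketch of how the algorithm option could map onto the rstan backend: "sampling" would go to rstan::sampling() and the two ADVI variants to rstan::vb(). The function name fit_with_algorithm() is hypothetical and this is not EpiNow2's implementation.

```r
# Hypothetical dispatcher (not EpiNow2's implementation).
# `model` is assumed to be a compiled stanmodel, `data` a named list.
fit_with_algorithm <- function(model, data,
                               algorithm = c("sampling", "meanfield", "fullrank"),
                               ...) {
  # Rejects anything outside the three supported values
  algorithm <- match.arg(algorithm)
  if (algorithm == "sampling") {
    # NUTS/MCMC
    rstan::sampling(model, data = data, ...)
  } else {
    # ADVI with either a mean-field or a full-rank approximation
    rstan::vb(model, data = data, algorithm = algorithm, ...)
  }
}
```

Because match.arg() runs first, an unsupported value such as "ADVI" fails early, before any model is touched.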

The optional/unsupported Stan arguments should be nullified before calling the fitting function.
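One possible way to do that, sketched in base R. The helper clean_stan_args() is hypothetical, and the set of arguments treated as sampling-only (chains, warmup, control, cores) is an assumption made for illustration rather than a list taken from EpiNow2.

```r
# Hypothetical helper (not part of EpiNow2): drop arguments that the
# chosen fitting function should not receive.
clean_stan_args <- function(stan_args = list()) {
  requested <- stan_args[["algorithm"]]
  algorithm <- match.arg(
    if (is.null(requested)) "sampling" else requested,
    c("sampling", "meanfield", "fullrank")
  )
  stan_args[["algorithm"]] <- algorithm
  # Options consumed by the wrapper itself, never forwarded to Stan
  wrapper_only <- c("backend", "fit", "model")
  # Assumption: these sampler settings do not apply to variational inference
  sampling_only <- c("chains", "warmup", "control", "cores")
  drop <- if (algorithm == "sampling") {
    # "sampling" is this wrapper's own label, so it is not forwarded either
    c(wrapper_only, "algorithm")
  } else {
    c(wrapper_only, sampling_only)
  }
  stan_args[!names(stan_args) %in% drop]
}
```

For example, clean_stan_args(list(algorithm = "meanfield", backend = "rstan", chains = 4, iter = 1000)) would keep only algorithm and iter.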

seabbs commented 3 years ago

Thanks @hsbadr, agree on all I think, though perhaps fit should continue to be its own argument and be logical (I am not sure why you would want a list of stanfits).

hsbadr commented 3 years ago

@seabbs I found it much more efficient to fit each chain separately for each region and run all chains for all regions in parallel. So, what I suggest here is to allow the user to provide a list of stanfits for multiple chains to be merged in one fit using rstan::sflist2stanfit() and processed by EpiNow2 instead of fitting a new model.
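A sketch of the per-chain workflow described here, assuming the rstan backend and an already compiled model. The helper fit_chains_separately() is hypothetical; it reuses one seed with a distinct chain_id per chain, which is my understanding of the usual rstan pattern for chains run in separate calls.

```r
# Hypothetical helper (not part of EpiNow2): fit each chain in its own call.
# `model` is assumed to be a compiled stanmodel, `data` a named list.
fit_chains_separately <- function(model, data, chains = 4, seed = 1234, ...) {
  lapply(seq_len(chains), function(id) {
    # Same seed, different chain_id, one chain per call
    rstan::sampling(model, data = data, chains = 1,
                    chain_id = id, seed = seed, ...)
  })
}
```

The returned list could then be merged with rstan::sflist2stanfit(). lapply() is used only to keep the sketch simple; a parallel map over chains and regions would take its place in practice.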

seabbs commented 3 years ago

Got it, sounds good.

seabbs commented 3 years ago

@hsbadr I'm going to close this as all Stan args have been moved, but will open a new issue for handling preprocessed fits.