Prototype for set_inits()

venpopov commented 5 months ago

As discussed in #1645, here is a prototype for a set_inits() function modeled after set_prior()

Here's a functioning example, setting inits on Intercept and b_sigma:

library(brms)

formula <- bf(count ~ Trt * Age,
              sigma ~ Trt * Age)

inits <- set_inits('normal(0, 1)', class = "Intercept", dpar = "mu") +
  set_inits('uniform(-0.1, 0.1)', class = "b", dpar = "sigma")

inits
#>         distribution     class coef group  dpar nlpar
#> 1       normal(0, 1) Intercept                       
#> 2 uniform(-0.1, 0.1)         b            sigma

fit <- brm(formula, epilepsy, init = inits, cores = 2, chains = 2, refresh = 0, backend = 'cmdstanr')
#> Start sampling
#> Init values were only set for a subset of parameters. 
#> Missing init values for the following parameters:
#>  - chain 1: b, Intercept_sigma
#>  - chain 2: b, Intercept_sigma
#> 
#> To disable this message use options(cmdstanr_warn_inits = FALSE).
#> Running MCMC with 2 parallel chains...
#> 
#> Chain 2 finished in 1.5 seconds.
#> Chain 1 finished in 1.5 seconds.
#> 
#> Both chains finished successfully.
#> Mean chain execution time: 1.5 seconds.
#> Total execution time: 1.6 seconds.

str(fit$stan_args$init)
#> List of 2
#>  $ :List of 2
#>   ..$ Intercept: num 0.717
#>   ..$ b_sigma  : num [1:3(1d)] -0.0496 0.0666 -0.0425
#>  $ :List of 2
#>   ..$ Intercept: num -0.784
#>   ..$ b_sigma  : num [1:3(1d)] -0.029 0.0712 0.077

^{Created on 2024-04-11 with reprex v2.1.0}

Implementation

set_inits() takes arguments following the set_prior() syntax and creates a data.frame of class brmsinits
outputs of set_inits() can be added together just like for set_prior()
in brm() there is now the following piece of new code:

    # generate inits
    if (is.brmsinits(init)) {
      init <- replicate(
        chains,
        .inits_fun(init, bterms = bterms, data = data, sdata = sdata),
        simplify = FALSE
      )
    }

which calls .inits_fun() internal function to generate a list of inits from the specification of the brmsinits object

.inits_fun() takes as argument the brmsinits object, the bterms object, the data and the standata
it calls an internal function par_info() which generates lists of type:

list(b_type = "vector", b_dim_name = "Kc_sigma", b_par = "b_sigma")

by using some of the code from stan_predictor

it then parses the distributions specified in brmsinits object, and generates random values with the appropriate dimensions and returns a named list that can then be passed to Stan

Limitations and generalization

This currently only works for the Intercept and b type parameters for all population parameters and families. It does not work for random effects sd and for the z vector. This is where I would appreciate your help to generalize this (the current code uses a par_info_fe() function to generate the information for the fixed effects. Similar functions would be needed for the random effects and other special parameters

Tests

I added tests for all the internal helper functions, and also a test for brm with set_inits() in the tests/local folder

Additional features

Would be nice to have

[ ] an exported function staninits, which takes the formula, data and set_inits and generates the inits list
[ ] fix some inits to a constant (perhaps with the same constant(x) syntax as for the prior
[ ] an exported function default_inits, similar to default_prior, which shows the structure of how to specify inits for the valid parameters. the default values themselves are nto very informative - uniform(-2,2) on the unconstained scale, but the point is to see which parameters you can specify inits for

GidonFrischkorn commented 5 months ago

This is a great addition to brms! I wanted to add that some users might want to set lb and ub similar to the priors. So, adding this as an option and also making the set_inits consistent with set_prior would be great.

But, I also saw that @venpopov included a uniform distribution as an init distribution. So this might already be one option to include lower and upper bounds. So maybe, this could be added in a vignette to tell users how to set bounds using the set_inits function.

paul-buerkner commented 5 months ago

Thank you a lot @venpopov! Currently, I am a bit overwhelmed with things so it may take a while until I get back to this issue. But I highly appreciate your work and I am confident it will provide a good basis for making this feature work generally for any kinds of brms models.

venpopov commented 5 months ago

I understand. Should I keep working on this if I want to add things, or would you prefer me to leave it as is until you get a chance to look over it?

paul-buerkner commented 5 months ago

I don't know yet. let me think about it

Ven Popov @.***> schrieb am Sa., 13. Apr. 2024, 13:32:

I understand. Should I keep working on this if I want to add things, or would you prefer me to leave it as is until you get a chance to look over it?

— Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/pull/1646#issuecomment-2053619971, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCW2AEFYORGS6RYVZ42ONTY5EJVTAVCNFSM6AAAAABGCROVFCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANJTGYYTSOJXGE . You are receiving this because you commented.Message ID: @.***>

paul-buerkner commented 5 months ago

I like the general outline of the functionality. Especially the user interface is very pretty I think.

As for the par_info. Perhaps we can abstract the functionality of mapping to the Stan parameters to be reusable for both priors (stan code generation) and setting inits. If we just copy the relevant code, I worry this will become hard to maintain with new terms being potentially added in the future. I will think about it a bit more.

venpopov commented 5 months ago

Glad you like the general design! I totally agree we should abstract/refactor common functionality - to make it easier to maintain, less error prone, allow for easier future extension. I didn't do it in the prototype because I wanted to make something simple that works without messing around and potentially breaking the existing code. But we are on the same page about the need for abstraction. I just saw your email, will respond there in a bit

venpopov commented 5 months ago

A brief summary of what we discussed.

par_info:

par_info should return an object that we can pass to the functions that generate the parameters block of the stan code
for now impelement it for fixed and random effects and can temporarily break the code for other parameter types to see if it works to pass the object and use the info from it to reduce code duplication
return a data.frame with a column that specifies the stan block information should go in. For now just for parameters, but eventually for all variables defined in stan (transformed parameters, generated quantiles, etc)
eventually this will be stored in a brmsdata object

quality control:

need to add a check whether the provided set_inits information matches parameters, give an error otherwise
need to double check whether rstan and cmdstanr use the inits on the contrained space

UI:

set_inits should accept different types of first argument: characters for common distributions (like now), distributional objects, functions that take a single argument of N and output a vector of N values, constants or numeric vectors or list of vectors
an exported function like default_inits or inits_summary or inits_info that gives information about the structure just like default_prior

venpopov commented 5 months ago

Just a quick update from my side - unfortunately I didn't find the time to work on this and I'll be on vacation for 2 weeks. I'll get back to it when I come back

paul-buerkner / brms