stan-dev / cmdstanr

CmdStanR: the R interface to CmdStan
https://mc-stan.org/cmdstanr/

Ability to convert a draw of parameters into something compatible with the init arg #776

Open syclik opened 1 year ago

syclik commented 1 year ago

Is your feature request related to a problem? Please describe.

I would like to take a draw from MCMC or an optimum and use it as an init in $sample(), $optimize(), or $variational().

In CmdStanPy, I am able to do this. Example (in pseudocode):

from cmdstanpy import CmdStanModel

data = {...}
model = CmdStanModel(stan_file=<path>)
optimum = model.optimize(data=data)

fit = model.sample(data=data,
                   inits=optimum.stan_variables())  # <-- this is what I want to do!

I was not able to do a similar thing with CmdStanR. The init parameter was not compatible with the output of draws (as far as I could see from the documentation and trying a bunch of things).

Describe the solution you'd like

Anything that would make this possible. I think it could be any of these options:

  1. a function to convert a single draw or an optimum into init format
  2. a method on a fit that provides an ability to push through an init
  3. change the function signature on $sample(), $optimize(), and $variational() to accept a draw as an init
  4. change the function signature on $sample(), $optimize(), and $variational() to accept a new argument that accepts a fit

or possibly something else.
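As a rough illustration of option 1, here is a minimal sketch in plain Python of what such a conversion might look like. It assumes the draw arrives as a flat mapping from names such as `theta[2]` or `Sigma[1,2]` to numbers. The function name `draw_to_init` and that input layout are hypothetical, not part of CmdStanR or CmdStanPy, and the sketch assumes every element of each container is present in the draw.

```python
import re


def draw_to_init(flat_draw):
    """Turn {"theta[1]": 1.0, "theta[2]": 2.0} into {"theta": [1.0, 2.0]}.

    Hypothetical helper: assumes 1-based, comma-separated indices in the
    names and that no element of a container is missing.
    """
    grouped = {}
    for name, value in flat_draw.items():
        match = re.fullmatch(r"([^\[\]]+)\[([\d,]+)\]", name)
        if match is None:
            grouped[name] = float(value)  # plain scalar, no index suffix
            continue
        base = match.group(1)
        index = tuple(int(i) for i in match.group(2).split(","))
        grouped.setdefault(base, {})[index] = float(value)

    def build(elements, prefix, dims):
        # Nest one list level per dimension, walking indices 1..size.
        if len(prefix) == len(dims):
            return elements[prefix]
        size = dims[len(prefix)]
        return [build(elements, prefix + (i,), dims) for i in range(1, size + 1)]

    init = {}
    for base, elements in grouped.items():
        if isinstance(elements, dict):
            ndim = len(next(iter(elements)))
            # Infer each dimension's size from the largest index seen.
            dims = tuple(max(idx[d] for idx in elements) for d in range(ndim))
            init[base] = build(elements, (), dims)
        else:
            init[base] = elements
    return init
```

The nested result has one named entry per variable, which is the general shape the thread describes for init values. Which columns to drop before conversion (diagnostics, generated quantities) is left to the caller.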

Describe alternatives you've considered

I can use CmdStanPy for this.

Additional context

This request came up at StanCon 2023 in the "building a GPT" tutorial. I was able to do this in CmdStanPy and didn't realize it was very difficult to do in CmdStanR.

mike-lawrence commented 1 year ago

See here for example R code.

andrjohns commented 1 year ago

@WardBrian would you mind linking to where in the cmdstanpy source this extraction/conversion for initial values happens? I'm not familiar enough with the source to find it myself, sorry!

WardBrian commented 1 year ago

It’s not specific to initial values. The code @syclik posted is just using the general stan_variables function we have on each output class, which returns a dictionary of parameters. In the case of optimization, this happens to be the correct shape for initialization arguments out of the box.

The specific code is here https://github.com/stan-dev/cmdstanpy/blob/1361f3585b8b7398b1556a338e8c4cac5900f0f3/cmdstanpy/stanfit/mle.py#L241 but is pretty tied up with the way cmdstanpy stores and reshapes draws in general

andrjohns commented 1 year ago

Ahh that's really helpful thanks. So the behaviour of passing the results as inits:

inits=optimum.stan_variables()

This is only possible for CmdStanMLE and CmdStanVB objects, where there is only a single value being returned for each parameter. It's not expected to be used for CmdStanMCMC objects, right?

WardBrian commented 1 year ago

Yeah, stan_variables returns different shapes for the different objects

mike-lawrence commented 1 year ago

Is there something unsatisfactory about the implementation I linked?

andrjohns commented 1 year ago

The implementation you linked is using the draws from the last iteration of sampling as the initial values for a new sampling run.

The cmdstanpy implementation is just a consequence of the fact that calling .stan_variables() on the result of optimisation returns the optimised values in a format compatible with the init argument. So it's only for optimisation, not sampling.

I don't think we want to provide a specific method for using the results of sampling as initial values without broader API/implementation discussions

jgabry commented 1 year ago

It's going to become increasingly important to have something like this once pathfinder is released, since one of the advertised use cases of pathfinder is initializing MCMC.

mike-lawrence commented 1 year ago

posterior::as_draws_rvars() seems to have done all the heavy lifting to convert variables to the array-like format that init expects. Could that be a path forward? We'd need a way to convert from an rvar draw to an array, which I'm finding difficult at the moment...

jgabry commented 1 year ago

Yeah I think using rvar as an intermediate step might be a good option. Can you clarify what you mean by converting to an array here? For inits don't we need a list?

mike-lawrence commented 1 year ago

Yes, the init argument to mod$sample() expects a list of lists, where the outer list has one entry per chain, and the inner lists have named elements, one for each parameter for which inits are being supplied. So in theory one could do:

library(dplyr)  # slice_sample(), group_split(), and the %>% pipe
library(purrr)  # map()

(
    fit$draws(format='df')
    # grab a random draw per chain (n must be named; slice_sample() takes by=, not .by=):
    %>% slice_sample(n=1, by=.chain)
    # now we have to make each chain a list
    %>% group_split(.chain)
    # and the format of each chain's entry in the list needs to be a list
    %>% map(
        .f = function(.x){
            (
                .x
                # converting from draws_df to draws_rvars
                %>% posterior::as_draws_rvars()
            )
        }
    )
)

And the resulting list indeed has the correct overall format, but at the leaves the format is draws_rvars, which cmdstanr doesn't know what to do with and yields an error that the value must be numeric.

jgabry commented 1 year ago

Ok yeah I see what you mean

mike-lawrence commented 1 year ago

Oh! I think posterior:::get_variables_from_one_draw() is what we need:

(
    fit$draws(format='df')
    %>% slice_sample(n=1,by=.chain)
    %>% group_split(.chain)
    %>% map(
        .f = function(.x){(
            .x
            %>% posterior::as_draws_rvars()
            %>% posterior:::get_variables_from_one_draw(1)
        )}
    )
)

edit: Yup, that seems to work.

jgabry commented 1 year ago

Awesome! Thanks for figuring that out. To put something like this in cmdstanr we'd need a version that doesn't use dplyr or unexported posterior functions. So we'd need to convert the first part of what you did to not use dplyr and then either export that function from posterior or reimplement it (it's pretty simple so wouldn't really be much code duplication).

mike-lawrence commented 1 year ago

I think it'd actually make sense to leave the first bits to the user, since I can imagine different scenarios requiring different approaches to selecting draws (when using draws as inits, sample randomly; when resuming from where we left off, grab the last draw; etc.). And maybe the easiest solution to using get_variables_from_one_draw is to have posterior export it?
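To make the two selection scenarios concrete, here is a small sketch in plain Python. The function name `pick_draw_per_chain`, its `strategy` argument, and the input layout (one list of draws per chain, each draw a dict) are all assumptions made for illustration and do not correspond to anything in cmdstanr or posterior.

```python
import random


def pick_draw_per_chain(draws_by_chain, strategy="random", seed=None):
    """Pick one draw from each chain.

    draws_by_chain: list with one entry per chain; each entry is a list of
    draws in iteration order (hypothetical layout for this sketch).
    strategy: "random" for fresh inits, "last" to resume where a run ended.
    """
    rng = random.Random(seed)
    picked = []
    for chain_draws in draws_by_chain:
        if strategy == "last":
            picked.append(chain_draws[-1])
        elif strategy == "random":
            picked.append(rng.choice(chain_draws))
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
    return picked
```

Keeping selection separate from conversion means a user can swap in any other rule (for example, the draw with the highest log density) without touching the code that reshapes a draw into init format.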

jgabry commented 1 year ago

You're right, there are definitely different approaches that could be used depending on the situation. And yeah I assume it's not a problem for posterior to export that function.

jgabry commented 1 year ago

Based on https://github.com/stan-dev/cmdstanpy/issues/684 it looks like CmdStanPy is going to have something specific for pathfinder:

For supporting the initialization of MCMC, I think it is important to also have a method like

create_inits(self, seed = None, num_chains = 4)

This will return a list of dictionaries of 4 random draws from the pathfinder outputs in the format required for the inits argument to other methods. We can also consider adding CmdStanPathfinder as an option for the inits argument directly, in which case this method would just be called.
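The quoted proposal gives only a signature and a description, so the following is a guess at the behaviour rather than CmdStanPy's implementation. It is written as a free function over a list of draws that are already in init shape (one dict per draw). Sampling without replacement and the extra `draws` argument are assumptions of this sketch.

```python
import random


def create_inits(draws, seed=None, num_chains=4):
    """Return num_chains randomly chosen draws as a list of dicts.

    draws: list of dicts, each already shaped like one chain's inits
    (an assumption; the proposed method would read these from the fit).
    """
    if num_chains > len(draws):
        raise ValueError("need at least as many draws as chains")
    rng = random.Random(seed)
    # Sample without replacement so no two chains share a starting point.
    return [dict(draw) for draw in rng.sample(draws, num_chains)]
```

Passing a fixed `seed` makes the chosen starting points reproducible across runs.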

WardBrian commented 1 year ago

I think that it would probably make sense if cmdstanpy eventually had a create_inits for all the different inference methods. As was pointed out in the original issue, for some things like optimization this is equivalent to the existing stan_variables() method, but if there are multiple draws in the object we need to index through them.

martinmodrak commented 1 year ago

Note that the problem of distinguishing arrays of size 1 and scalars is probably relevant (https://github.com/stan-dev/posterior/issues/187). This however might be easier to resolve in cmdstanr, as here one can actually query the model to make that distinction...
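A short sketch of why querying the model helps: once the declared dimensions are known, a single extracted value can be emitted either as a bare number or as a one-element list. The function name `apply_declared_dims` and the `dims` tuple are hypothetical stand-ins for whatever dimension information the model exposes, and the sketch covers only scalars and one-dimensional containers.

```python
def apply_declared_dims(value, dims):
    """Shape a value according to declared dimensions.

    value: a number or a flat list of numbers taken from one draw.
    dims: tuple of declared sizes, () for a scalar (hypothetical metadata).
    """
    flat = list(value) if isinstance(value, (list, tuple)) else [value]
    if len(dims) == 0:
        if len(flat) != 1:
            raise ValueError("scalar expected, got several values")
        return flat[0]  # scalar: emit the bare number
    if len(dims) == 1:
        if len(flat) != dims[0]:
            raise ValueError("length does not match declared size")
        return flat  # keep the list, even when it has one element
    raise NotImplementedError("sketch handles scalars and 1-D containers only")
```

With this, the same extracted value `[1.5]` becomes `1.5` for a variable declared as a scalar and stays `[1.5]` for one declared with size 1.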

The solution we implemented for the SBC package seems almost identical to the one in posterior (https://github.com/hyunjimoon/SBC/blob/8af581c79fbd52098051a537611cdcce4ea48a72/R/datasets.R#L266).

avehtari commented 1 year ago

The Birthdays example shows how to format a CmdStanR draw as an init with a one-liner (https://avehtari.github.io/casestudies/Birthdays/birthdays.html). This could be used as the basis for a function in CmdStanR.