Allow specification of inverse mass matrix (Feature Request)

aaronjg commented 7 years ago

Summary:

@mitzimorris recently added the specification of the initial inverse matrix (https://github.com/stan-dev/stan/pull/2260). It would be nice to have this feature in rstan.

Description:

It could be added by adding an additional argument (init_inv_metric) to the sampling command. When metric is "diag_e" then init_inv_metric is a vector, or list of vectors, one for each chain, with length equal to the number of unconstrained params. When metric is "dense_e" then init_inv_metric is a semi-positive definite matrix of the appropriate size. When metric is "unit_e" and init_inv_metric is specified, then an error is thrown.

Alternatively, for "diag_e" the inverse metric could be specified on a per parameter basis, similar to how "inits" is used to set initial values.

I could take a stab at implementing this if the proposal seems reasonable, and no one else is already working on it.

Current Version:

v2.16.2

bgoodri commented 7 years ago

I think this is something that we are going to have to decide how the interfaces collectively want to go about implementing this.

On Sat, Jul 15, 2017 at 2:10 AM, aaronjg notifications@github.com wrote:

Summary:

@mitzimorris https://github.com/mitzimorris recently added the specification of the initial inverse matrix (stan-dev/stan#2260 https://github.com/stan-dev/stan/pull/2260). It would be nice to have this feature in rstan. Description:

It could be added by adding an additional argument (init_inv_metric) to the sampling command. When metric is "diag_e" then init_inv_metric is a vector, or list of vectors, one for each chain, with length equal to the number of unconstrained params. When metric is "dense_e" then init_inv_metric is a semi-positive definite matrix of the appropriate size. When metric is "unit_e" and init_inv_metric is specified, then an error is thrown.

Alternatively, for "diag_e" the inverse metric could be specified on a per parameter basis, similar to how "inits" is used to set initial values.

I could take a stab at implementing this if the proposal seems reasonable, and no one else is already working on it.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/rstan/issues/433, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqkoeEXiYLY_k9gy-DxSDLrK1ojhsks5sOFe5gaJpZM4OY7h7 .

bob-carpenter commented 7 years ago

Now that it's implemented at the C++ level and has consistent behavior through the commands in stan-dev/stan/services, I think the interfaces can have at it. I think this is similar to the include mechanism. We'll have a do-over come Stan 3, I think, though I'd still like to keep backward compatibility to the extent possible.

The bigger issue in my mind is how to round-trip the output back to the input easily so that you can checkpoint and restart. No good way to restart with same RNG, though, as the state of the RNG is longer than the seed. So we'll need some methdology on top of this.

There should also be an implementation in CmdStan and PyStan.

mitzimorris commented 7 years ago

regarding round-trip, at present, the sampler outputs the mass matrix as a comment along with the report of the step size and all sampler settings.

the following is the entire sampler output file produced by running the Stan model stan/src/test/test-models/good/mcmc/hmc/common/guass3D.stan using the command stan interface:

# stan_version_major = 2
# stan_version_minor = 15
# stan_version_patch = 0
# model = gauss3D_model
# method = sample (Default)
#   sample
#     num_samples = 10
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.050000000000000003 (Default)
#       delta = 0.80000000000000004 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#     algorithm = hmc (Default)
#       hmc
#         engine = nuts (Default)
#           nuts
#             max_depth = 10 (Default)
#         metric = diag_e (Default)
#         stepsize = 1 (Default)
#         stepsize_jitter = 0 (Default)
# id = 0 (Default)
# data
#   file =  (Default)
# init = 2 (Default)
# random
#   seed = 0
# output
#   file = output.csv (Default)
#   diagnostic_file =  (Default)
#   refresh = 100 (Default)
lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,x.1,x.2,x.3
# Adaptation terminated
# Step size = 0.903198
# Diagonal elements of inverse mass matrix:
# 0.87276, 0.838523, 0.978681
-6.10519,0.636153,0.903198,2,3,0,9.10688,-2.25447,-2.3002,-1.35529
-1.48807,0.981407,0.903198,2,7,0,5.86434,0.803299,1.35695,0.699681
-0.963159,1,0.903198,2,3,0,1.70728,-0.783413,-1.09175,-0.347375
-1.00045,0.747388,0.903198,2,7,0,3.95563,1.15219,0.379974,-0.727308
-0.0487822,0.9825,0.903198,1,3,0,1.07166,0.217499,0.0915981,-0.204618
-0.461112,0.93939,0.903198,2,3,0,0.585091,0.952014,0.0434435,-0.11835
-0.252746,0.990471,0.903198,2,3,0,0.746786,-0.597999,0.112933,0.367608
-0.605157,0.858803,0.903198,2,3,0,1.68352,0.983719,0.486868,0.0746432
-7.11475,0.310952,0.903198,2,3,0,10.1118,3.07083,-1.56653,-1.5315
-0.0446853,1,0.903198,2,3,0,6.24669,0.291777,-0.0232992,-0.0607754
# 
#  Elapsed Time: 0.153384 seconds (Warm-up)
#                0.001156 seconds (Sampling)
#                0.15454 seconds (Total)
#

aaronjg commented 7 years ago

I started to take a stab at getting this to work, but ended up running into issues at the stan var_context interface, where it seems like the data is not being passed in or checked, since even a mass matrix of incorrect dimension does not throw an error.

Current very half finished work is here https://github.com/aaronjg/rstan/tree/inverse_metric

If someone with more rstan experience can give me some pointers, I'll pick this up again.

Here is the test code I was using:

schools_dat <- list(J = 8L, 
                    y = c(28,  8, -3,  7, -1,  1, 18, 12),
                    sigma = c(15, 10, 16, 11,  9, 11, 10, 18))
out.data <- schools_dat

detach('package:rstan'); library(rstan)
ctrl <- list(init_inv_metric=list(inv_metric=rep(1,9)))
fit <- stan(file = '8schools.stan', data = schools_dat, init=list(list(theta=1)),
            iter = 1000, chains = 1,control=ctrl)

syclik commented 7 years ago

@aaronjg, can you point out exactly where you're having trouble? I implemented something simple for @avehtari in CmdStan: https://github.com/stan-dev/cmdstan/compare/feature/mass-matrix-only#diff-325e1556bfe9173b00e1cd1fd8c77f9eR184

mitzimorris commented 7 years ago

the unit tests for this feature show how to use the var_context interface to create the mass_matrix. e.g.: https://github.com/stan-dev/stan/blob/develop/src/test/unit/services/sample/hmc_nuts_dense_inv_metric_test.cpp

aaronjg commented 7 years ago

@syclik Unfortunately, I'm not sure where the problem is. There is a lot of stuff with the var_context that is new to me. It seem like the data isn't getting passed in properly since the sampler just runs and outputs all 0s (I'm guessing the metric got set to 0 somehow). I've tried attaching a debugger, and it looked like the var_context wasn't getting passed in properly from R, but I'm not sure...

I'll take a look at the unit tests - I was just looking at the diag metric now, so it seems like I just need to pass a vector in.

aaronjg commented 7 years ago

@syclik I suspect I'm doing something wrong in the stan_fit.hpp here https://github.com/aaronjg/rstan/commit/f2979053ec79e9b766a866b841d33fd47ab1ccd2

But I was following the logic how inits were passed in, so not sure what is going wrong...

maverickg commented 7 years ago

I could try to find time to address this. @aaronjg can you do me a favor by providing an example using cmdstan?

aaronjg commented 7 years ago

@maverickg I don't think this is implemented in CMDStan yet https://github.com/stan-dev/cmdstan/issues/563

though @syclik has started working on it https://github.com/stan-dev/cmdstan/compare/feature/mass-matrix-only#diff-325e1556bfe9173b00e1cd1fd8c77f9eR184

ahartikainen commented 5 years ago

Hi, any progress on this?

I implemented this for PyStan (see https://github.com/stan-dev/pystan/pull/611 )

I also updated the stepsize behavior, so it could be defined for each chain.

SteveBronder commented 4 years ago

Making a little bump for this!

bbbales2 commented 4 years ago

You can do this in cmdstanr with the inv_metric argument to the sample function (https://mc-stan.org/cmdstanr/reference/model-method-sample.html)

stan-dev / rstan