Closed spinkney closed 4 years ago
Ok, let me start a branch later today and set everything up and we can work on it. I will give you push permissions.
Had some other stan release matters to deal with yesterday, will get on this today.
So, I have a version that compiles now at https://github.com/rok-cesnovar/math/tree/solve_quadprog Its ultimately wrong most likely as I just brute-forced all vars to doubles, so the gradient w.r.t. to G will be wrong. But its a start.
I will make a cmdstan branch with the added signature for the function tomorrow morning CET time. If you have some (Fake) data we could test this with that would also be great.
Great, thanks! I'm up early EST and I can test with some fake data.
Clone with git clone --branch solve_quadprog https://github.com/stan-dev/cmdstan.git --recursive
One minor typo that caused me some consternation, haha, in solve_quadprog
you have solve_quadpORg.
Oh, sorry :)
@spinkney Oh, nevermind, I forgot to append the correct stan compiler version. Just a second.
If you already cloned, just do a git pull
on cmdstan. No need to update the submodules. Your model should then compile but I am expecting wrong values for at least the G parameter matrix.
If I remove the functions block, assuming that it's in the stan/math folder, I get
scratch that...I did a make clean-all and rebuild. The quadprog_example_two compiles
well, it worked!
As in "doesnt crash" or "actually gives seemingly correct result"? I would be surprised if the latter.
Yes, it doesn't crash but it doesn't give the right result because it has to be an unconstrained variable.
I want a lower=0 constraint on the variable that gets passed to G
as in.
parameters {
real<lower=0> sigma;
vector<lower=0>[K] v_weights;
}
transformed parameters {
row_vector[J] weight;
vector[K] X_weight;
matrix[K, K] G = diag_matrix(2 * v_weights/sum(v_weights) );
X_weight = solve_quadprog(G, g0, CE, ce0, CI, ci0);
}
But I get the following error:
I am not sure I completely follow. You want the constraing on v_weights?
Yes, I didn't have the constraint in the post above--updated. Also do the updates allow g0
to have parameters too?
Yes, I didn't have the constraint in the post above--updated.
Ok, let me see why that would be problem. Do you have any data (or subsets) you can share?
Also do the updates allow
g0
to have parameters too?
Not currently. I am now trying to rework this so that any of the arguments can be parameters. Will push the changes once I finish.
So far I'm not convinced this simulation creates my situation but it gives you an idea
set.seed(12345)
K <- 8
J <- 3
true_v <- c(0.2, 0.5, 0.3, rep(0, K - 3))
true_w <- c(0.7, 0.3, rep(0, J - 2))
X0 <- matrix(rnorm(K * J, 0, sd = 1), J, K)
X1 <- apply(X0, 2, function(x) t(true_w) %*% (x ))
X1 <- true_v * X1 + rnorm(K, 0, 0.001)
Y0 <- rnorm(J, 10, sd = 3)
Y1 <- true_w %*% Y0 + rnorm(1)
stan_data <- list(K = K,
J = J,
X1 = X1,
X0 = X0,
CE = matrix(1, nrow = K, ncol = 1),
ce0 = sum(X1),
CI = cbind(diag(K), -diag(K)),
ci0 = c(rep(0.0, K), -X1))
K = K
J = J
X1 = X1
X0 = X0
Y0 = Y0
Y1 = as.vector(Y1)
CE = matrix(1, nrow = K, ncol = 1)
ce0 = array(sum(X1))
CI = cbind(diag(K), -diag(K))
ci0 = c(rep(0.0, K), -X1)
stan_rdump(c('K', 'J', 'X1', 'X0','Y0', 'Y1', 'CE',
'ce0', 'CI', 'ci0'), file = "quadprog_data.R")
data {
int<lower=1> K; // num predictors
int<lower=0> J;
vector[K] X1;
matrix[J, K] X0;
vector[J] Y0;
real Y1;
matrix[K, 1] CE;
vector[1] ce0;
matrix[K, 2 * K] CI;
vector[2 * K] ci0;
}
transformed data {
vector[K] g0 = -2 * X1;
}
parameters {
real<lower=0> sigma;
vector[K] v_weights;
}
transformed parameters {
row_vector[J] weight;
vector[K] X_weight;
matrix[K, K] G = diag_matrix(2 * v_weights/sum(v_weights) );
X_weight = solve_quadprog(G, g0, CE, ce0, CI, ci0);
}
model {
sigma ~ normal(0.01, 0.05);
X1 ~ normal(X_weight, sigma);
}
Thanks! Will update once I dig through.
You can use this to test to get parameter values working and what not. But it's not going to return the simulated values even if you get that to work. The issue is this program will just set X0W = X1 since it doesn't know anything about X0 yet and there's no constraint on the weights. I'll work on updating that.
Ok, the function now accepts any of the 6 args as a parameter. And I am fairly certain the gradients are now working fine.
I ran the model with
vector<lower=0>[K] v_weights;
and the sampling finished. There were however a lot of divergences:
975 of 1000 (98%) transitions ended with a divergence.
Run git pull
on cmdstan and make stan-update
to update the submodules.
Great, I will try to test this weekend.
I did find the issue with the code and I have it partially rewritten. I've also tested with quadprogpp
in R and I'm able to recover the weights when I set v_weights to a uniform simplex (though this is the simple case that is not needed for Stan).
Will test on the simple case with the stan model and then try the harder case with estimating v_weights. In the rewritten case, both G
and g0
will be parameters.
Thanks for your quick work on this! I must apologize that I may need a few days to test since I have two small kids and this shelter-at-home has made me a part-time preschool teacher and stay-at-home dad.
I completely understand. No rush on my side.
Let me know if it works when you test or if we need to dig a bit deeper. I will add some more tests on the C++ level. Make sure to pull the latest version and update the submodules.
My rewritten model works and the tests I've done all show this to work out.
G
and g0
are parameters and I tested v_weights
as a simplex and positive_ordered vector, both work. It's looking pretty good too
name Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
lp__ -19.00 0.13 2.18 -22.92 -18.66 -16.00 282.54 499.19 1.00
accept_stat__ 0.88 0.00 0.13 0.58 0.92 1.00 791.45 1398.33 1.00
stepsize__ 0.40 nan 0.00 0.40 0.40 0.40 nan nan nan
treedepth__ 3.06 0.01 0.35 3.00 3.00 4.00 850.11 1501.96 1.00
n_leapfrog__ 8.95 0.12 3.53 7.00 7.00 15.00 926.40 1636.76 1.00
divergent__ 0.00 nan 0.00 0.00 0.00 0.00 nan nan nan
energy__ 23.43 0.18 2.92 19.32 23.06 29.03 273.05 482.43 1.00
sigma 0.85 0.02 0.45 0.32 0.75 1.72 836.39 1477.71 1.00
v_weights[1] 0.16 0.01 0.17 0.01 0.11 0.48 1005.77 1776.99 1.00
v_weights[2] 0.35 0.01 0.25 0.06 0.29 0.85 1020.09 1802.28 1.00
v_weights[3] 0.59 0.01 0.34 0.15 0.54 1.25 962.97 1701.36 1.00
v_weights[4] 0.90 0.01 0.43 0.34 0.83 1.71 1159.94 2049.37 1.00
v_weights[5] 1.32 0.02 0.60 0.53 1.23 2.39 1076.27 1901.54 1.01
v_weights[6] 2.09 0.03 0.88 0.95 1.96 3.64 1162.22 2053.39 1.00
v_weights[7] 3.44 0.05 1.48 1.53 3.15 6.36 980.75 1732.78 1.00
v_weights[8] 15.12 0.12 3.86 9.35 14.73 21.70 1116.63 1972.85 1.00
w_optim[1] 0.61 0.00 0.02 0.57 0.62 0.64 597.79 1056.16 1.00
w_optim[2] 0.38 0.00 0.02 0.35 0.38 0.43 645.98 1141.30 1.00
w_optim[3] 0.00 0.00 0.01 0.00 0.00 0.02 943.81 1667.50 1.00
That is great news!
Do you think this is something that can be pushed into Stan for the next release?
That is going to be difficult for two reasons. The largest being the licensing. The quadprog++ code from which this work was derived is GPL licensed and that license (as far as I know) does not play well with the BSD license of cmdstan/stan math.
The other reason is that the solve_quadprog function as it's written in the original and in our modification may be performant but the style is pretty "dangerous", especially due to the goto statements. But this isn't as large of an issue, or at least one we can work around by throwing out the gotos.
The solution for the first issue would be to write the solve_quadprog C++ code from scratch (or derived from BSD licensed R/Python/... code) or use a BSD licensed C++ code. Does any of that exist?
Now that this seems to work ok, I am going to add that to the original example (using external C++ file). That should now also work and we can use this without having to clone a fork of math and use a special version of the compiler. So much more accessible. We can then make a post if you agree.
Cool sounds good. I would love for the work you and Charles have done to be included as well as thinking that this opens up a lot modeling possibilities. Not just the algebra solver stuff that Charles had in mind. Like the possibility of adding generalized method of moments to Stan (something like https://web.hku.hk/~gyin/materials/2009YinBA.pdf) and the nested optimization stuff I'm doing.
What about Eclipse Public License? I found Ipopt C++ code at https://github.com/ampl/coin/tree/master/Ipopt/Ipopt
I don't think the Eclipse license works with maintaining BSD. However, I did find a python library under BSD license that has a number of solvers including quadratic. These could probably be ported to C++ pretty easily.
Nice! That seems promising.
Continuing this from the discussion on the Stan forums at https://discourse.mc-stan.org/t/quadratic-optimizier/4405/47.
We want to merge the
quadprog
code from the test quadprog repo and update thesolve_quadprog
C++ template to work with stan::math::var. It currently uses Eigen::MatrixXd everywhere and we need to make sure those accept vars.