Open mboedigh opened 7 years ago
Sorry I'm posting here, but I did see others post requests for help here; let me know if that is bad form. I'm trying to learn Mamba, starting from my experience with Stan. To get started I used the linear regression model at the start of the tutorial and made minimal modifications toward a more general model; the modified model is shown in the snippets below. The results show estimates for beta of about [-0.5, 0.5], which is not so good given the least-squares answer of [0.0733763, 0.997099] (see below). Can anyone suggest what is going wrong? I used a very similar model in Stan and it gave a reasonable answer.
Seems like NUTS is getting stuck. Here is a reproducible script:
using Mamba
srand(123) ## set seed for reproducibility
sigma_w = 1;
N = 100;
x = (1:N);
line = Dict{Symbol, Any}(
  :x => 1:N,
  :y => randn(N) * sigma_w + x
)
line[:xmat] = [ones(length(line[:y]), 1) line[:x]]  ## design matrix: intercept and slope columns
model = Model(
  y = Stochastic(1, (mu, sigma_w) -> MvNormal(mu, sigma_w), false),
  mu = Logical(1, (xmat, beta) -> xmat * beta, false),
  beta = Stochastic(1, (xmat) -> MvNormal(fill(0.0, size(xmat, 2)), 5)),
  sigma_w = Stochastic(() -> InverseGamma(0.001, 0.001))
)
scheme = [NUTS([:beta, :sigma_w])]
setsamplers!(model, scheme);
## Initial Values
inits = [
  Dict{Symbol, Any}(
    :y => line[:y],
    :beta => rand(Normal(0, 1), 2),
    :sigma_w => rand(Gamma(1, 1))
  )
  for i in 1:3
]
nchains = length(inits)
sim_nuts = mcmc(model, line, inits, 10000, burnin=2500, thin=1, chains=nchains)
describe(sim_nuts)
If we look at the MCMC values, we see:
julia> sim_nuts.value[1:10,:,:]
10×3×3 Array{Float64,3}:
[:, :, 1] =
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
1.18304 0.376264 -0.405272
[:, :, 2] =
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
0.305749 1.33585 1.60076
[:, :, 3] =
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
0.610806 -1.45789 0.800589
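Within each chain every retained draw is identical, so the sampler never moves from its starting point. A quick way to confirm this programmatically (just a sketch, assuming the sim_nuts object produced by the script above):
for c in 1:size(sim_nuts.value, 3)
    draws = [vec(sim_nuts.value[i, :, c]) for i in 1:size(sim_nuts.value, 1)]
    println("chain ", c, ": ", length(unique(draws)), " distinct draws")  ## a stuck chain reports 1
end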
I am guessing you were getting [-0.5, 0.5] because you were initializing beta to start at [0, 0].
Will look into the code. However, I am pretty sure this is mostly due to the nature of the example.
julia> X = line[:xmat]; y=line[:y];
julia> det(X'X)
8.3325e6
We see that the determinant of X'X is very large, suggesting this problem is ill-conditioned.
Brian, thanks for the reply. I don't know Julia well, but the problem I'm trying to set up is not ill-conditioned: line[:xmat] has full column rank (rank 2). It solves well using:
julia> line[:xmat]\line[:y]
2-element Array{Float64,1}:
 0.0733763
 0.997099
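A fuller check of conditioning (a quick sketch, with X = line[:xmat] as above) looks at the rank and the condition number rather than the determinant, since the determinant grows with the scale of the data:
X = line[:xmat]
println(rank(X))  ## 2, i.e. full column rank
println(cond(X))  ## ratio of largest to smallest singular value; the scale-invariant measure of conditioning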
I think the problem is not the data. I can see that NUTS is getting stuck, but I don't know why. Stan solves this with its version of NUTS. If it is helpful, I can show you the Stan version I used.
I like the way Mamba is designed, and believe I will enjoy working with it regularly. Thanks.
Best regards, Mike
Changing initial values to something closer to the truth results in correct output:
## Initial Values
inits = [
  Dict{Symbol, Any}(:y => line[:y], :beta => [0.0, 1.0], :sigma_w => 1.0),
  Dict{Symbol, Any}(:y => line[:y], :beta => line[:xmat]\line[:y], :sigma_w => 1.0)
]
nchains = length(inits)
sim_nuts = mcmc(model, line, inits, 10000, burnin=2500, thin=1, chains=nchains)
describe(sim_nuts)
julia> describe(sim_nuts)
Iterations = 2501:10000
Thinning interval = 1
Chains = 1,2
Samples per chain = 7500
Empirical Posterior Estimates:
               Mean           SD       Naive SE           MCSE       ESS
sigma_w  1.14096298 0.0820408647 0.000669860855 0.001462307980 3147.6234
beta[1] -0.09545879 0.2268879348 0.001852532230 0.005461201581 1726.0222
beta[2]  1.00263843 0.0039072747 0.000031902764 0.000084420935 2142.1386
Quantiles:
               2.5%       25.0%       50.0%       75.0%      97.5%
sigma_w  0.99623051  1.08391122  1.13599791 1.191574016 1.31777059
beta[1] -0.54788610 -0.24284341 -0.09442113 0.057914062 0.34095642
beta[2]  0.99508585  0.99999733  1.00263428 1.005231688 1.01037139
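More generally, a middle ground between fully random starts and hand-picked values is to jitter the least-squares solution. This is only a sketch, reusing the line dictionary from the script above:
beta_hat = line[:xmat] \ line[:y]
inits = [
  Dict{Symbol, Any}(
    :y => line[:y],
    :beta => beta_hat + 0.1 * randn(length(beta_hat)),  ## dispersed but sane starting points
    :sigma_w => 1.0
  )
  for i in 1:2
]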
After some digging, it looks like the NUTS epsilon value is not getting set correctly in some cases. I will investigate further.
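For context, NUTS implementations typically initialize the step size with the "find reasonable epsilon" heuristic of Hoffman and Gelman (2014): double or halve epsilon until the acceptance probability of a single leapfrog step crosses 1/2. The following is a minimal sketch of that heuristic, not Mamba's actual code; logf (the log target density) and grad (its gradient) are assumed to be supplied by the caller:
## One leapfrog step of the simulated Hamiltonian dynamics
function leapfrog(theta, r, eps, grad)
    r = r + 0.5 * eps * grad(theta)
    theta = theta + eps * r
    r = r + 0.5 * eps * grad(theta)
    return theta, r
end

## Heuristic initial step size (Hoffman & Gelman 2014, Algorithm 4)
function find_epsilon(theta, logf, grad)
    eps = 1.0
    r = randn(length(theta))
    logp0 = logf(theta) - 0.5 * dot(r, r)
    theta1, r1 = leapfrog(theta, r, eps, grad)
    logp1 = logf(theta1) - 0.5 * dot(r1, r1)
    ## double (a = 1) or halve (a = -1) until the acceptance ratio crosses 1/2
    a = logp1 - logp0 > log(0.5) ? 1.0 : -1.0
    while a * (logp1 - logp0) > -a * log(2.0)
        eps *= 2.0^a
        theta1, r1 = leapfrog(theta, r, eps, grad)
        logp1 = logf(theta1) - 0.5 * dot(r1, r1)
    end
    return eps
end
If the very first leapfrog step already lands where logf is -Inf or NaN, the comparisons above misbehave, which is one way a heuristic like this can settle on a degenerate epsilon for starting values far from the posterior mass.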
Thanks, that makes sense. Let me know if you figure out the potential issue with epsilon initialization during warmup.
Best wishes, Mike