cwliu007 closed this issue 10 years ago.
Mamba's results for the Rasch model (as defined in the Wikipedia page and the ltm
R package documentation) are consistent with those from JAGS. See comparisons below.
RES = readdlm("res.txt")
people, itemnum = size(RES)
## A data dictionary
data = (Symbol => Any)[
:RES => RES,
:itemnum => itemnum,
:point => 2,
:C => 1,
:z => [0;1],
:people => people]
chains = 3
## Initial values: a vector with one dictionary per chain
inits = Dict{Symbol,Any}[
[:RES => data[:RES],
:theta => zeros(data[:people]),
:alpha => abs(randn()),
:delta => rand(data[:itemnum])]
for i in 1:chains]
## Construct model
model = Model(

  RES = Stochastic(2,
    @modelexpr(theta, delta, people, itemnum,
      Distribution[
        begin
          p = invlogit(theta[j] - delta[i])
          Categorical([1 - p, p])
        end
        for j = 1:people, i = 1:itemnum
      ]
    ),
    false
  ),

  delta = Stochastic(1,
    :(Normal(0, 10))
  ),

  theta = Stochastic(1,
    :(Normal(0, 1)),
    false
  )

)
## Choose samplers
scheme1 = [Slice([:delta], fill(1.0, data[:itemnum])),
           Slice([:theta], fill(1.0, data[:people]))]

## Set samplers
setsamplers!(model, scheme1)

## Start sampling
sim1 = mcmc(model, data, inits, 11000, burnin=1000, thin=2, chains=chains)
describe(sim1)
Iterations = 1002:11000
Thinning interval = 2
Chains = 1,2,3
Samples per chain = 5000
Empirical Posterior Estimates:
11x6 Array{Any,2}:
"" "Mean" "SD" "Naive SE" "MCSE" "ESS"
"delta[1]" -2.36876 0.501095 0.00409142 0.0136901 4482.91
"delta[2]" -1.5289 0.396163 0.00323465 0.00953147 5090.49
"delta[3]" -1.14127 0.377432 0.00308172 0.00880372 5250.71
"delta[4]" -0.700101 0.352429 0.00287757 0.00845189 5106.98
"delta[5]" 0.104506 0.341061 0.00278476 0.00797456 5238.07
"delta[6]" 0.300763 0.341886 0.00279149 0.00779137 5374.19
"delta[7]" 1.02646 0.362721 0.00296161 0.00913744 4861.76
"delta[8]" 1.39177 0.390375 0.0031874 0.00881855 5421.64
"delta[9]" 1.39241 0.38263 0.00312416 0.00787714 5949.16
"delta[10]" 2.39328 0.481939 0.00393502 0.0120364 4903.89
Quantiles:
11x6 Array{Any,2}:
"" "2.5%" "25.0%" "50.0%" "75.0%" "97.5%"
"delta[1]" -3.43836 -2.68509 -2.34409 -2.02045 -1.46896
"delta[2]" -2.32807 -1.78598 -1.5213 -1.25728 -0.764239
"delta[3]" -1.87813 -1.38866 -1.13533 -0.886678 -0.415053
"delta[4]" -1.39712 -0.936303 -0.696955 -0.459057 -0.0275823
"delta[5]" -0.571883 -0.123607 0.103981 0.335361 0.771964
"delta[6]" -0.376678 0.0732802 0.300406 0.531212 0.97529
"delta[7]" 0.328957 0.778934 1.02131 1.26588 1.75122
"delta[8]" 0.636813 1.1259 1.39047 1.6503 2.16883
"delta[9]" 0.631384 1.13732 1.39059 1.64732 2.14126
"delta[10]" 1.51803 2.06071 2.36765 2.70424 3.3984
For comparison, the equivalent JAGS model:

model {
  for (i in 1:50) {
    for (j in 1:10) {
      p[i,j] <- ilogit(theta[i] - delta[j])
      prob[i,j,1] <- 1 - p[i,j]
      prob[i,j,2] <- p[i,j]
      RES[i,j] ~ dcat(prob[i,j,])
    }
  }
  for (j in 1:10) {
    delta[j] ~ dnorm(0, 0.01)
  }
  for (i in 1:50) {
    theta[i] ~ dnorm(0, 1)
  }
}
Iterations = 1002:11000
Thinning interval = 2
Number of chains = 3
Sample size per chain = 5000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
delta[1] -2.3767 0.4894 0.003996 0.004602
delta[2] -1.5285 0.4006 0.003271 0.004170
delta[3] -1.1399 0.3778 0.003085 0.004041
delta[4] -0.6913 0.3512 0.002868 0.003916
delta[5] 0.1003 0.3386 0.002765 0.003660
delta[6] 0.2960 0.3452 0.002818 0.003755
delta[7] 1.0354 0.3651 0.002981 0.003878
delta[8] 1.4017 0.3936 0.003214 0.004124
delta[9] 1.3975 0.3896 0.003181 0.004135
delta[10] 2.3853 0.4939 0.004033 0.004640
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
delta[1] -3.3922 -2.69367 -2.35982 -2.0384 -1.47578
delta[2] -2.3403 -1.78990 -1.52023 -1.2567 -0.77073
delta[3] -1.9022 -1.39020 -1.13663 -0.8798 -0.42476
delta[4] -1.3860 -0.92679 -0.68959 -0.4532 -0.01466
delta[5] -0.5769 -0.12618 0.09974 0.3289 0.75977
delta[6] -0.3799 0.06659 0.29358 0.5271 0.97410
delta[7] 0.3252 0.78756 1.02985 1.2800 1.76700
delta[8] 0.6604 1.13250 1.39128 1.6619 2.18829
delta[9] 0.6526 1.13394 1.38438 1.6529 2.18805
delta[10] 1.4653 2.04785 2.36419 2.7055 3.39592
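The agreement between the two fits can be checked directly. The following sketch (not part of the original scripts) copies the posterior means for delta from the Mamba and JAGS summaries above and confirms they differ by less than 0.01, well within Monte Carlo error:

```julia
# Posterior means for delta, copied from the Mamba and JAGS summaries above.
mamba_means = [-2.36876, -1.5289, -1.14127, -0.700101, 0.104506,
               0.300763, 1.02646, 1.39177, 1.39241, 2.39328]
jags_means  = [-2.3767, -1.5285, -1.1399, -0.6913, 0.1003,
               0.2960, 1.0354, 1.4017, 1.3975, 2.3853]

# Largest absolute difference between the two sets of estimates
maxdiff = maximum(abs.(mamba_means .- jags_means))
println(maxdiff)  # less than 0.01
```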
Note that `alpha`, `theta`, and `delta` are not uniquely identifiable in the likelihood for this parametrization; i.e., given any set of values for those parameters, `c * alpha`, `theta / c`, and `delta / c`, where `c` is any scalar, give the same likelihood value. Identifiability comes from the priors. Thus, depending on the choice of priors, posterior estimates may or may not be consistent with the values used to simulate the data. The composite probability `p` is identifiable in the likelihood.
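This invariance is easy to verify numerically. The sketch below (current Julia syntax; not part of the original model code) evaluates the response probabilities under (alpha, theta, delta) and under the rescaled (c * alpha, theta / c, delta / c), and shows they coincide:

```julia
# Rescaling alpha by c while dividing theta and delta by c leaves the
# success probabilities, and hence the likelihood, unchanged.
invlogit(x) = 1 / (1 + exp(-x))

alpha, c = 1.2, 4.0
theta = [-0.3, 0.8, 1.5]
delta = [-1.0, 0.4]

p1 = [invlogit(alpha * (t - d)) for t in theta, d in delta]
p2 = [invlogit((c * alpha) * (t / c - d / c)) for t in theta, d in delta]

maximum(abs.(p1 .- p2))  # zero up to floating-point rounding
```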
## Construct model
model = Model(

  RES = Stochastic(2,
    @modelexpr(alpha, theta, delta, people, itemnum,
      Distribution[
        begin
          p = invlogit(alpha * (theta[j] - delta[i]))
          Categorical([1 - p, p])
        end
        for j = 1:people, i = 1:itemnum
      ]
    ),
    false
  ),

  alpha = Stochastic(
    :(Uniform(0, 10))
  ),

  delta = Stochastic(1,
    :(Normal(0, 10))
  ),

  theta = Stochastic(1,
    :(Normal(0, 1)),
    false
  )

)
## Choose samplers
scheme1 = [Slice([:alpha], [1.0]),
           Slice([:delta], fill(1.0, data[:itemnum])),
           Slice([:theta], fill(1.0, data[:people]))]
Iterations = 1002:11000
Thinning interval = 2
Chains = 1,2,3
Samples per chain = 5000
Empirical Posterior Estimates:
12x6 Array{Any,2}:
"" "Mean" "SD" "Naive SE" "MCSE" "ESS"
"delta[1]" -10.071 4.72646 0.0385914 0.202627 2856.84
"delta[2]" -6.8526 3.75743 0.0306793 0.177871 2587.21
"delta[3]" -5.03347 2.87318 0.0234594 0.113126 3110.6
"delta[4]" -3.05824 2.18216 0.0178172 0.0963987 2772.43
"delta[5]" 0.376126 1.64503 0.0134316 0.057892 3480.17
"delta[6]" 1.42644 1.83966 0.0150207 0.0706917 3187.24
"delta[7]" 4.50767 2.61753 0.021372 0.116421 2753.63
"delta[8]" 6.10583 3.18698 0.0260216 0.140568 2776.75
"delta[9]" 5.98729 3.24141 0.026466 0.146045 2718.27
"delta[10]" 10.265 4.66644 0.0381014 0.220896 2587.28
"alpha" 0.245565 0.153358 0.00125216 0.00814715 2305.4
Quantiles:
12x6 Array{Any,2}:
"" "2.5%" "25.0%" "50.0%" "75.0%" "97.5%"
"delta[1]" -20.9111 -12.9792 -9.51878 -6.42177 -3.06046
"delta[2]" -16.2543 -8.91476 -6.27715 -3.98606 -1.7269
"delta[3]" -11.8642 -6.75847 -4.47639 -2.82186 -1.08357
"delta[4]" -8.5313 -4.26672 -2.62158 -1.45426 0.00998723
"delta[5]" -3.0639 -0.521997 0.328202 1.26401 3.91983
"delta[6]" -1.74056 0.273591 1.1458 2.45209 5.58317
"delta[7]" 0.784502 2.44709 4.10545 6.11024 10.6612
"delta[8]" 1.49777 3.50621 5.69972 8.22152 13.1567
"delta[9]" 1.57993 3.41186 5.50294 7.82984 14.0761
"delta[10]" 3.11491 6.49361 10.0802 13.5985 19.4249
"alpha" 0.0928291 0.139705 0.187951 0.300781 0.665054
The equivalent JAGS model for the extended parametrization:

model {
  for (i in 1:50) {
    for (j in 1:10) {
      p[i,j] <- ilogit(alpha * (theta[i] - delta[j]))
      prob[i,j,1] <- 1 - p[i,j]
      prob[i,j,2] <- p[i,j]
      RES[i,j] ~ dcat(prob[i,j,])
    }
  }
  alpha ~ dunif(0, 10)
  for (j in 1:10) {
    delta[j] ~ dnorm(0, 0.01)
  }
  for (i in 1:50) {
    theta[i] ~ dnorm(0, 1)
  }
}
Iterations = 1002:11000
Thinning interval = 2
Number of chains = 3
Sample size per chain = 5000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
alpha 0.2293 0.1663 0.001358 0.01333
delta[1] -11.2953 5.1934 0.042404 0.26320
delta[2] -7.3468 3.7278 0.030438 0.17535
delta[3] -5.5002 3.0548 0.024942 0.12367
delta[4] -3.3735 2.3533 0.019215 0.07378
delta[5] 0.4524 1.8484 0.015092 0.01710
delta[6] 1.4039 1.9274 0.015737 0.03252
delta[7] 4.9500 2.8462 0.023239 0.11640
delta[8] 6.6778 3.4938 0.028527 0.15737
delta[9] 6.6618 3.4673 0.028310 0.14971
delta[10] 11.2441 5.1340 0.041919 0.25536
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
alpha 0.08628 0.1272 0.1678 0.2547 0.739847
delta[1] -22.26484 -14.7251 -11.0654 -7.3997 -2.825435
delta[2] -15.68969 -9.6366 -7.0346 -4.4935 -1.686619
delta[3] -12.48723 -7.3535 -5.0872 -3.1171 -1.100239
delta[4] -8.94049 -4.7154 -2.9868 -1.6091 -0.002983
delta[5] -3.27275 -0.5441 0.3552 1.4224 4.498693
delta[6] -1.99642 0.2072 1.1306 2.4310 5.919822
delta[7] 0.87206 2.7748 4.5434 6.6327 11.409588
delta[8] 1.51239 3.9504 6.2858 8.8746 14.374147
delta[9] 1.46459 4.0173 6.3293 8.8118 14.295844
delta[10] 2.79915 7.3708 11.0674 14.5969 22.028411
As for the interfaces: differences between Mamba and JAGS are touched on in the Introduction section of the documentation. JAGS provides a DSL (domain-specific language) for model specification; Mamba uses native Julia code. Advantages of a DSL include more concise syntax and easier automatic sampler selection. Disadvantages include considerably more work to implement and extend, and less flexible model specification.
Thanks for your clear code example, Brian. It helps me understand your package better :+1:
No problem. Best of luck with your project.
Hi Brian,
1) Following the lsat example, I wrote a Rasch model (related to the model the lsat example uses, but not the same) in the Mamba script at the end of this post. After a series of trials and errors (explained below), I got it to run. However, the results are not right. I know they are not right because I simulated the data set and estimated the parameters with the R package `ltm', whose results are very close to the "true" parameters I set up; the results were also confirmed with RStan. I am not sure whether I wrote the Mamba script incorrectly or whether this is an issue in Mamba itself, so I thought I should report it here. I also tried other samplers and longer runs, but that did not seem to help. See the introduction to the Rasch model in case it is needed: http://en.wikipedia.org/wiki/Rasch_model. The true parameters for delta are evenly spaced from -2 to 2, like linspace(-2, 2, 10) in Julia, and the alphas are all 1.

2) Inside the script, the d = Logical(1, ...) part is not necessary and is preserved just for testing. However, I found that the results for delta do not average to 0, and I have no idea why. Another question (not about delta): can the same constraint be applied to a matrix, for each column separately?

3) On first use, I wrote delta simply as d and used d in both the Stochastic and the Logical parts. Mamba did not accept this and returned an unknown error. Perhaps the naming rules should be stated clearly in the manual.

Here are my impressions of Mamba:

4) The Model part is not very straightforward to write on first use. For example, I was confused about why we can use theta inside the theta = Stochastic(...) part. Is it because of quote ... end? Metaprogramming (which I am not familiar with)? What does writing it this way mean, and what is the logic? Maybe I am too used to the JAGS style (just write down the likelihood function in the model block!). Overall, I personally find it not very friendly to use, but sadly I am not an expert here and cannot offer concrete suggestions; this is just a quick first experience with Mamba.

Script below! Please download the data set from http://www.fileswap.com/dl/xlYHvEv6Fr/. Sorry to bother you here; I could not find your email.
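For reference, a data set matching this description (delta evenly spaced on [-2, 2], all alphas equal to 1, responses coded 1/2 to match the dcat/Categorical convention) could be simulated with something like the following sketch. This uses current Julia syntax and is a hypothetical reconstruction, not the poster's actual simulation script:

```julia
using Random
Random.seed!(1)

invlogit(x) = 1 / (1 + exp(-x))

people, itemnum = 50, 10
delta = collect(range(-2, 2, length=itemnum))  # linspace(-2, 2, 10) in Julia 0.3
theta = randn(people)

# RES[i, j] = 2 for a correct response, 1 otherwise,
# matching the Categorical([1 - p, p]) coding in the model.
RES = [rand() < invlogit(theta[i] - delta[j]) ? 2 : 1
       for i in 1:people, j in 1:itemnum]

size(RES)  # (50, 10)
```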
Oh! The model below is an extension of the Rasch model: ignoring the alpha parameter, it is equivalent to the Rasch model. I just forgot to delete alpha here, but all of the parameters should be estimated.