goedman opened this issue 5 years ago
Will look into it today.
I think it is an issue with AD producing NaNs or similar. Ideally one would investigate, but I added a quick fix in the form of a wrapper and opened a PR in the repo.
Please review and let me know if this works.
Also, you can investigate corner cases of log densities with `LogDensityProblems.stresstest`.
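For readers following along, here is a minimal sketch of how one might use it. The toy problem and its names are made up for illustration; the `stresstest` call mirrors the one used later in this thread:

```julia
# Minimal sketch, assuming the LogDensityProblems/TransformVariables API of
# that era. `ToyProblem` is hypothetical: its log density returns NaN outside
# a ball, so stresstest should find those points.
using LogDensityProblems, TransformVariables

struct ToyProblem end

# log density callable; NaN outside ‖x‖² ≤ 4 simulates a numerical failure
(problem::ToyProblem)(θ) = sum(abs2, θ.x) > 4 ? NaN : -sum(abs2, θ.x) / 2

t = as((x = as(Array, 2),))                 # identity transform, 2 parameters
P = TransformedLogDensity(t, ToyProblem())

# returns the vector of locations where the log density is invalid
failures = LogDensityProblems.stresstest(P, N = 1000, scale = 1.0)
```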
As I mentioned in the SR pull request comment, I will try several more models and compare the results. I wonder if rejecting the cases where AD produces NaNs might bias the resulting densities. Maybe stress testing will help out here.
If that's ok, I'll try some more models before closing this issue.
That's fine, keep it open as long as you like and feel free to ask if there is anything I can help with.
Tamas, in trying to dig a bit deeper into the posterior draws from m10.4d.jl, I have tried to apply stresstest:
```julia
LogDensityProblems.stresstest(P, N=1000, scale=1.0)
0-element Array{Array{Float64,1},1}
```
P in this case uses:
```julia
P = TransformedLogDensity(problem_transformation(p), p)
#∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
∇P = ADgradient(:ForwardDiff, P);
```
Interestingly, m12.6d.jl is spot on with either ∇P option (and unaffected by the change), which I interpret as some support for my earlier hypothesis that rejecting errors might introduce a bias.
Am I correct to assume that stresstest, if applied correctly, would show the parameter values where NaNs occur? I was hoping to be able to plot those regions as part of the debugging process.
You should try stresstest with the ∇P (plain vanilla `ADgradient`, no `LogDensityRejectErrors`). It should indeed return the vector of locations which are not OK (e.g. which evaluate to NaN). Plotting is a good strategy.
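A sketch of that workflow, for concreteness (assumes the `P` defined earlier in the thread and at least two parameters; the plotting calls are illustrative):

```julia
# Sketch: find and plot the locations where the (gradient of the) log
# density is invalid, using the plain AD gradient without the rejection
# wrapper.
using LogDensityProblems, Plots

∇P = ADgradient(:ForwardDiff, P)            # plain vanilla, no LogDensityRejectErrors
failures = LogDensityProblems.stresstest(∇P, N = 1000, scale = 1.0)

# scatter the first two coordinates of the offending locations
scatter(getindex.(failures, 1), getindex.(failures, 2);
        xlabel = "θ[1]", ylabel = "θ[2]", legend = false,
        title = "locations with invalid log density / gradient")
```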
As you also suggest, I'd already commented out the LogDensityRejectErrors line and replaced it with the vanilla ADgradient:
```julia
P = TransformedLogDensity(problem_transformation(p), p)
#∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
∇P = ADgradient(:ForwardDiff, P);
```
and it fails on the `NUTS_init_tune_mcmc(∇P, 1000)` line.
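A hedged sketch of how one could trap that failure for inspection instead of letting the script abort (the exception name is taken from the stresstest error further down in this thread; treat its exact qualified name as an assumption):

```julia
# Sketch, DynamicHMC 1.x era: catch the invalid-log-density error at the
# sampling call and inspect the offending value (NaN duals in this case).
using DynamicHMC, LogDensityProblems

try
    chain, nuts = NUTS_init_tune_mcmc(∇P, 1000)
catch e
    e isa LogDensityProblems.InvalidLogDensityException || rethrow()
    @show e    # shows where/how the log density became invalid
end
```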
I'll experiment a bit more.
Tamas, I noticed (and tested) your updates to DynamicHMC and LogDensityProblems. Can I ask for a little push in getting stresstest() to work, to help resolve the only model that still gives different results in DynamicHMC vs. Turing and CmdStan?
In m10.4d1.jl (results illustrated in m10.4d1) I've tried to use stresstest() and different AD methods.
If I enable `do_stresstest` it will show:

```text
julia> include("/Users/rob/.julia/dev/DynamicHMCModels/scripts/10/m10.04d1.jl")
MCMC, adapting ϵ (75 steps)
ERROR: Error while loading expression starting at /Users/rob/.julia/dev/DynamicHMCModels/scripts/10/m10.04d1.jl:87
caused by [exception 1]
InvalidLogDensityException: value is Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##3#4")){TransformedLogDensity{TransformVariables.TransformNamedTuple{(:β, :α),Tuple{TransformVariables.ArrayTransform{TransformVariables.Identity,1},TransformVariables.ArrayTransform{TransformVariables.Identity,1}}},m_10_04d_model{Array{Int64,1},Array{Int64,2},Array{Int64,1}}}},Float64}}(NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN)
```
Using LogDensityRejectErrors provides very stable results, and the plots of the chains look reasonable (I've also tried 3 and 4 chains). It is true that the pooled estimate of a[2] is a bit higher in scale, but it is surprising that this is the only model where I've seen a problem.
In particular the sd values are very high; in Turing, using NUTS, that is usually an indication that the adaptation draws are included, but I don't think the posterior includes those in DynamicHMC.
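For reference, a common alternative to rejecting errors at the gradient stage is to guard the log density itself so it returns -Inf, which the sampler treats as a zero-probability point, instead of letting NaN propagate. A hypothetical sketch, not the fix from the PR above:

```julia
# Hypothetical guard around the model callable: map non-finite log
# densities to -Inf so the sampler rejects those points cleanly.
struct GuardedLogDensity{P}
    problem::P
end

function (g::GuardedLogDensity)(θ)
    ℓ = g.problem(θ)
    isfinite(ℓ) ? ℓ : oftype(ℓ, -Inf)
end

# usage, reusing the names from the snippets above:
# P = TransformedLogDensity(problem_transformation(p), GuardedLogDensity(p))
```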
If I understand correctly, there are two issues:
1. the script errors,
2. the results are different.
Did these happen after the recent (minor) updates to LogDensityProblems and DynamicHMC, or did you have them before?
Correct. W.r.t. issue 2), normally (for all other models) I use `LogDensityRejectErrors(ADgradient(:ForwardDiff, P))` and the results are spot on.
W.r.t. issue 1), I have experimented with stresstest only for the m10.4d1.jl model, by setting `do_stresstest = true`. For the last few weeks it has failed on the ADgradient() call with the script error above.
Are you using the latest LogDensityProblems and DynamicHMC?
I could not reproduce the errors above.
Also, I am not sure how I would detect the discrepancy; should I look at the plots?
In the script the results from Turing and CmdStan are included:
```julia
rethinking = "
      mean   sd  5.5% 94.5% n_eff Rhat
a[1] -0.74 0.27 -1.19 -0.31  2899    1
a[2] 10.77 5.20  4.60 20.45  1916    1
a[3] -1.05 0.28 -1.50 -0.62  3146    1
a[4] -1.05 0.28 -1.50 -0.61  3525    1
a[5] -0.73 0.28 -1.17 -0.28  3637    1
a[6]  0.22 0.27 -0.21  0.67  3496    1
a[7]  1.82 0.41  1.21  2.50  3202    1
bp    0.83 0.27  0.42  1.27  2070    1
bpC  -0.13 0.31 -0.62  0.34  3430    1
";
```
I tried this morning with:
```toml
[[DynamicHMC]]
deps = ["ArgCheck", "DataStructures", "DocStringExtensions", "LinearAlgebra", "LogDensityProblems", "Parameters", "Random", "Statistics", "StatsFuns", "Test", "TransformVariables"]
git-tree-sha1 = "f9c9e6e703afccb02a1179e4886c97134ff20b55"
uuid = "bbc10e6e-7c05-544b-b16e-64fede858acb"
version = "1.0.4"

...

[[LogDensityProblems]]
deps = ["ArgCheck", "BenchmarkTools", "DiffResults", "DocStringExtensions", "Parameters", "Pkg", "Random", "Requires", "Test", "TransformVariables"]
git-tree-sha1 = "1ce5ebd16b5026a35cbf4daec66b2433b83d0d23"
uuid = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c"
version = "0.8.0"
```
in Manifest.toml.
The results I'm getting:
```text
Log evidence      = 0.0
Iterations        = 1:3000
Thinning interval = 1
Chains            = 1
Samples per chain = 3000
parameters        = bp, bpC

Empirical Posterior Estimates
───────────────────────────────────────────
parameters
        Mean      SD    Naive SE   MCSE       ESS
bp    1.5337   3.6815    0.0672   0.0886  1728.4017
bpC   0.4149   0.2460    0.0045   0.0038  3000.0000

Quantiles
───────────────────────────────────────────
parameters
        2.5%     25.0%    50.0%    75.0%    97.5%
bp   -10.7577  -1.0319   1.4902   4.0067  16.9960
bpC   -0.4675   0.2435   0.4137   0.5796   1.3324

Log evidence      = 0.0
Iterations        = 1:3000
Thinning interval = 1
Chains            = 1
Samples per chain = 3000
pooled            = a[1], a[2], a[3], a[4], a[5], a[6], a[7]

Empirical Posterior Estimates
─────────────────────────────────────────────
pooled
          Mean      SD    Naive SE   MCSE       ESS
a[1]   -1.9824   3.6843    0.0673   0.0887  1727.1929
a[2]    9.8785   6.0428    0.1103   0.1973   937.6438
a[3]   -2.2879   3.6866    0.0673   0.0899  1681.8172
a[4]   -2.2809   3.6940    0.0674   0.0893  1712.3698
a[5]   -1.9804   3.6927    0.0674   0.0896  1698.3607
a[6]   -1.0489   3.6885    0.0673   0.0885  1736.5395
a[7]    0.5142   3.6860    0.0673   0.0908  1647.0961

Quantiles
─────────────────────────────────────────────
pooled
          2.5%     25.0%    50.0%    75.0%    97.5%
a[1]  -17.1055  -4.4397  -1.9079   0.5685  10.6230
a[2]   -7.1898   5.7305   9.3678  13.3436  34.4317
a[3]  -17.9862  -4.7485  -2.2723   0.2305  10.1737
a[4]  -17.6815  -4.7411  -2.2326   0.2594  10.6519
a[5]  -17.6976  -4.3925  -1.9362   0.5755   9.9957
a[6]  -16.6376  -3.5379  -1.0127   1.5214  11.4556
a[7]  -14.9577  -1.9840   0.5571   3.0386  12.6625
```
or, just printing the means of the posterior vector (from another run):
```text
2-element Array{Array{Float64,1},1}:
 [1.3940051607581527, 0.4210166348245524]
 [-1.844585401458457, 10.05388176750645, -2.1519882289247705, -2.14490281190503, -1.8458711437273487, -0.9082955910437809, 0.6615688096440819]
```
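For context, a hedged sketch of how means like these are obtained in such scripts with DynamicHMC 1.x (the `problem_transformation`, `p`, and `chain` names follow the snippets above; `get_position` is assumed to be the DynamicHMC 1.x accessor for the raw draws):

```julia
# Sketch, DynamicHMC 1.x era: back-transform the draws to (β, α) named
# tuples and average them componentwise.
using DynamicHMC, TransformVariables, Statistics

posterior = TransformVariables.transform.(Ref(problem_transformation(p)),
                                          get_position.(chain))

mean_β = mean(draw -> draw.β, posterior)   # 2-element mean vector
mean_α = mean(draw -> draw.α, posterior)   # 7-element mean vector
```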
Tamas, when you say "I could not reproduce the errors above.", does that path (with `do_stresstest = true`) work on your system? Or, if you use the LogDensityRejectErrors path, do you see estimates like those in my previous post?
Apologies for not getting back to you about this yet, this is a busy time for me.
I really appreciate this test case. Various other users have reported subtle (or not-so-subtle) maladaptation and bias issues. I have a couple of guesses about this:
To investigate 3 and 4, I will be working on tpapp/DynamicHMC.jl#30. I ask for your patience in the meantime. The test cases are greatly appreciated.
Thank you Tamas. Absolutely no apologies needed! I simply can't think of anyone contributing as much to the Julia community as you do!
Any time I notice an update to one of the various packages I will continue to try it. For now I will always compare the results to at least Turing and Stan. When I started the work on StatisticalRethinking I had a time frame of about 2 years in mind, so from that point of view I'm not even halfway.
Tamas, I was wondering if you would mind taking a look at m10.04d.jl?
Until chapter 10 in StatisticalRethinking.jl I've had no trouble adding the DynamicHMC versions alongside the Turing and CmdStan versions, e.g. m10.04t.jl and m10.04s.jl for model 10.4.
I followed the approach I'd used successfully for m10.02d1.jl, which works fine, but in this case I get an error. Different starting values, e.g.

```julia
θ = (β = [1.0, 0.0], α = [-1.0, 10.0, -1.0, -1.0, -1.0, 0.0, 2.0])
```

do not solve the issue, so I wonder if this way of modeling such problems is correct. Thanks for your help (if doable!), Rob
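For readers without the repo at hand, here is a hedged reconstruction of the kind of callable these scripts define for model 10.4. The field names, priors, and likelihood form are assumptions, loosely inferred from the `m_10_04d_model{Array{Int64,1},Array{Int64,2},Array{Int64,1}}` type shown in the stresstest error above; it is a sketch, not the repo's exact code:

```julia
# Hedged sketch of a model-10.4-style log density in the DynamicHMCModels
# style; field names, priors, and likelihood are assumptions.
using Distributions, Parameters, StatsFuns

struct m_10_04d_model{TY <: AbstractVector, TX <: AbstractMatrix, TA <: AbstractVector}
    y::TY       # pulled_left (0/1)
    X::TX       # columns assumed: prosoc_left, prosoc_left * condition
    actor::TA   # actor index, 1..7
end

function (problem::m_10_04d_model)(θ)
    @unpack y, X, actor = problem
    @unpack β, α = θ
    # weakly informative priors (assumed)
    ll = sum(logpdf.(Normal(0, 10), β)) + sum(logpdf.(Normal(0, 10), α))
    # Bernoulli likelihood with a logit link, one intercept per actor
    for i in eachindex(y)
        prob = logistic(α[actor[i]] + X[i, 1] * β[1] + X[i, 2] * β[2])
        ll += y[i] == 1 ? log(prob) : log1p(-prob)
    end
    ll
end
```

Note that the rethinking results quoted above already show a[2] with a large sd, so a weakly identified, heavy-tailed a[2] is expected for this model regardless of the sampler.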