Closed lizzieinvancouver closed 3 years ago
@FaithJones @legault
priors and diagnostics look okay, but another eye on the priors would be great!
sigma_interceptsbf ~ normal(0, 1); // model estimate: 0.48
lam_interceptsbf ~ beta(2, 2); // model estimate: 0.60
b_zf ~ normal(0, 5); // model estimate: -5.15
sigma_interceptsbc ~ normal(0, 1); // model estimate: 1.2
lam_interceptsbc ~ beta(2, 2); // model estimate: 0.31
b_zc ~ normal(0, 5); // model estimate: -5.95
sigma_interceptsbp ~ normal(0, 1); // model estimate: 0.21
lam_interceptsbp ~ beta(2, 2); // model estimate: 0.44
b_zp ~ normal(0, 5); // model estimate: -1.43
sigma_interceptsa ~ normal(0, 1); // model estimate: 1.5
lam_interceptsa ~ beta(2, 2); // model estimate: 0.35
a_z ~ normal(20, 10); // model estimate: 28.7
sigma_y ~ normal(0, 10); // model estimate: 13
check_all_diagnostics(fit)
[1] "n_eff / iter looks reasonable for all parameters"
[1] "Rhat looks reasonable for all parameters"
[1] "32 of 4000 iterations ended with a divergence (0.8%)"
[1] " Try running with larger adapt_delta to remove the divergences"
[1] "0 of 4000 iterations saturated the maximum tree depth of 10 (0%)"
[1] "E-BFMI indicated no pathological behavior"
The pairs plot still doesn't show anything! Any ideas on what to look at would help. https://github.com/lizzieinvancouver/pmm/blob/master/analyses/figures/ospreepairs.pdf
Also, do either of you have nice code to make pairs plots? This is how I did it and it felt clunky:
pdf(file="figures/ospreepairs.pdf")
pairs(~sigma_y + lam_interceptsa + sigma_interceptsa + lam_interceptsbf + sigma_interceptsbf +
lam_interceptsbc + sigma_interceptsbc + lam_interceptsbp + sigma_interceptsbp +
b_zf + b_zc + b_zp + a_z + lp__,
data=rbind(nondiv_params[pars], div_params[pars]), pch=21, cex=0.8, bg=colz)
dev.off()
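One possibly less clunky route is bayesplot, which flags divergent transitions for you so there is no manual rbind of divergent and non-divergent draws. A sketch, assuming `fit` is the rstan stanfit object (mcmc_pairs() and nuts_params() are bayesplot functions; I haven't run this on this exact model):

```r
library(rstan)
library(bayesplot)

# Divergent iterations are highlighted automatically via np = nuts_params(fit)
pdf(file = "figures/ospreepairs.pdf")
mcmc_pairs(as.array(fit),
           pars = c("sigma_y", "lam_interceptsa", "sigma_interceptsa",
                    "lam_interceptsbf", "sigma_interceptsbf",
                    "lam_interceptsbc", "sigma_interceptsbc",
                    "lam_interceptsbp", "sigma_interceptsbp",
                    "b_zf", "b_zc", "b_zp", "a_z", "lp__"),
           np = nuts_params(fit))
dev.off()
```

With 14 parameters the panel grid gets huge, so it may be worth subsetting pars to the lambda/sigma pairs that are most suspect.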
Here's the model estimates @legault @FaithJones @MoralesCastilla
> sumfit[which(rownames(sumfit) %in% pars),]
mean se_mean sd 2.5% 25% 50% 75% 97.5%
sigma_y 1.309226e+01 0.003220159 0.19341720 1.271869e+01 1.295783e+01 1.309134e+01 1.322328e+01 1.347630e+01
lam_interceptsa 3.491009e-01 0.002017819 0.09399209 1.741771e-01 2.841903e-01 3.469415e-01 4.118906e-01 5.392201e-01
sigma_interceptsa 1.465400e+00 0.002187387 0.10660082 1.274079e+00 1.391966e+00 1.458352e+00 1.534572e+00 1.691753e+00
lam_interceptsbf 6.006408e-01 0.009704927 0.17595572 2.346146e-01 4.797236e-01 6.129569e-01 7.285988e-01 9.134766e-01
sigma_interceptsbf 4.833773e-01 0.005585958 0.08372441 3.406178e-01 4.234401e-01 4.783556e-01 5.373665e-01 6.642449e-01
lam_interceptsbc 3.138255e-01 0.001848241 0.09067556 1.445583e-01 2.479237e-01 3.126518e-01 3.756852e-01 4.960912e-01
sigma_interceptsbc 1.219035e+00 0.002587599 0.10155465 1.034409e+00 1.149473e+00 1.213177e+00 1.283561e+00 1.432985e+00
lam_interceptsbp 4.370213e-01 0.013425549 0.19968870 8.809602e-02 2.827428e-01 4.245305e-01 5.801684e-01 8.407862e-01
sigma_interceptsbp 2.076898e-01 0.002191782 0.03551768 1.434399e-01 1.828181e-01 2.056370e-01 2.302719e-01 2.816963e-01
b_zf -5.159007e+00 0.055477603 1.91888987 -8.838852e+00 -6.426638e+00 -5.194980e+00 -3.944072e+00 -1.240598e+00
b_zc -5.949112e+00 0.045622676 2.81093627 -1.123390e+01 -7.915563e+00 -5.960093e+00 -4.088093e+00 -2.636374e-01
b_zp -1.429105e+00 0.024939643 0.80183839 -3.010519e+00 -1.943406e+00 -1.438278e+00 -9.083322e-01 1.486301e-01
a_z 2.873508e+01 0.056776305 3.70046927 2.132495e+01 2.644504e+01 2.876046e+01 3.126842e+01 3.589489e+01
lp__ -1.102446e+04 3.555151310 47.05369124 -1.111389e+04 -1.105717e+04 -1.102583e+04 -1.099220e+04 -1.093278e+04
n_eff Rhat
sigma_y 3607.7426 1.0007873
lam_interceptsa 2169.7919 1.0034337
sigma_interceptsa 2375.0349 0.9997258
lam_interceptsbf 328.7171 1.0082229
sigma_interceptsbf 224.6513 1.0249223
lam_interceptsbc 2406.9295 0.9999574
sigma_interceptsbc 1540.3018 1.0005693
lam_interceptsbp 221.2294 1.0089693
sigma_interceptsbp 262.5998 1.0106994
b_zf 1196.3678 1.0045307
b_zc 3796.1249 1.0003183
b_zp 1033.6969 1.0024223
a_z 4247.9496 0.9993499
lp__ 175.1746 1.0185487
Ideas are:
Thought I'd stick my nose in here (thanks for this link @lizzieinvancouver !) - this is a doozy, isn't it?... My thoughts, in an almost stream-of-consciousness fashion, are:
...so, from this, maybe two ideas:
control=list(stepsize=0.5))
or something). Don't run a very very long chain with this, but keep the adapt_delta close to 1 still. Compare with other shorter chains to see if that helps.
@willpearse Thanks!
I got 26 divergences from
iter=6000, warmup=5000, chains=4, cores=4, control=list(stepsize=0.5, adapt_delta=0.99))
and 25 divergences from
iter=4000, warmup=3000, chains=4, cores=4, control=list(stepsize=0.5, adapt_delta=0.99))
and 151 from
iter=4000, warmup=3000, chains=4, cores=4, control=list(stepsize=0.5, adapt_delta=0.9))
Just to check, I am running one with control=list(stepsize=0.3, adapt_delta=0.99)
and control=list(stepsize=0.5, adapt_delta=0.999).
@legault and I discussed but we think maybe lambda is already transformed under the Stan hood, no? Or maybe I just need more info on how you see the transformation working.
We also tried to think more on how to formulate the model (should the lambdas all be independent for example?). I think it's fine and discussed a little with Jonathan who generally agrees ...
@legault Some notes on how correlated traits evolve in phylo stuff. Basically, you can evolve them together and set the correlation between them... see sim.char
in geiger package (Continuous character -- multivariate example)
See also ... see 'Now, let's simulate the evolution of two traits on the tree under three different correlation coefficients'
And from Harmon: "To simulate n continuous characters, the program requires an nxn evolutionary variance–covariance matrix. This matrix describes the expected variances and covariances among characters per unit time; characters evolve under a multivariate Brownian motion model described by this matrix. ..."
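For concreteness, a minimal sketch of that geiger approach (sim.char with a 2x2 evolutionary variance-covariance matrix; the tree and rate values here are made up for illustration):

```r
library(geiger)

set.seed(1)
phy <- sim.bdtree(b = 1, d = 0, stop = "taxa", n = 50)  # toy tree
# Evolutionary VCV: variances on the diagonal, covariance off-diagonal
rates <- matrix(c(1.0, 0.7,
                  0.7, 1.0), nrow = 2)
# Multivariate Brownian motion: both characters evolve together on the tree
traits <- sim.char(phy, par = rates, nsim = 1, model = "BM")
# traits is a 50 x 2 x 1 array; the two characters should come out correlated
cor(traits[, 1, 1], traits[, 2, 1])
```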
Jonathan didn't think these types of correlated analyses of empirical data were super common; here's one I came across while looking for other stuff
I would agree with the correlated models not being very common (although they really should be more common in my opinion!). Off-hand I can't think how to set up a direct analogue of the above simulation for the estimation. I know you can do it in PCMBase or mvMORPH, for example, but it is a bit trickier to do than the univariate case in geiger.
I believe the current 3 slope model (i.e., 3 character model [+intercept]) assumes that slopes evolve independently across a shared tree. In this case, I think it possible to generate fake data by simulating the univariate case 3 times (with fastBM), once for each set of lambda and sig2, and bringing them all together. Maybe with this simulated data we can see what's going wrong on the Stan side of things.
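A sketch of that simulation idea (tree, lambda, and sig2 values are placeholders; rescale() is geiger's lambda transformation and fastBM() is from phytools):

```r
library(ape)
library(geiger)   # rescale() for the Pagel's lambda transformation
library(phytools) # fastBM()

set.seed(2)
phy   <- rcoal(40)                # toy shared tree
lams  <- c(0.6, 0.3, 0.4)         # placeholder lambdas, one per slope
sig2s <- c(0.5, 1.2, 0.2)         # placeholder BM rates, one per slope

# Simulate each slope independently on its lambda-rescaled copy of the tree
slopes <- sapply(seq_along(lams), function(i) {
  fastBM(rescale(phy, model = "lambda", lams[i]), sig2 = sig2s[i])
})
# slopes: 40 x 3 matrix of species-level slopes to assemble into fake data
```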
iter=6000, warmup=5000, chains=4, cores=4, control=list(stepsize=0.2, adapt_delta=0.999))
19 divergences
iter=8000, warmup=7000, chains=4, cores=4, control=list(stepsize=0.2, adapt_delta=0.999))
14 divergences
iter=6000, warmup=5000, chains=4, cores=4, control=list(stepsize=0.1, adapt_delta=0.999))
3 divergences
Gaaaaaaarg, so close... I don't suppose you've considered running a couple of these at the same time and could keep the one where there are no divergences? I have a suspicion that these posteriors are going to be essentially indistinguishable from the ones with many more divergences anyway...
@willpearse I agree! I think we have the model results, just not ones I'd feel okay publishing.... I had not considered your idea but can try it next week. For now I sent a few more runs to the server (midge!) with a mix of even smaller step sizes, longer warmup etc..
That's great; I'm just glad you've got loads of things sent off because I think that will crack it. This is a bit weird and frustrating; I am going to try and look more into this when I have time because there ought to be a general fix for this...
In case it's helpful, I've pushed code that simulates 1, 2, and 3 slope models and fits these models to the data; see adfd84d. With tight priors, we see good behavior and get back the true parameter values.
@legault Just confirming that I went back to the ospree code where I wrote out the data and it does look to be z-scored!
I just pushed 3bde38c which includes a successful fit of the three slope (i.e., three trait) model to the OSPREE data. The fitted model can also be downloaded from the data folder on the Wolkovich server (testme.rds)
Values from @legault model:
mean se_mean sd 2.5% 25% 50% 75%
sigma_y 13.34 0.00 0.19 12.97 13.21 13.34 13.47
lam_interceptsa 0.35 0.00 0.10 0.16 0.27 0.34 0.42
sigma_interceptsa 17.19 0.02 1.30 14.83 16.27 17.11 18.02
lam_interceptsbf 0.67 0.01 0.21 0.22 0.52 0.69 0.84
sigma_interceptsbf 5.67 0.05 1.02 3.87 4.95 5.60 6.31
lam_interceptsbc 0.28 0.00 0.09 0.11 0.21 0.27 0.34
sigma_interceptsbc 13.56 0.03 1.09 11.59 12.81 13.51 14.26
lam_interceptsbp 0.39 0.02 0.24 0.02 0.19 0.37 0.57
sigma_interceptsbp 2.40 0.02 0.42 1.68 2.11 2.35 2.64
b_zf -5.59 0.05 2.01 -9.40 -6.92 -5.63 -4.30
b_zc -8.46 0.04 2.52 -13.38 -10.12 -8.46 -6.84
b_zp -1.48 0.02 0.78 -3.03 -1.99 -1.48 -0.98
a_z 30.30 0.05 3.58 23.17 27.98 30.38 32.67
@legault This model looks great to me -- thank you! I am adding a few notes below.
@MoralesCastilla ... it looks like we have the PMM running. We should pull the code from here back to the ospree repo, run the model and then, let's write that paper!
A few notes from Geoff we may want ...
It takes some time to run (~2 hours on midge), so you can alternatively download the fitted model from our friendly lab computer's /home/data folder ("testme.rds" 160 MB)
Attached are the estimates. The priors are relatively wide and could probably be widened if desired (see object "phypriors").
...
When I asked, 'Wow! How did you get this to work?', Geoff said, "I made/tested a number of changes with simulation code first, then applied it all to the OSPREE data. The most important change, probably, was constraining Pagel's lambda to [0, 1]." It sounds like we (I really am not smart enough to use a beta, I think @willpearse helped us there) had a beta prior in there that was not quite correctly specified for our crazy model.
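For the record, a sketch of what that constraint looks like in Stan (parameter names follow the model above; this is illustrative, not the exact file):

```stan
parameters {
  // hard [0, 1] constraint on Pagel's lambda -- the key change
  real<lower=0, upper=1> lam_interceptsa;
  // ... other parameters unchanged ...
}
model {
  lam_interceptsa ~ beta(1, 1); // flat over [0, 1]
  // ... rest of model unchanged ...
}
```

Declaring the bounds means Stan samples on an unconstrained transform of lambda and the beta prior is only ever evaluated on [0, 1], rather than the sampler wandering into regions where the prior density is undefined.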
Thanks for looking into this, and also for updating me. To be clear, then, it's a beta prior + a hard [0,1] constraint that fixed it? Good to know going forward.
@willpearse Thanks to you for writing up the model! Looking back at the code, I think @legault added the beta distribution; initially it was beta(2,2) and now it's beta(1,1), which means that making it flatter (beta(1,1) is uniform) made the difference ... I think.
Also, this all applies for our real data; we got the model running on test data with beta(2,2) and normal(0, 1) for lambda.
This is super useful to know. Well done and thank you!
I have tried the following (see lessmini_threeslopeintercept.stan and phlyo_opsree_compact.R, there's also phlyo_opsree_serve1.R for running on midge, as these models are SLOW):
See phlyo_opsree_compact.R for more notes on other runs.
Here's what a run with
iter=9000, warmup=8000, chains=4, cores=4, control=list(adapt_delta=0.99)
looks like: