lizzieinvancouver / ospree

Budbreak review paper database
3 stars 0 forks source link

compare population versus species variations -- and make figure #382

Open lizzieinvancouver opened 4 years ago

lizzieinvancouver commented 4 years ago

@dbuona and I have been working on this .... (somehow I did not start an issue before now)

@dbuona I think your NCP attempt looked correct (I just added a prior)! On my computer it ran pretty fast and looked close (only 23 divergent transitions, and they seemed related to b_pop, which makes sense) so I think we're close. Maybe try higher warm-up and overall iterations and (then) also higher adapt_delta? We could also ncp b_pop (remind me and I can show shinystan results on Monday, but remind me at the START to run the model).

Will push my changes in a second ....

I wrote an rstanarm model (low ESS also) but did not analyze it much. Maybe you have a better sense of how to extract info?

On first blush (very quick! Do not trust) I think pop and species effects could be of similar magnitudes... at least for the bb.stan.pop3 (do we really have 139 populations?! This seems crazy high. We must be loading some study variation into population ... ay, ugh, ay-ay-ay-ayy-ayy! Or, wait, we're probably also loading in chill and photo ...).

lizzieinvancouver commented 4 years ago

@dbuona My changes now posted, see here for comparison.

lizzieinvancouver commented 4 years ago

@dbuona Try models in rstanarm with all three cues on pop5 and see where we get ...

lizzieinvancouver commented 4 years ago

@dbuona But first! Subset to only papers with >1 provenance in study.... then use the rest of the code we have.

dbuona commented 4 years ago

ran pop5 for 8000 iter and 7000 warmup: 16 divergent transitions and low BESS and TESS sample sizes...and I am not sure if this is the output that is useful for us but here is what I got:

  Median MAD_SD

(Intercept) 41.5 7.4 force.z 2.7 3.4 chill.z -10.4 5.3 photo.z 8.3 5.9

Auxiliary parameter(s): Median MAD_SD sigma 7.9 0.2

Error terms: Groups Name Std.Dev. Corr
latbinum:pophere (Intercept) 31.3
force.z 9.8 0.67
chill.z 20.3 -0.15 -0.11
photo.z 24.3 0.60 0.30 -0.59 latbinum(Intercept) 12.3
force.z 6.9 -0.13
chill.z 7.6 -0.18 0.05
photo.z 7.9 0.05 0.01 -0.04 Residual 7.9
Num. levels: latbinum:pophere 42, latbinum 4

dbuona commented 4 years ago

and for pop 3: Rhats 1.16, TESS and BESS low, 20 divergent transitions, chains didn't converge

Median MAD_SD (Intercept) 30.0 4.9 force.z -4.8 2.9 chill.z -16.5 3.6 photo.z 2.6 3.7

Auxiliary parameter(s): Median MAD_SD sigma 9.7 0.2

Error terms: Groups Name Std.Dev. Corr
latbinum:pophere (Intercept) 25.0
force.z 7.7 0.60
chill.z 18.3 0.03 -0.15
photo.z 16.5 0.62 0.27 -0.49 latbinum (Intercept) 12.4
force.z 8.9 -0.11
chill.z 7.7 -0.18 0.56
photo.z 6.3 0.07 0.20 0.03 Residual 9.7
Num. levels: latbinum:pophere 70, latbinum 13

lizzieinvancouver commented 4 years ago

@dbuona So for pop5 we have POSITIVE forcing and photoperiod effects? And for pop3 we have positive photo?

lizzieinvancouver commented 4 years ago

@dbuona If we semi-trusted pop3 we'd want to do something like discussed here, especially "if you use Stan, then you can just generate posterior predictions from the model and see how the variance across the data changes when you include or nullify a group-specific term."

lizzieinvancouver commented 4 years ago

@cchambe12 is on this! Some updates from our meeting today:

Chilling is the most variable (good), but might co-vary the most with population (because it's all field chilling generally).... should we focus on forcing and photoperiod maybe?

Switch back to rstan code, since rstanarm is struggling to run too. We just need to check the rstan code.

lizzieinvancouver commented 4 years ago

@cchambe12 Side note: we also dreamed to look at population effects across continents (using posteriors and grouping by NA versus European spp.)

cchambe12 commented 3 years ago

@lizzieinvancouver Model is ranges/stan/nointer_3levelpop_force&photo.stan. The neffs are veryyy low for some of the sigmas. I've tried changing the priors and the distributions but this didn't fix the issue. Thanks for looking this over!

Output from the current rstan model of 4000 warmup and 5000 iterations:

mod.sum[grep("mu_b_force_sp", rownames(mod.sum)),] mean se_mean sd 2.5% 25% 50% 75% -7.7064480 0.2021413 6.6608426 -21.0832132 -11.6946003 -7.8321428 -3.7617553 97.5% n_eff Rhat 6.1045730 1085.7959306 1.0034097 mod.sum[grep("mu_b_photo_sp", rownames(mod.sum)),] mean se_mean sd 2.5% 25% 50% 75% 97.5% -1.9606170 0.1540649 4.2529167 -10.3103830 -3.7355903 -2.0618324 -0.4541093 7.3714106 n_eff Rhat 762.0198916 1.0051863 mod.sum[grep("sigma", rownames(mod.sum)),] mean se_mean sd 2.5% 25% 50% 75% 97.5% sigma_y 14.681558 0.006714841 0.3449153 14.0276970 14.444327 14.6740800 14.907262 15.377323 sigma_a_sp 9.168422 0.220980750 4.7179822 2.6501405 5.712004 8.2439798 11.791994 20.639990 sigma_a_study 18.560799 0.058072197 3.3653385 13.1611920 16.094145 18.2173641 20.570241 26.269023 sigma_b_force_sp 9.746073 0.197430280 4.5445834 3.4257886 6.336321 8.9074048 12.188660 20.660084 sigma_b_photo_sp 4.526440 0.365749694 4.0516857 0.1743911 1.710177 3.4374703 6.080753 15.594543 sigma_b_force_pop 5.601860 0.169241166 1.7341778 1.6550083 4.581963 5.6959397 6.781297 8.673552 sigma_b_photo_pop 0.846863 0.079507095 0.5994716 0.0898620 0.420295 0.7033132 1.147127 2.338394 sigma_a_pop 2.490659 0.216737611 1.5082468 0.3741930 1.254091 2.3006821 3.470325 5.884951 n_eff Rhat sigma_y 2638.47779 1.0011842 sigma_a_sp 455.83084 1.0019968 sigma_a_study 3358.31163 0.9995242 sigma_b_force_sp 529.85939 1.0032526 sigma_b_photo_sp 122.71667 1.0356808 sigma_b_force_pop 104.99660 1.0329623 sigma_b_photo_pop 56.84935 1.0478412 sigma_a_pop 48.42574 1.0602018

lizzieinvancouver commented 3 years ago

@cchambe12 Yes, this model is definitely unhappy. I ran it and reviewed your Stan code. Your Stan code looks good to me (nicely done!) so I don't think that's it. But it does look like it's hanging up on the photo pop part.

I think the photo pop model needs an NCP. NCP is the default in brms and rstanarm so it's makes sense that it's running there ...

popmodel
cchambe12 commented 3 years ago

@lizzieinvancouver I added ncp for photo and had rhats >3 and ~3000 divergent transitions and then I tried doing force and photo as ncp but things were even worse! I must be doing it wrong. Do you mind checking my code in stan/nointer_3levelpop_force&photo_ncp.stan?

Warning messages: 1: In readLines(file, warn = TRUE) : incomplete final line found on '/Users/CatherineChamberlain/Documents/git/ospree/analyses/ranges/stan/nointer_3levelwpop_force&photo_ncp.stan' 2: There were 4000 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 3: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See http://mc-stan.org/misc/warnings.html#bfmi-low 4: Examine the pairs() plot to diagnose sampling problems

5: The largest R-hat is 3.21, indicating chains have not mixed. Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#r-hat 6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable. Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#bulk-ess 7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable. Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#tail-ess

lizzieinvancouver commented 3 years ago

@cchambe12 Thanks for the quick work! I unfortunately don't have time to dig into code very soon, but I don't want to hold you up, so, I suggest (1) you cross-check against this example and maybe review this post also. If that doesn't work, then (2) maybe @FaithJones could take a look at the NCP (@FaithJones, this is for the population within species model she reviewed at Bayes the other week, perhaps you see if anything jumps out at you?)

FaithJones commented 3 years ago

@cchambe12 I can take a look at your code if you like, but I can't find it. What repo is it in?

cchambe12 commented 3 years ago

@FaithJones Great thanks Faith! I tried tweaking the code based on the above examples but it is still failing. You can find the code in analyses/ranges/stan/nointer_3levelpop_force&photo_ncp.stan and then analyses/ranges/models_stanforranges.R for the code.

FaithJones commented 3 years ago

@cchambe12 I took a look and there were a few issues with the npc, mostly that you defined b_force and b_photo twice and the raw values needed priors. The model is working better now (see code in repo), but I think you need ncp on the population level as well. I'm happy to take another look later if you like

lizzieinvancouver commented 3 years ago

@FaithJones Awesome! Thank you! And nice work @cchambe12 ... Once we're happy with the model we should keep this code elsewhere someday as a working example of a nested model in the lab (would be nice to keep the non-NCP and NCP versions).

cchambe12 commented 3 years ago

Thanks to @FaithJones the model is officially complete!!!! Wooohoo!! It's beautiful.

See analyses/ranges/notes/popmodel_output_stan for model output.

lizzieinvancouver commented 3 years ago

@cchambe12 Yay! Nice work to you and @FaithJones! You'll have to tell us how you did it sometime.

FaithJones commented 3 years ago

That's great @cchambe12! I certainly got to practice my ncp's a lot with this model. If you are happy with the model and want to keep it as an example I would suggest cleaning the script a bit (removing the text that doesn't run) and maybe annotating it to death. Also feel free to remove the "_fj" bit in the name, I just do that so I don't accidental ruin your existing code. Let me know if you want a second opinion on annotations.

lizzieinvancouver commented 3 years ago

@cchambe12 It would be great if you could add it to the labgit NCP stuff (if feasible).

@FaithJones You should feel free to edit people's files directly! If you mess it up you can git checkout master. If you copy files and give them new names you break a lot of why we use git -- the ability to see changes, track who did what and when. (Alternatively you can create a new branch and merge back for bigger edits ... that at least will preserve the version control aspect.)

cchambe12 commented 3 years ago

@FaithJones @lizzieinvancouver Cool, I have added some notes and renamed the file and removed the text we aren't running. Thanks again for all of your help!

lizzieinvancouver commented 3 years ago

Lookin for results? Check out the notes here

cchambe12 commented 3 years ago

@lizzieinvancouver @dbuona I think this might be at least closer! I've tried making a plot that shows all the variance in the model for the intercepts, forcing parameters and photoperiod parameters. See here for the figure

cchambe12 commented 3 years ago

I can't figure out how to change the mean line so it is taller!! I found some source code from the rstan repo but it's producing errors. Saving this note here for now.

lizzieinvancouver commented 3 years ago

@cchambe12 Skip it then!

Otherwise I like it -- can we make the sigmas Greek letters? (see here maybe?)

A few things we've discussed that I might forget:

  1. The effect of study being bigger than species on the intercept is cool! Also amazing how big it is relative to population. But it could be that study captures sort of the 'central latitude' or such of a study and so is somewhat getting population? Are populations nested within study? (Should they be? Maybe?!)
  2. The variance explained by species versus population for forcing looks similar! This could be true, it could also be associated with the chilling x population covariance issue that we have? So we probably need to not OVER interpret that.
  3. Meaning ... maybe our photoperiod estimate is our best one? On the one hand photoperiod is a small effect size often so perhaps not the ideal cue to have, on the other hand it is the one that's supposed to have strong local adaptation.
cchambe12 commented 3 years ago

@lizzieinvancouver I have added Greek letters to the plot so it should look a bit nicer now! Here's the new plot.

I am going to leave this open so we can reference the notes you have here @lizzieinvancouver for when we write this up.