lizzieinvancouver / ospree

Budbreak review paper database
3 stars 0 forks source link

understand climates and ranges and plots and purpose #442

Open dbuona opened 1 year ago

dbuona commented 1 year ago

Here is the summaries of the the europe only models. I didn't include the species parameters because they are numeric, and numerous so overwhelming, but I can pull them if thats helpful.


betatrait= mean GDD in range
                       mean       2.5%       25%       75%     97.5%
muChillSp         -7.576980 -13.054408 -9.345380 -5.781033 -2.154739
muPhotoSp         -0.563373  -3.474378 -1.486707  0.381581  2.243119
muForceSp         -7.628377 -10.219562 -8.471022 -6.806926 -4.958249
betaTraitxForcing  5.538409   1.112836  4.103120  7.005950  9.805608
betaTraitxPhoto   -2.732013  -7.321792 -4.294596 -1.200399  1.977080
betaTraitxChill   -1.345844  -9.722267 -4.274730  1.316455  8.248155

betatrait=mean CP in Range
                       mean        2.5%         25%        75%     97.5%
muChillSp         -4.790286 -11.6215283  -6.6635188 -2.7226300  0.784639
muPhotoSp          0.563295  -3.3125592  -0.7328142  1.8340044  4.355612
muForceSp         -9.380967 -13.1665837 -10.6581786 -8.1260411 -5.637837
betaTraitxForcing  2.625253  -0.9507129   1.4478939  3.8298662  6.100051
betaTraitxPhoto   -1.759544  -5.5732867  -3.0537973 -0.5154178  2.184797
betaTraitxChill   -4.044568  -9.4417861  -6.0129643 -2.3126808  2.466190
lizzieinvancouver commented 1 year ago

@dbuona Just to write out loud (if you will) ... this says:

  1. forcing goes up (gets weaker) as GDD increases; this result is pretty damn strong
  2. photoperiod goes down (gets stronger) as GDD increases (but not as strong as forcing)
  3. chilling kind of does nothing with GDD
  4. forcing goes up (gets weaker) as chilling increases ... is that correct?
  5. photoperiod goes down (gets stronger) as chilling increases
  6. chilling goes down (gets stronger) as chilling increases

Am I interpreting this correctly?

lizzieinvancouver commented 1 year ago

@dbuona Okay. I had a few thoughts on this but no slam dunk answer (or feeling of -- 'oh, let's just do variation in GDD') ... yet.

First, the model is the basically the same as the traitors (that is, the cue values per species are what is LEFTOVER after the climate explains the cue) so we need to plot it differently to understand it. I personally find these plots that decompose the output most helpful -- see here for a traitors plot. While this may seem like a pain, it shouldn't be that much work to adapt and I am hopeful @DeirdreLoughnan will help as she is a pro at these and can help you interpret them.

And you need these plots for the main model interpretation anyway, so worth making. ... the idea for these actually came from the Rangers pop up, so you might even have a version of them somewhere.

Next, I kept wondering two things. ...

  1. How much does these cues just map to mean climate -- as in mean MAT or such (in which case, it could be a lot to fit in a paper with inter-annual variation stuff). Did we ever plot or check that? As a stand-in can we plot these again against latitude and range size or such? Could we also just map a map where we color code the mean value for the species as the color of a circle and put the circle at the centroid (some code below -- you can do this in ggplot).
  2. Sort of related to (1), if we switch to focus on interannual variation in GDD, can we use the chilling and GDD at all? Maybe a map or a plot with MAT or with range size or. ... I just wanted to think about whether some version of them could set the stage or be used in the paper, even if we focus on the GDD hypothesis. Not a must-do, just something I think we should think on.

stolen from:

http://rpsychologist.com/working-with-shapefiles-projections-and-world-maps-in-ggplot/

wmap <- readOGR("input/maps/ne_110m_land", layer="ne_110m_land")

adjust the theme

theme.tanmap <- list(theme(panel.grid.minor = element_blank(),

panel.grid.major = element_blank(),

                    panel.background = element_rect(fill = "grey90",colour = NA),
                    # plot.background = element_rect(fill=NA),
                    axis.title.x = element_blank(),
                    axis.title.y = element_blank(),
                    plot.title = element_text(size=22),
                    legend.position = "left"))

international vars, superegion

pdf("graphs/maps/adelaideRichIntlVars_bysuperregion.pdf", width=9, height=4)

ggplot() + geom_polygon(dat=wmap.df, aes(long, lat, group=group), fill="grey80") + geom_point(data=dat.super.rich.wintl, aes(x=longitude, y=latitude, size=rich, fill=perintl), colour="black", pch=21) + scale_fill_gradient2(name=expression('% International varieties'), low="papayawhip", high="darkred", guide="colorbar", na.value="NA") + scale_size(name=expression(paste('varieties ', italic('n'))), range=c(0.5, 5), breaks=c(1, 10, 30, 50, 100, 200)) + theme.tanmap dev.off()

dbuona commented 1 year ago

Some thoughts that may or may not matter while I work on this: One difference between the range model and the traits one is we don't model the trait values (eg climate variable) for each species. in traits model real mu_grand; // grand mean for trait value vector[n_spec] muSp; // species offsets

I think we decided since we have no variation it would make sense to treat this as a data rather than a parameter. But it seems important for reproducing the partitioning plots.

lizzieinvancouver commented 1 year ago

@dbuona I know but the partitioning plots that I am thinking of do not care about the trait model -- they use the output from it, just as you would take the range data as accurate (which I still think is fine a choice). It's more about visualizing the model where cues are divided into what's predicted by the range, versus just the species ID.

dbuona commented 1 year ago

@lizzieinvancouver Here is a version for variation in growing degree days to last frost. I think it is equivalent to the middle and right panel from this figure you linked above. I am not sure that it feels so helpful for interpretation yet--- but its a starting point

lizzieinvancouver commented 1 year ago

@dbuona If it is equivalent to the middle figure from the figure I linked than we would expect no relationship is the model is working. The figures from left to right should conceptually be:

  1. The range effect (one slope) with all the species plotted. Since it's one slope (the effect of range) it looks rather `perfect' if you will.
  2. The species-level intercepts in a cue -- these are in the model to make sure you don't force random species-level variation on the range effect. If these show a relationship with range, then something went wrong (@DeirdreLoughnan and I actually diagnosed a problem with the traitors model based on this).
  3. Now I get foggy, but I think the last plot is basically 1+2 -- so the species-level cue (range effect + intercept) with the range effect plotted on top.

If you have 2 plus a slope -- then you're close! But we really need the suite of plots to understand the model (right now it's hard to believe where that slope is coming from). There's also some discussion of these plots in issue #406 ... but I haven't reviewed those notes in a while.

dbuona commented 1 year ago

I still don't understand plot 1. What is the variable that is actually plotted on the y-axis? I can't find any traitors code that reproduces this plot.

lizzieinvancouver commented 1 year ago

@dbuona I found in this folder a lot of the plots (for example: chill_bdecompLNC_stanfit.RDS.pdf) and tracked down 'bdecomp' to this file: Plotting_mugrandspvsbetaSp.R here's the link search for 'bdecomp' Looks like @DeirdreLoughnan made it so can hopefully help.

dbuona commented 1 year ago

Great thanks! I think I get it now.

dbuona commented 1 year ago

I think I made a version of the same figure here (appologies that it is giant). Top to bottom it should be the same as the left to right of the example traitors figure. The top row is certainly not "rather perfect". This could either mean my understanding of the plotting code is not correct, or the environmental trait is a bad predictor? The y axis here is the betaSpcue-alphaSpcue which in theory should be the amount of the cue explained by environmental trait.

dbuona commented 1 year ago
summy.eu[grep(c("betaTrait"), rownames(summy.eu)),]
                       mean    se_mean       sd      2.5%       25%       50%        75%    97.5%     n_eff     Rhat
betaTraitxForcing  2.429029 0.04779957 2.253312 -2.199617  1.003350  2.507493  3.9032584 6.668927 2222.2598 1.001634
betaTraitxPhoto   -2.510906 0.04485891 2.170424 -6.818977 -3.930521 -2.523154 -1.0852466 1.795003 2340.9468 1.000288
betaTraitxChill   -2.439642 0.13300188 3.312856 -9.128130 -4.666905 -2.400817 -0.2791337 4.023054  620.4255 1.001772
> summy.eu[grep(c("mu"), rownames(summy.eu)),]
                mean    se_mean       sd       2.5%        25%        50%       75%     97.5%     n_eff      Rhat
muForceSp -12.553414 0.10201309 4.813804 -21.700014 -15.778053 -12.695637 -9.447962 -2.519031 2226.7164 1.0013445
muPhotoSp   4.458729 0.09288058 4.654632  -4.895115   1.442747   4.462118  7.527682 13.701370 2511.4290 0.9999435
muChillSp  -2.845760 0.26840217 6.925147 -16.178096  -7.282302  -3.044479  1.779688 10.587555  665.7114 1.0017495
muPhenoSp  32.834495 0.04370737 3.342220  26.311642  30.619700  32.757186 34.830510 39.696252 5847.3735 0.9995336
> summy.eu[grep(c("sigma"), rownames(summy.eu)),]
                   mean     se_mean        sd       2.5%        25%        50%        75%      97.5%    n_eff      Rhat
sigmaForceSp  0.5883193 0.003260942 0.1406831  0.3677279  0.4896512  0.5699347  0.6667245  0.9160240 1861.219 0.9992082
sigmaPhotoSp  0.5616012 0.002980548 0.1377709  0.3391700  0.4635674  0.5458125  0.6453066  0.8797571 2136.598 1.0015089
sigmaChillSp 11.2305546 0.075840796 3.1804638  5.9989889  8.9197959 10.9398973 13.1391451 18.2844030 1758.633 1.0004033
sigmaPhenoSp 14.0308614 0.063796625 3.2712215  9.0957357 11.7465748 13.5478050 15.7911891 21.6723283 2629.205 1.0004926
sigmapheno_y 15.1579696 0.003705234 0.2885815 14.6021199 14.9578266 15.1549837 15.3580109 15.7298022 6066.047 0.9999880
> summy.nam[grep(c("betaTrait"), rownames(summy.nam)),]
                         mean     se_mean         sd       2.5%         25%        50%         75%       97.5%     n_eff     Rhat
betaTraitxForcing -0.21310690 0.001851408 0.09165569 -0.3833622 -0.27455691 -0.2172303 -0.15659640 -0.02384808 2450.8379 1.000697
betaTraitxPhoto   -0.01268732 0.002299074 0.11819652 -0.2553377 -0.08807404 -0.0112163  0.06642758  0.21712638 2643.0373 1.000103
betaTraitxChill   -0.48698104 0.011946138 0.26610723 -1.0030128 -0.66464246 -0.4895122 -0.30862101  0.02914395  496.2018 1.002299
> summy.nam[grep(c("mu"), rownames(summy.nam)),]
                mean    se_mean       sd       2.5%       25%        50%       75%     97.5%     n_eff      Rhat
muForceSp  0.7804556 0.05956481 2.895237  -5.207827 -1.035312  0.8952954  2.779612  6.076076 2362.5923 1.0002822
muPhotoSp -2.7083731 0.06969134 3.771151 -10.266279 -5.233875 -2.6449066 -0.273296  4.663927 2928.1291 0.9994394
muChillSp -1.5625236 0.30374696 7.726802 -16.744257 -6.599994 -1.5989584  3.452723 14.071572  647.1063 1.0015023
muPhenoSp 24.4624586 0.06914461 4.525841  15.520887 21.455100 24.5396133 27.367360 33.132751 4284.3198 1.0020202
> summy.nam[grep(c("sigma"), rownames(summy.nam)),]
                   mean     se_mean        sd        2.5%        25%        50%        75%      97.5%    n_eff      Rhat
sigmaForceSp  0.2962168 0.004124424 0.1542726  0.04147637  0.1882638  0.2792778  0.3842033  0.6417090 1399.107 1.0021264
sigmaPhotoSp  0.3100518 0.005474161 0.2023459  0.01707846  0.1537265  0.2838289  0.4327141  0.7613954 1366.325 1.0030336
sigmaChillSp 18.6332389 0.050665321 3.3102196 13.08198619 16.2891622 18.3351063 20.6615835 25.7600061 4268.664 0.9999335
sigmaPhenoSp 17.4796163 0.081983332 4.2688352 10.99641721 14.4668268 16.9395152 19.7009074 27.3977472 2711.238 1.0005246
sigmapheno_y 16.1370457 0.005593793 0.4434369 15.31835565 15.8243338 16.1290072 16.4286465 17.0422128 6284.213 1.0000494
dbuona commented 1 year ago

@lizzieinvancouver I think, our disucssion of alpha_cueSp being the "amount of the cue unexplained by the range" is not precise. If alpha is the intercept of the trait sub-model, it is actually "what the beta cue estimate would be in when the range trait is 0". This is why species with low or moderate betas and big range traits get very positive alphas. Therefore it may be that the plots you are working on that show how much estimates shifts from alpha to beta is the best way to convey the magnitude of the range effect.

dbuona commented 1 year ago

@lizzieinvancouver just pushed a figure alpha_beta_comps which is a start for comparing alphas and betas in each continent. It's based on the range_fig_output.R script lines 332-339. It's probably not super helpful, but it is at least a start to the data wrangling.

lizzieinvancouver commented 1 year ago

@dbuona Thanks! This actually is pretty helpful. I might also just poke the model again to feel better that it's stable for Europe.

On that notes, did you update the 'cheap approach' plots and simple stats?

lizzieinvancouver commented 1 year ago

@dbuona Okay, I tried to work on this but I hit a couple snags ... see commit d07c5a96630bd5e24e76982ad7f2df71d7e69316 and let me know...

  1. Is the README correct? If not, please update it so I can know where to look and what to run.
  2. I deleted dplyr from this code since the code runs bbstandleadin and thus has this error:
    
    > source("source/bbstanleadin.R")
    ---------------------------------------------------------------------------------------------
    You have loaded plyr after dplyr - this is likely to cause problems.
    If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
    library(plyr); library(dplyr)
    ---------------------------------------------------------------------------------------------

3. I am getting this error early on (whether or not I load dplyr in this code), which seems weird ... I ran the code just the other week so I am not sure what is up and wanted to check with you:

source("source/bbstanleadin.R")

Attaching package: ‘dplyr’

The following objects are masked from ‘package:plyr’:

arrange, count, desc, failwith, id, mutate, rename, summarise, summarize

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Joining, by = c("genus", "species") Joining, by = c("genus", "species") Joining, by = c("genus", "species") Joining, by = c("genus", "species") Error in str(datalist.bb) : object 'datalist.bb' not found In addition: Warning messages: 1: In eval(ei, envir) : NAs introduced by coercion 2: In eval(ei, envir) : NAs introduced by coercion 3: In eval(ei, envir) : NAs introduced by coercion 4: In eval(ei, envir) : NAs introduced by coercion


4. Some other smaller errors I am getting:

jpeg("figures/clim_params.jpeg",width=8,height=6,units = "in",res=300) ggplot(ggdlf,aes(Geo.SD,Temp.SD))+geom_point(aes(color=continent))+stat_smooth(method="lm",aes(color=continent),se=FALSE,linetype="dashed",size=.4)+

  • scale_color_viridis_d(begin = 0,end=.5)+ggthemes::theme_few(base_size = 10)+ylab("Temporal standard deviations \nof GDDs to last frost")+
  • xlab("Geographic standard deviations \nof GDDs to last frost")+annotate("text", x = 40, y = 10,
  • label = "Correlation:\nEurope= 0.62 \nN. America= 0.87") Error in loadNamespace(x) : there is no package called ‘ggthemes’ dev.off() null device 1

a<-ggplot(ggdlf,aes(Temp.SD,Geo.SD))+geom_point(aes(color=continent))+geom_smooth(method="lm",aes(color=continent)) b<-ggplot(ggdlf,aes(Temp.SD,Temp.Mean))+geom_point(aes(color=continent))+geom_smooth(method="lm",aes(color=continent)) c<-ggplot(ggdlf,aes(Geo.SD,Geo.Mean))+geom_point(aes(color=continent))+geom_smooth(method="lm",aes(color=continent)) d<-ggplot(ggdlf,aes(Temp.Mean,Geo.Mean))+geom_point(aes(color=continent))+geom_smooth(method="lm",aes(color=continent))

h<-ggplot(ggdlf,aes(Temp.SD.GDD,Temp.Mean.GDD))+geom_point(aes(color=continent))+geom_smooth(method="lm",aes(color=continent))+facet_wrap(~continent,scales="free") i<-ggplot(ggdlf,aes(Geo.SD.GDD,Geo.Mean.GDD))+geom_point(aes(color=continent))+geom_smooth(method="lm",aes(color=continent))+facet_wrap(~continent,scales="free") ggpubr::ggarrange(a,b,c,d,common.legend = TRUE) Error in loadNamespace(x) : there is no package called ‘ggpubr’

g Error: object 'g' not found

lizzieinvancouver commented 1 year ago

@dbuona Quick followup from running source/bbstanleadin.R. line by line ... I think you may have edited your rangeleadin_osp.R code so that the flags do not match any of the bb.stan datasets created in source/bbstanleadin.R....

dbuona commented 1 year ago

oh dear, that is a bit disconcerting. I am pretty sure that that part of the code hasn't changed since the original series of POP UP meetings, which means it has been wrong for mostly the whole time.

lizzieinvancouver commented 1 year ago

@dbuona The bbstanleadin.R has not changed but I think some of the rangeleadin_osp.R must have as I could run it the other week in Boston and cannot run it now ... unless I was running different chunks.

dbuona commented 1 year ago

@lizzieinvancouver I have a vague recollection of when Cat set up the flags that lines 62-66 are supposed to overcome (potentially over-write?) the failed merge in the source line.

The problem that was preventing the models from running was that somewhere later I deleted the line that merges bb.stan to the range data. I've added it back at line 165, and the models (line 245, 264 etc) are humming away.

lizzieinvancouver commented 1 year ago

@lizzieinvancouver Need to do ...

lizzieinvancouver commented 1 year ago

Okay, I did most of this, see commit d9b53ad7e5cd3a1545f89140d2a0cb301ac2dadc ... the variance only really varies between Europe and NA for forcing, is that right @dbuona ?


> var(modelmeansnam$alphaforce)/var(modelmeansnam$betaforce)
[1] 0.1988014
> var(modelmeanseu$alphaforce)/var(modelmeanseu$betaforce)
[1] 0.8360621

> var(modelmeansnam$alphachill)/var(modelmeansnam$betachill)
[1] 0.8982476
> var(modelmeanseu$alphachill)/var(modelmeanseu$betachill)
[1] 0.9949799

> var(modelmeansnam$alphaphoto)/var(modelmeansnam$betaphoto)
[1] 0.9887957
> var(modelmeanseu$alphaphoto)/var(modelmeanseu$betaphoto)
[1] 0.8393795