lizzieinvancouver / ospree

Budbreak review paper database
3 stars 0 forks source link

try other chill metrics #461

Closed lizzieinvancouver closed 9 months ago

lizzieinvancouver commented 9 months ago

@MoralesCastilla I am fairly sure you lose a little data in the switch to chill portions, but not so much.

I ran the model using chill portions, I'll paste my code below so you can see if I ran it correctly, but I did not get output similar to yours. Here's what I got:


> fitsumdf
                         mean     se_mean        sd         2.5%        25%        50%        75%      97.5%      n_eff      Rhat
a_z                30.1027713 0.031799822 3.3858612  23.34411055 27.8650414 30.0998817 32.3480228 36.6529231 11336.7591 0.9997577
sigma_interceptsa  16.0143611 0.013735382 1.1248303  14.02051544 15.2263705 15.9366765 16.7164742 18.4607993  6706.4479 1.0005908
b_zf               -6.2263619 0.034566309 2.0246099 -10.23411238 -7.5172005 -6.2622127 -4.9421153 -2.1521314  3430.6521 1.0003064
sigma_interceptsbf  5.5092680 0.030662326 0.9864638   3.86438862  4.7939866  5.4146867  6.1116834  7.6859386  1035.0282 1.0065654
lam_interceptsbf    0.6637477 0.009248437 0.2104777   0.19756510  0.5208157  0.6959203  0.8314127  0.9752169   517.9354 1.0038782
b_zc               -5.8394329 0.021615978 1.9723460  -9.72908632 -7.1211915 -5.8413241 -4.5917564 -1.8610500  8325.6170 0.9995920
sigma_interceptsbc  6.8519615 0.014529716 0.7577150   5.47108120  6.3332129  6.8216883  7.3311425  8.4412526  2719.5525 1.0017598
lam_interceptsbc    0.5222803 0.002425924 0.1307011   0.26630661  0.4307091  0.5253886  0.6153322  0.7670357  2902.7131 1.0016767
b_zp               -1.3064922 0.015680580 0.7967604  -2.87599988 -1.8113904 -1.3252995 -0.8241956  0.3635628  2581.8514 1.0014319
sigma_interceptsbp  2.3563025 0.015670879 0.3947977   1.64003651  2.0871973  2.3330455  2.6013665  3.1907407   634.6912 1.0068699
lam_interceptsbp    0.4555378 0.013413403 0.2499087   0.03255231  0.2540416  0.4437589  0.6477485  0.9315895   347.1241 1.0107705
sigma_y            12.0837620 0.001675559 0.1710770  11.74908072 11.9649435 12.0839568 12.1955375 12.4262553 10424.7002 0.9997825
lizzieinvancouver commented 9 months ago

And here's the code, just so I have a record:

> options(mc.cores = parallel::detectCores())
> 
> #'###############################
> # Flags for how to run the code #
> #'###############################
> runmodels <- FALSE
> runbbstanleadin <- T # leave as false to speed up Supp and ms. compilation
> runwithchillports <- T
> 
> #'######################################
> #### get data through bbstanleadin ####
> #'######################################
> 
> if(runbbstanleadin){
+ # Flags to choose for bbstanleadin.R #
+ setwd("..//bb_analysis") 
+ getwd()
+ # Master flags! Here you pick if you want the flags for the main model (figure in main text) versus the all spp model (supp)
+ use.flags.for.mainmodel <- FALSE
+ use.flags.for.allsppmodel <- TRUE
+ use.yourown.flagdesign <- FALSE
+ nocrops <- TRUE
+ agiosponly <- TRUE
+ 
+ if(use.flags.for.mainmodel==TRUE & use.flags.for.allsppmodel | use.flags.for.mainmodel==TRUE & use.yourown.flagdesign |
+    use.yourown.flagdesign  & use.flags.for.allsppmodel | use.flags.for.mainmodel==TRUE & use.flags.for.allsppmodel
+    & use.yourown.flagdesign) print("ALERT! You have set too many master flags to true, you must pick only one!")
+ 
+ if(use.flags.for.mainmodel){
+   use.chillports = FALSE
+   use.zscore = TRUE
+   use.allspp =FALSE # for the main model this is false
+   use.multcuespp = FALSE
+   use.cropspp = FALSE
+   # Default is species complex use  alltypes of designs
+   use.expramptypes.fp = TRUE
+   use.exptypes.fp = FALSE
+   use.expchillonly = FALSE
+ }
+ 
+ if(use.flags.for.allsppmodel){
+   use.chillports = FALSE
+   use.zscore = TRUE
+   use.allspp = TRUE
+   use.multcuespp = FALSE
+   use.cropspp = TRUE
+   use.expramptypes.fp = FALSE
+   use.exptypes.fp = FALSE
+   use.expchillonly = FALSE
+ }
+ 
+ if(use.yourown.flagdesign){
+   use.chillports = F # change to false for using utah instead of chill portions (most models use chill portions z)
+   use.zscore = TRUE # change to false to use raw predictors
+   
+   # Default is species complex and no crops
+   use.allspp = F
+   use.multcuespp = FALSE
+   use.cropspp = FALSE
+   
+   # Default is species complex use  alltypes of designs
+   use.expramptypes.fp = TRUE
+   use.exptypes.fp = FALSE
+   
+   #Default is all chilling data
+   use.expchillonly = FALSE # change to true for only experimental chilling 
+   #note: with only exp chilling, there is only exp photo and force too.
+   #also: subsetting to exp chill only reduces dataset to 3 species, <9 studies
+ }
+ 
+ source("..//bb_analysis/source/bbstanleadin.R")
+ 
+ namesdat <- unique(paste(bb.stan$genus,bb.stan$species,sep="_"))
+ bb.stan$spps <- paste(bb.stan$genus,bb.stan$species,sep="_")
+ bb.stan$phylo <- paste(bb.stan$genus,bb.stan$species,sep="_")
+ 
+ 
+ #'###################################
+ #### get phylogeny              ####
+ #'###################################
+ 
+ setwd("..//phylogeny") 
+ source("source/get_phylo_models.R")
+ 
+ ## read and pre-process phylogeny
+ #phylo <- read.tree("../../data/phylogeny/SBphylo_62complex.tre")
+ #phylo <- read.tree("../../data/phylogeny/SBphylo_101sps.tre")
+ phylo <- phy.plants.ospree
+ 
+ 
+ namesphy <- phylo$tip.label
+ phylo <- force.ultrametric(phylo, method="extend")
+ phylo$node.label <- seq(1,length(phylo$node.label),1)
+ is.ultrametric(phylo)
+ #plot(phylo, cex=0.7)
+ VCVPHY <- vcv.phylo(phylo,corr=TRUE)
+ 
+ 
+ 
+ ## deal with subgrouping
+ 
+ if(nocrops & agiosponly){
+   gymno <- c("Metasequoia_glyptostroboides",  "Pseudotsuga_menziesii","Larix_laricina",
+              "Larix_gmelinii", "Larix_decidua" ,"Larix_kaempferi",   
+              "Pinus_nigra","Pinus_sylvestris","Pinus_banksiana",  
+              "Pinus_contorta","Pinus_wallichiana","Pinus_strobus", 
+              "Picea_abies"   ,"Picea_mariana" ,"Picea_glauca" ,
+              "Cedrus_libani" ,"Abies_alba"    ,"Abies_homolepis","Ginkgo_biloba")
+   croplist <- read.csv("../../data/croplist/agricultural_species.csv")
+   cropgymno <- c(croplist$Species_name,gymno)
+   bb.stan$crops <- ifelse(bb.stan$spps %in% cropgymno, "cropgymno","nocrop")
+   cropspps <- unique(bb.stan$spps[which(bb.stan$crops=="cropgymno")])
+   bb.stan <- subset(bb.stan, crops == "nocrop")
+   phylo <- drop.tip(phylo, cropspps)
+   VCVPHY<-vcv.phylo(phylo,corr=T)
+ } 
+ 
+ if(nocrops & !agiosponly){
+   croplist <- read.csv("../../data/croplist/agricultural_species.csv")
+   bb.stan$crops <- ifelse(bb.stan$spps %in% croplist$Species_name, "crop","nocrop")
+   cropspps <- unique(bb.stan$spps[which(bb.stan$crops=="crop")])
+   bb.stan <- subset(bb.stan, crops == "nocrop")
+   phylo <- drop.tip(phylo, cropspps)
+   VCVPHY<-vcv.phylo(phylo,corr=T)
+ } 
+ 
+ 
+ if(!nocrops & agiosponly){
+   gymno <- c("Metasequoia_glyptostroboides",  "Pseudotsuga_menziesii","Larix_laricina",
+              "Larix_gmelinii", "Larix_decidua" ,"Larix_kaempferi",   
+              "Pinus_nigra","Pinus_sylvestris","Pinus_banksiana",  
+              "Pinus_contorta","Pinus_wallichiana","Pinus_strobus", 
+              "Picea_abies"   ,"Picea_mariana" ,"Picea_glauca" ,
+              "Cedrus_libani" ,"Abies_alba"    ,"Abies_homolepis","Ginkgo_biloba")
+   cropgymno <- c(gymno)
+   bb.stan$crops <- ifelse(bb.stan$spps %in% cropgymno, "cropgymno","nocrop")
+   cropspps <- unique(bb.stan$spps[which(bb.stan$crops=="cropgymno")])
+   bb.stan <- subset(bb.stan, crops == "nocrop")
+   phylo <- drop.tip(phylo, cropspps)
+   VCVPHY<-vcv.phylo(phylo,corr=T)
+ } 
+ 
+ 
+ # Get spps and VCVPHY in same order
+ # bb.stan$spps[phylo$tip.label]
+ phylo$tip.label
+ d <- bb.stan[match(phylo$tip.label, bb.stan$spps),] # hmmm, only gives ONE match
+ 
+ phymatch <- data.frame(tip=phylo$tip.label, sppnum=c(1:length(phylo$tip.label)))
+ d <- merge(bb.stan, phymatch, by.x="spps", by.y="tip")
+ d <- d[order(d$sppnum),]
+ # Tilia_cordata versus Tilia_Cordata in phylo
+ nspecies <- max(d$sppnum)
+ 
+ ## remove outliers
+ # d$resp
+ head(d)
+ ff = subset(d,latbi %in% c("Populus_balsamifera","Populus_tremuloides"))
+ d = subset(d,!latbi %in% c("Populus_balsamifera","Populus_tremuloides"))
+ nspecies = 192
+ phylo <- drop.tip(phylo, c("Populus_balsamifera","Populus_tremuloides"))
+ d$sppnum <- as.numeric(as.factor(d$sppnum))
+ 
+ 
+ ## remove names of species that are wrong (e.g. Acer pseudolatanus) Malyshev2018
+ idswrong = which(d$spps == "Acer_pseudolatauns")
+ d$spps[idswrong] = "Acer_pseudoplatanus"
+ d$species[idswrong] = "pseudoplatanus"
+ d$latbi[idswrong] = "Acer_pseudoplatanus"
+ d$phylo[idswrong] = "Acer_pseudoplatanus"
+ 
+ #d$sppnum[which(d$latbi=="Acer_pseudoplatanus")]
+ d$sppnum[idswrong] = 127
+ d$sppnum[which(d$sppnum>137)] = d$sppnum[which(d$sppnum>137)]-1
+ 
+ nspecies = 191
+ phylo <- drop.tip(phylo, "Acer_pseudolatauns")
+ 
+ 
+ 
+ ## remove names of species that are wrong (e.g. Juglans spp) 
+ idswrong = which(d$spps == "Juglans_spp")
+ d <- d[-idswrong,]
+ phylo <- drop.tip(phylo, "Juglans_spp")
+ 
+ nspecies <- length(phylo$tip.label)
+ phymatch2 <- data.frame(tip=phylo$tip.label, sppnum=c(1:length(phylo$tip.label)))
+ d2 <- merge(d, phymatch2, by.x="spps", by.y="tip")
+ d2 <- d2[order(d2$sppnum.y),]
+ d2$sppnum <- d2$sppnum.y
+ d <- d2
+ 
+ # save for faster loading-compiling
+ #write.csv(d,file = "input/datasetforphyloms.csv")
+ #write.tree(phylo, file = "input/phyloforphyloms.tre")
+ 
+ } else {
+   
+   d = read.csv("input/datasetforphyloms.csv")
+   phylo = read.tree("input/phyloforphyloms.tre")
+   
+ }
Joining with `by = join_by(genus, species)`
Joining with `by = join_by(genus, species)`
Joining with `by = join_by(genus, species)`
Joining with `by = join_by(genus, species)`
List of 7
 $ y    : num [1:4442] 56.4 52.2 59.6 49.2 61.5 ...
 $ chill: num [1:4442] 0.05553 0.00235 0.05553 0.00235 0.05553 ...
 $ force: num [1:4442] -1.8 -1.8 -1.8 -1.8 -1.8 ...
 $ photo: num [1:4442] -0.725 -0.438 -0.725 -0.438 -0.725 ...
 $ sp   : num [1:4442] 1 1 1 1 1 1 1 1 1 1 ...
 $ N    : int 4442
 $ n_sp : int 233
[1] "Unique forcing types are ..."
[1] "ramped" "exp"    "amb"   
[1] "Unique photo types are ..."
[1] "ramped" "exp"    "amb"    "none"  
[1] "Unique chilling types are ..."
[1] "fldest"  "bothest" "bothrep" "exp"    
[1] "Range of chilling is ..."
[1] -3.274617  4.558866

Number of species in tree before: 0
Number of species in tree now:    24 
***************************************************************
*                          Note:                              *
*    force.ultrametric does not include a formal method to    *
*    ultrametricize a tree & should only be used to coerce    *
*   a phylogeny that fails is.ultramtric due to rounding --   *
*    not as a substitute for formal rate-smoothing methods.   *
***************************************************************
Warning messages:
1: In eval(ei, envir) : NAs introduced by coercion
2: In eval(ei, envir) : NAs introduced by coercion
3: In eval(ei, envir) : NAs introduced by coercion
4: In eval(ei, envir) : NAs introduced by coercion
> 
> if(runwithchillports){
+   
+   d$chill.z = as.numeric(scale(d$chill.ports))
+   
+ }
> runwithchillports
[1] TRUE
> fit <- stan("stan/uber_threeslopeintercept_modified_cholesky_updatedpriors.stan",
+                data=list(N=nrow(d),
+                          n_sp=nspecies,
+                          sp=d$sppnum,
+                          x1=d$force.z,
+                          x2 = d$chill.z,
+                          x3=d$photo.z,
+                          y=d$resp,
+                          Vphy=vcv(phylo, corr = TRUE)),
+                iter = 4000,
+                warmup = 2000, # half the iter as warmp is default, but leaving in case we want to change
+                chains = 4,
+                seed = 117 
+ )
Warning message:
Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess 
> 
lizzieinvancouver commented 9 months ago

A plot of d$chill.z versus d$chill.ports shows a perfect match so I think am using chill portions ... I worry perhaps the lambda for photo with this model is not so stable. The chains do not look good:

Screen Shot 2023-10-02 at 6 12 54 PM

And both the lambda for forcing and photo search the full space it feels (but on quick glance, I did not see correlated parameters or bad geometries): Screen Shot 2023-10-02 at 6 13 41 PM

In contrast, lambda for chilling looks better:

Screen Shot 2023-10-02 at 6 13 24 PM

lizzieinvancouver commented 9 months ago

My take-home (and chatting with J) right now is a few things ...

  1. I will run it all again to check what I get. Can you check I ran it correctly? If you have time, you should re-run it too.
  2. This model is not giving a lot of insight on lambda for photoperiod, but a main text figure shows the same for the full data -- this chill.ports model is just fitting poorly enough that I feel slightly less good about using it. Either way, we should remember in writing that we have pretty poor inference for lambda, especially for photoperiod (but our sigmas have much better inference); we should be sure to not emphasize too much the differences in lambda or do so by caveating it, 'if these relationships hold, they suggest...' or something.
  3. I think I have long said this, but lambda is usually reported as a POINT estimate -- so we have little insight into what the uncertainty is around it in other papers/analyses ... No edit to ms on this, I just find it interesting.
lizzieinvancouver commented 9 months ago

Note to myself: I run the models running lines 1-245 in phylo_ospree_compact4_angiogymno_updateprior.R then running this snippet (which I have not saved anywhere else: update, saved it as phylo_ospree_compact4_angiogymno_updatepriorSNIPPET.R on my machine):

plot(d$chill.z~d$chill.ports)
fit <- stan("stan/uber_threeslopeintercept_modified_cholesky_updatedpriors.stan",
               data=list(N=nrow(d),
                         n_sp=nspecies,
                         sp=d$sppnum,
                         x1=d$force.z,
                         x2 = d$chill.z,
                         x3=d$photo.z,
                         y=d$resp,
                         Vphy=vcv(phylo, corr = TRUE)),
               iter = 4000,
               warmup = 2000, # half the iter as warmp is default, but leaving in case we want to change
               chains = 4,
               seed = 117 
)

## Save fitted posterior
saveRDS(fit, "output/fit_priorupdate_noout_angio191_chillports.rds")
lizzieinvancouver commented 9 months ago

@MoralesCastilla I still can't replicate your results. Here's the the new output:

> fitsumdf
                         mean     se_mean        sd         2.5%        25%        50%        75%      97.5%      n_eff      Rhat
a_z                30.0994422 0.032249438 3.4700412  23.25960878 27.8050338 30.1206503 32.3719197 36.9090218 11577.7709 0.9999126
sigma_interceptsa  15.9944861 0.013850013 1.1393839  13.99459758 15.1966123 15.9178731 16.6893054 18.4448650  6767.6795 0.9997914
b_zf               -6.2625341 0.034971307 2.0134123 -10.28906442 -7.5351323 -6.3214023 -5.0032743 -2.1666940  3314.6809 1.0012676
sigma_interceptsbf  5.4647249 0.029352576 0.9617572   3.82916218  4.7945790  5.3707028  6.0583472  7.5584786  1073.5899 1.0025706
lam_interceptsbf    0.6543217 0.011677593 0.2201085   0.17487119  0.5026808  0.6822129  0.8320320  0.9795620   355.2769 1.0204661
b_zc               -5.7936568 0.020827319 1.8875649  -9.44856608 -7.0956537 -5.7989241 -4.5491309 -2.0338019  8213.6662 0.9998402
sigma_interceptsbc  6.8329285 0.014857901 0.7555807   5.48564498  6.3036900  6.7917145  7.3066734  8.4475695  2586.1087 1.0010086
lam_interceptsbc    0.5209706 0.002556019 0.1314700   0.25872789  0.4289137  0.5226537  0.6156607  0.7662715  2645.6076 1.0007705
b_zp               -1.2931102 0.015455992 0.7885837  -2.80881031 -1.7989506 -1.3089317 -0.8090140  0.3702451  2603.1656 1.0018259
sigma_interceptsbp  2.3538291 0.015796493 0.3980147   1.64228491  2.0794005  2.3314058  2.5988845  3.1873439   634.8584 1.0065572
lam_interceptsbp    0.4553807 0.011050049 0.2412492   0.03742566  0.2615208  0.4525614  0.6413418  0.8995988   476.6540 1.0049442
sigma_y            12.0875302 0.001531633 0.1680546  11.75761995 11.9731525 12.0868308 12.1998344 12.4241307 12039.0239 1.0001978

And the full output of what I did:

R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[R.app GUI 1.78 (8075) aarch64-apple-darwin20]

> ## Started mid November 2022 ##
> ## From files started September 2021 (that copied Nacho's Phylo_ospree_reanalyses.R code)
> ## By Nacho, with some edits by Lizzie ##
> 
> ## Runs (or reads) the phylogeny models, extracts some output
> ## Does some basic plotting
> 
> rm(list=ls())
> options(stringsAsFactors = FALSE)
> 
> # Setting working directory. Add in your own path in an if statement for your file structure
> if(length(grep("Lizzie", getwd())>0)) { 
+   setwd("~/Documents/git/projects/treegarden/budreview/ospree/analyses/phylogeny") 
+ } else if (length(grep("ailene", getwd()))>0) {setwd("/Users/aileneettinger/git/ospree/analyses/phylogeny")
+ }else if(length(grep("Ignacio", getwd()))>0) { 
+   setwd("~/GitHub/ospree/analyses/phylogeny") 
+ } else if(length(grep("catchamberlain", getwd()))>0) { 
+   setwd("~/Documents/git/ospree/analyses/phylogeny") 
+ } else if(length(grep("danielbuonaiuto", getwd()))>0) { 
+   setwd("~/Documents/git/ospree/analyses/phylogeny") 
+ }else setwd("~/Documents/git/projects/treegarden/budreview/ospree/analyses/phylogeny")
> 
> 
> # Loading packages
> library(caper)
Loading required package: ape
Loading required package: MASS
Loading required package: mvtnorm
> library(pez)
> library(phytools)
Loading required package: maps
> library(rstan)
Loading required package: StanHeaders
Loading required package: ggplot2
rstan (Version 2.21.5, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)
> library(shinystan)
Loading required package: shiny

This is shinystan version 2.6.0

> library(plyr)

Attaching package: ‘plyr’

The following object is masked from ‘package:maps’:

    ozone

> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:plyr’:

    arrange, count, desc, failwith, id, mutate, rename, summarise, summarize

The following object is masked from ‘package:MASS’:

    select

The following object is masked from ‘package:ape’:

    where

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> 
> options(mc.cores = parallel::detectCores())
> 
> #'###############################
> # Flags for how to run the code #
> #'###############################
> runmodels <- FALSE
> runbbstanleadin <- T # leave as false to speed up Supp and ms. compilation
> runwithchillports <- T
> 
> #'######################################
> #### get data through bbstanleadin ####
> #'######################################
> 
> if(runbbstanleadin){
+ # Flags to choose for bbstanleadin.R #
+ setwd("..//bb_analysis") 
+ getwd()
+ # Master flags! Here you pick if you want the flags for the main model (figure in main text) versus the all spp model (supp)
+ use.flags.for.mainmodel <- FALSE
+ use.flags.for.allsppmodel <- TRUE
+ use.yourown.flagdesign <- FALSE
+ nocrops <- TRUE
+ agiosponly <- TRUE
+ 
+ if(use.flags.for.mainmodel==TRUE & use.flags.for.allsppmodel | use.flags.for.mainmodel==TRUE & use.yourown.flagdesign |
+    use.yourown.flagdesign  & use.flags.for.allsppmodel | use.flags.for.mainmodel==TRUE & use.flags.for.allsppmodel
+    & use.yourown.flagdesign) print("ALERT! You have set too many master flags to true, you must pick only one!")
+ 
+ if(use.flags.for.mainmodel){
+   use.chillports = FALSE
+   use.zscore = TRUE
+   use.allspp =FALSE # for the main model this is false
+   use.multcuespp = FALSE
+   use.cropspp = FALSE
+   # Default is species complex use  alltypes of designs
+   use.expramptypes.fp = TRUE
+   use.exptypes.fp = FALSE
+   use.expchillonly = FALSE
+ }
+ 
+ if(use.flags.for.allsppmodel){
+   use.chillports = FALSE
+   use.zscore = TRUE
+   use.allspp = TRUE
+   use.multcuespp = FALSE
+   use.cropspp = TRUE
+   use.expramptypes.fp = FALSE
+   use.exptypes.fp = FALSE
+   use.expchillonly = FALSE
+ }
+ 
+ if(use.yourown.flagdesign){
+   use.chillports = F # change to false for using utah instead of chill portions (most models use chill portions z)
+   use.zscore = TRUE # change to false to use raw predictors
+   
+   # Default is species complex and no crops
+   use.allspp = F
+   use.multcuespp = FALSE
+   use.cropspp = FALSE
+   
+   # Default is species complex use  alltypes of designs
+   use.expramptypes.fp = TRUE
+   use.exptypes.fp = FALSE
+   
+   #Default is all chilling data
+   use.expchillonly = FALSE # change to true for only experimental chilling 
+   #note: with only exp chilling, there is only exp photo and force too.
+   #also: subsetting to exp chill only reduces dataset to 3 species, <9 studies
+ }
+ 
+ source("..//bb_analysis/source/bbstanleadin.R")
+ 
+ namesdat <- unique(paste(bb.stan$genus,bb.stan$species,sep="_"))
+ bb.stan$spps <- paste(bb.stan$genus,bb.stan$species,sep="_")
+ bb.stan$phylo <- paste(bb.stan$genus,bb.stan$species,sep="_")
+ 
+ 
+ #'###################################
+ #### get phylogeny              ####
+ #'###################################
+ 
+ setwd("..//phylogeny") 
+ source("source/get_phylo_models.R")
+ 
+ ## read and pre-process phylogeny
+ #phylo <- read.tree("../../data/phylogeny/SBphylo_62complex.tre")
+ #phylo <- read.tree("../../data/phylogeny/SBphylo_101sps.tre")
+ phylo <- phy.plants.ospree
+ 
+ 
+ namesphy <- phylo$tip.label
+ phylo <- force.ultrametric(phylo, method="extend")
+ phylo$node.label <- seq(1,length(phylo$node.label),1)
+ is.ultrametric(phylo)
+ #plot(phylo, cex=0.7)
+ VCVPHY <- vcv.phylo(phylo,corr=TRUE)
+ 
+ 
+ 
+ ## deal with subgrouping
+ 
+ if(nocrops & agiosponly){
+   gymno <- c("Metasequoia_glyptostroboides",  "Pseudotsuga_menziesii","Larix_laricina",
+              "Larix_gmelinii", "Larix_decidua" ,"Larix_kaempferi",   
+              "Pinus_nigra","Pinus_sylvestris","Pinus_banksiana",  
+              "Pinus_contorta","Pinus_wallichiana","Pinus_strobus", 
+              "Picea_abies"   ,"Picea_mariana" ,"Picea_glauca" ,
+              "Cedrus_libani" ,"Abies_alba"    ,"Abies_homolepis","Ginkgo_biloba")
+   croplist <- read.csv("../../data/croplist/agricultural_species.csv")
+   cropgymno <- c(croplist$Species_name,gymno)
+   bb.stan$crops <- ifelse(bb.stan$spps %in% cropgymno, "cropgymno","nocrop")
+   cropspps <- unique(bb.stan$spps[which(bb.stan$crops=="cropgymno")])
+   bb.stan <- subset(bb.stan, crops == "nocrop")
+   phylo <- drop.tip(phylo, cropspps)
+   VCVPHY<-vcv.phylo(phylo,corr=T)
+ } 
+ 
+ if(nocrops & !agiosponly){
+   croplist <- read.csv("../../data/croplist/agricultural_species.csv")
+   bb.stan$crops <- ifelse(bb.stan$spps %in% croplist$Species_name, "crop","nocrop")
+   cropspps <- unique(bb.stan$spps[which(bb.stan$crops=="crop")])
+   bb.stan <- subset(bb.stan, crops == "nocrop")
+   phylo <- drop.tip(phylo, cropspps)
+   VCVPHY<-vcv.phylo(phylo,corr=T)
+ } 
+ 
+ 
+ if(!nocrops & agiosponly){
+   gymno <- c("Metasequoia_glyptostroboides",  "Pseudotsuga_menziesii","Larix_laricina",
+              "Larix_gmelinii", "Larix_decidua" ,"Larix_kaempferi",   
+              "Pinus_nigra","Pinus_sylvestris","Pinus_banksiana",  
+              "Pinus_contorta","Pinus_wallichiana","Pinus_strobus", 
+              "Picea_abies"   ,"Picea_mariana" ,"Picea_glauca" ,
+              "Cedrus_libani" ,"Abies_alba"    ,"Abies_homolepis","Ginkgo_biloba")
+   cropgymno <- c(gymno)
+   bb.stan$crops <- ifelse(bb.stan$spps %in% cropgymno, "cropgymno","nocrop")
+   cropspps <- unique(bb.stan$spps[which(bb.stan$crops=="cropgymno")])
+   bb.stan <- subset(bb.stan, crops == "nocrop")
+   phylo <- drop.tip(phylo, cropspps)
+   VCVPHY<-vcv.phylo(phylo,corr=T)
+ } 
+ 
+ 
+ # Get spps and VCVPHY in same order
+ # bb.stan$spps[phylo$tip.label]
+ phylo$tip.label
+ d <- bb.stan[match(phylo$tip.label, bb.stan$spps),] # hmmm, only gives ONE match
+ 
+ phymatch <- data.frame(tip=phylo$tip.label, sppnum=c(1:length(phylo$tip.label)))
+ d <- merge(bb.stan, phymatch, by.x="spps", by.y="tip")
+ d <- d[order(d$sppnum),]
+ # Tilia_cordata versus Tilia_Cordata in phylo
+ nspecies <- max(d$sppnum)
+ 
+ ## remove outliers
+ # d$resp
+ head(d)
+ ff = subset(d,latbi %in% c("Populus_balsamifera","Populus_tremuloides"))
+ d = subset(d,!latbi %in% c("Populus_balsamifera","Populus_tremuloides"))
+ nspecies = 192
+ phylo <- drop.tip(phylo, c("Populus_balsamifera","Populus_tremuloides"))
+ d$sppnum <- as.numeric(as.factor(d$sppnum))
+ 
+ 
+ ## remove names of species that are wrong (e.g. Acer pseudolatanus) Malyshev2018
+ idswrong = which(d$spps == "Acer_pseudolatauns")
+ d$spps[idswrong] = "Acer_pseudoplatanus"
+ d$species[idswrong] = "pseudoplatanus"
+ d$latbi[idswrong] = "Acer_pseudoplatanus"
+ d$phylo[idswrong] = "Acer_pseudoplatanus"
+ 
+ #d$sppnum[which(d$latbi=="Acer_pseudoplatanus")]
+ d$sppnum[idswrong] = 127
+ d$sppnum[which(d$sppnum>137)] = d$sppnum[which(d$sppnum>137)]-1
+ 
+ nspecies = 191
+ phylo <- drop.tip(phylo, "Acer_pseudolatauns")
+ 
+ 
+ 
+ ## remove names of species that are wrong (e.g. Juglans spp) 
+ idswrong = which(d$spps == "Juglans_spp")
+ d <- d[-idswrong,]
+ phylo <- drop.tip(phylo, "Juglans_spp")
+ 
+ nspecies <- length(phylo$tip.label)
+ phymatch2 <- data.frame(tip=phylo$tip.label, sppnum=c(1:length(phylo$tip.label)))
+ d2 <- merge(d, phymatch2, by.x="spps", by.y="tip")
+ d2 <- d2[order(d2$sppnum.y),]
+ d2$sppnum <- d2$sppnum.y
+ d <- d2
+ 
+ # save for faster loading-compiling
+ #write.csv(d,file = "input/datasetforphyloms.csv")
+ #write.tree(phylo, file = "input/phyloforphyloms.tre")
+ 
+ } else {
+   
+   d = read.csv("input/datasetforphyloms.csv")
+   phylo = read.tree("input/phyloforphyloms.tre")
+   
+ }
Joining with `by = join_by(genus, species)`
Joining with `by = join_by(genus, species)`
Joining with `by = join_by(genus, species)`
Joining with `by = join_by(genus, species)`
List of 7
 $ y    : num [1:4442] 56.4 52.2 59.6 49.2 61.5 ...
 $ chill: num [1:4442] 0.05553 0.00235 0.05553 0.00235 0.05553 ...
 $ force: num [1:4442] -1.8 -1.8 -1.8 -1.8 -1.8 ...
 $ photo: num [1:4442] -0.725 -0.438 -0.725 -0.438 -0.725 ...
 $ sp   : num [1:4442] 1 1 1 1 1 1 1 1 1 1 ...
 $ N    : int 4442
 $ n_sp : int 233
[1] "Unique forcing types are ..."
[1] "ramped" "exp"    "amb"   
[1] "Unique photo types are ..."
[1] "ramped" "exp"    "amb"    "none"  
[1] "Unique chilling types are ..."
[1] "fldest"  "bothest" "bothrep" "exp"    
[1] "Range of chilling is ..."
[1] -3.274617  4.558866

Number of species in tree before: 0
Number of species in tree now:    24 
***************************************************************
*                          Note:                              *
*    force.ultrametric does not include a formal method to    *
*    ultrametricize a tree & should only be used to coerce    *
*   a phylogeny that fails is.ultramtric due to rounding --   *
*    not as a substitute for formal rate-smoothing methods.   *
***************************************************************
Warning messages:
1: In eval(ei, envir) : NAs introduced by coercion
2: In eval(ei, envir) : NAs introduced by coercion
3: In eval(ei, envir) : NAs introduced by coercion
4: In eval(ei, envir) : NAs introduced by coercion
> 
> if(runwithchillports){
+   
+   d$chill.z = as.numeric(scale(d$chill.ports))
+   
+ }
> 
> par(mfrow=c(1,2))
> plot(d$chill.z~d$chill)
> plot(d$chill.z~d$chill.ports)
> fit <- stan("stan/uber_threeslopeintercept_modified_cholesky_updatedpriors.stan",
+                data=list(N=nrow(d),
+                          n_sp=nspecies,
+                          sp=d$sppnum,
+                          x1=d$force.z,
+                          x2 = d$chill.z,
+                          x3=d$photo.z,
+                          y=d$resp,
+                          Vphy=vcv(phylo, corr = TRUE)),
+                iter = 4000,
+                warmup = 2000, # half the iter as warmp is default, but leaving in case we want to change
+                chains = 4
+ )
Warning message:
Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess 
> 
> ## Save fitted posterior
> saveRDS(fit, "output/fit_priorupdate_noout_angio191_chillports.rds")
> fitsum <- summary(fit, pars = list("a_z", "sigma_interceptsa", 
+                             "b_zf", "sigma_interceptsbf", "lam_interceptsbf", 
+                             "b_zc", "sigma_interceptsbc", "lam_interceptsbc",
+                             "b_zp", "sigma_interceptsbp", "lam_interceptsbp","sigma_y"))$summary
> 
> fitsumdf <- as.data.frame(fitsum)
MoralesCastilla commented 9 months ago

@lizzieinvancouver

Ok, I re-run the models and get the same (or very similar) results to yours:

> tableresults.estv2[c(1,4,7,10,2,5,8,11,3,6,9,12,13),c(1,3,4,6,8:10)]
                         mean        sd         2.5%        50%      97.5%      n_eff      Rhat
a_z                30.0672837 3.4646087  23.25746435 30.0850200 37.0500217 13454.5449 0.9998030
b_zf               -6.2066952 2.0263071 -10.13908061 -6.2485597 -2.0717981  3322.8284 1.0005258
b_zc               -5.8059195 1.9805480  -9.58334426 -5.8553986 -1.7650128  8346.1465 0.9997170
b_zp               -1.3297675 0.7704224  -2.81399217 -1.3571314  0.2365078  2598.4996 1.0014778
lam_interceptsa     0.3549471 0.0987703   0.17178234  0.3522774  0.5569254  7138.1060 0.9996513
lam_interceptsbf    0.6563615 0.2077804   0.20233388  0.6837178  0.9698824   428.3387 1.0042654
lam_interceptsbc    0.5215675 0.1313591   0.26036186  0.5236987  0.7708913  2850.9089 1.0001434
lam_interceptsbp    0.4569472 0.2460340   0.03709492  0.4469028  0.9270096   335.0013 1.0256626
sigma_interceptsa  16.0030639 1.1419172  13.95740950 15.9348759 18.4287788  6926.4931 1.0000161
sigma_interceptsbf  5.4431744 0.9797794   3.77660280  5.3509133  7.5662671   919.2705 1.0031442
sigma_interceptsbc  6.8883763 0.7452105   5.55930979  6.8486656  8.4749448  2579.7925 1.0006397
sigma_interceptsbp  2.3568704 0.3958031   1.63562733  2.3339438  3.2142969   754.2541 1.0076225
sigma_y            12.0838707 0.1699815  11.76076787 12.0804787 12.4230406  9826.0790 0.9996198

The reason you could not replicate the results I sent (and talked about) yesterday is that I was not reordering the dataset after removing Juglans_spp so, whatever the species after that in the dataset occupied its place. Anyways, it seems solved now. And these results are quite similar to results with Utah (except for sensitivity to chilling going slightly down, which could be due to chill portions loosing some data with respect to Utah).

lizzieinvancouver commented 9 months ago

@MoralesCastilla Great! This has been a useful process, which got J and I thinking more about lambda and the model in general, but I am relieved the model is fairly stable to different chill units.