bomeara / BMhyb

0 stars 1 forks source link

Several hybridization events #15

Closed pbastide closed 6 years ago

pbastide commented 6 years ago

Variance between hybrid descendants

Hi again, @bomeara and @djhwueng

This might be related to #13 and #14.

I tried a network with several hybridization events:

gamma1 <- 0.5; gamma2 <- 0.5;
## Underlying tree
t1 <- 0.2; t2 <- 0.2; t3 <- 0.2; t4 <- 0.2; t5 <- 0.2;
phy <- read.tree(text = paste0("(((R:",t4+t5,",Y:",t4+t5,"):",t3,",X:",t3+t4+t5,"):",t1+t2,",Z:",t1+t2+t3+t4+t5,");"))
plot(phy)
## Network
don_recp <- rbind(expand.grid(c("Z"), c("Y", "R", "X")), 
                  expand.grid(c("X"), c("R")))
network <- list(phy = phy,
                flow = data.frame(donor = don_recp[,1],
                                  recipient = don_recp[,2],
                                  gamma = c(rep(gamma1, 3), gamma2),
                                  time.from.root.donor = c(rep(t1, 3), t1+t2+t3+t4),
                                  time.from.root.recipient = c(rep(t1, 3), t1+t2+t3+t4)))
network$flow$donor <- as.character(network$flow$donor)
network$flow$recipient <- as.character(network$flow$recipient)
## Plot
PlotNetwork(network$phy, network$flow)
axis(1, at = c(0, t1, t1+t2, t1+t2+t3, t1+t2+t3+t4, t1+t2+t3+t4+t5), 
     labels = c("0", "t1", "t1+t2", "t1+t2+t3", "t1+t2+t3+t4", "t1+t2+t3+t4+t5"))

This gives the folowing variance matrix:

> sigma2 = 1
> x <- c(sigma.sq = sigma2, mu = 0, SE = 0)
> actual.params <- c("sigma.sq", "mu", "bt", "vh", "SE")

> GetVModified(x, network$phy, network$flow, actual.params)
   R   Y   X   Z
R 0.8 0.6 0.6 0.1
Y 0.6 0.9 0.4 0.1
X 0.6 0.4 0.9 0.1
Z 0.1 0.1 0.1 1.0

I think that the variance of R is not coherent with the model of trait evolution. If my computations are correct, we should have:

Var[R] = (gamma2^2 + (1-gamma2)^2)*((gamma1^2 + (1-gamma1)^2)*t1+t2+t3+t4) 
          + 2*gamma2*(1-gamma2)*((gamma1^2 + (1-gamma1)^2)*t1+t2) + t5     = 0.7 \neq 0.8

(Note that the covariances between R and Y and X might also have problems, see #14).

Browsing through the code, this might be linked with the fact that a new hybridization "erases" an older one in your algorithm. Indeed, all the computations are made using V.original, that do not take ancestral hybrids into account. Here, if there were only one hybridization (the second one), then we would have:

Var[R] = (gamma2^2 + (1-gamma2)^2)*(t1+t2+t3+t4) + 2*gamma2*(1-gamma2)*(t1+t2) + t5 = 0.8

which is the result given by GetVModified.

I think this is a seperate problem from the two other ones, hence the new issue. Again, I'm sorry if I mis-used your functions or made mistakes, please correct me if I did.

Thanks !

Session infos:

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
[1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C               LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
[5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8    LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
[9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BMhyb_1.5.1 ape_4.1    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12            subplex_1.4-1           msm_1.6.4               mvtnorm_1.0-6          
 [5] lattice_0.20-35         tidyr_0.7.1             corpcor_1.6.9           prettyunits_1.0.2      
 [9] assertthat_0.2.0        digest_0.6.12           foreach_1.4.3           R6_2.2.2               
[13] plyr_1.8.4              phytools_0.6-20         coda_0.19-1             httr_1.3.1             
[17] ggplot2_2.2.1           progress_1.1.2          rlang_0.1.2.9000        uuid_0.1-2             
[21] lazyeval_0.2.0          curl_2.8.1              data.table_1.10.4       taxize_0.9.0           
[25] phangorn_2.2.0          Matrix_1.2-11           RNeXML_2.0.7            combinat_0.0-8         
[29] splines_3.4.2           stringr_1.2.0           igraph_1.1.2            munsell_0.4.3          
[33] compiler_3.4.2          numDeriv_2016.8-1       geiger_2.0.6            pkgconfig_2.0.1        
[37] mnormt_1.5-5            tibble_1.3.4            gridExtra_2.2.1         TreeSim_2.3            
[41] expm_0.999-2            quadprog_1.5-5          codetools_0.2-15        XML_3.98-1.9           
[45] reshape_0.8.7           viridisLite_0.2.0       dplyr_0.7.2             MASS_7.3-47            
[49] crul_0.3.8              grid_3.4.2              nlme_3.1-131            jsonlite_1.5           
[53] gtable_0.2.0            magrittr_1.5            scales_0.5.0            stringi_1.1.5          
[57] reshape2_1.4.2          viridis_0.4.0           bindrcpp_0.2            scatterplot3d_0.3-40   
[61] phylobase_0.8.4         xml2_1.1.1              fastmatch_1.1-0         deSolve_1.20           
[65] iterators_1.0.8         tools_3.4.2             rncl_0.8.2              ade4_1.7-8             
[69] bold_0.5.0              glue_1.1.1              purrr_0.2.3             maps_3.2.0             
[73] plotrix_3.6-6           parallel_3.4.2          survival_2.41-3         colorspace_1.3-2       
[77] bindr_0.1               animation_2.5           clusterGeneration_1.3.4
bomeara commented 6 years ago

We've decided to make it so that every lineage may have no more than one hybridization event in its past. We've added checks to verify that the user's flow diagram doesn't violate this.

bomeara commented 6 years ago

Ok, new code now passes this check