paternogbc / sensiPhy

R package to perform sensitivity analysis for comparative methods
http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12990/full
GNU General Public License v2.0
12 stars 4 forks source link

intra_clade_phylm #157

Closed paternogbc closed 7 years ago

paternogbc commented 7 years ago

Hey @caterinap ,

I am implementing the plots for intra_clade_xxx but I think there is something wierd with slope estimates.

Because when running the simple intra_phylm () with the same data and formula, the estimates values are very different. You can see that with summary. Any ideas? (see below)

#load data
library(sensiPhy)
#> Loading required package: ape
#> Loading required package: phylolm
#> Loading required package: ggplot2
data(alien)
intra_clade <- intra_clade_phylm(gestaLen ~ adultMass, phy = alien$phy[[1]], data = alien$data, 
                                 clade.col = "family", n.sim = 5, track = F,
                                 n.intra = 30, y.transf = log, Vy="SD_gesta")
#> Warning in intra_clade_phylm(gestaLen ~ adultMass, phy = alien$phy[[1]], :
#> distrib=normal: make sure that standard deviation is provided for Vx and/or
#> Vy
#> Warning in match_dataphy(formula, data, phy): NA's in response or
#> predictor, rows with NA's were removed
#> Warning in match_dataphy(formula, data, phy): Some phylo tips do not match
#> species in data (this can be due to NA removal) species were dropped from
#> phylogeny or data
#> Used dataset has  84  species that match data and phylogeny

summary(intra_clade)
#> $Estimate
#>   clade.removed N.species     estimate   DIFestimate Change (%)
#> 2      Cervidae        12 0.0002470599  3.602129e-05       17.1
#> 1       Bovidae        15 0.0002421197  3.108100e-05       14.7
#> 4    Mustelidae         6 0.0002108095 -2.291126e-07        0.1
#> 3  Macropodidae         7 0.0002110699  3.123265e-08        0.0
#>           Pval m.null.estimate Pval.randomization Non random (%)
#> 2 3.491288e-12    0.0002118292         0.03333333           83.3
#> 1 3.367617e-07    0.0002157617         0.10666667           56.7
#> 4 1.168190e-11    0.0002121003         0.41333333            3.3
#> 3 2.462271e-11    0.0002098962         0.46666667            0.0
#> 
#> $Intercept
#>   clade.removed N.species intercept DIFintercept Change (%)         Pval
#> 2      Cervidae        12  52.52486  -0.97205199        1.8 3.491288e-12
#> 1       Bovidae        15  53.21127  -0.28564430        0.5 3.367617e-07
#> 4    Mustelidae         6  53.68599   0.18907236        0.4 1.168190e-11
#> 3  Macropodidae         7  53.43314  -0.06377231        0.1 2.462271e-11
#>   m.null.intercept Pval.randomization Non random (%)
#> 2         54.15938          0.2400000           16.7
#> 1         54.22764          0.3466667           10.0
#> 4         53.63479          0.5400000            3.3
#> 3         53.78446          0.3533333           10.0

Single intra_phylm()

library(sensiPhy)
#> Loading required package: ape
#> Loading required package: phylolm
#> Loading required package: ggplot2
data(alien)
# run PGLS accounting for intraspecific variation:
intra <- intra_phylm(gestaLen ~ adultMass, y.transf = log, x.transf = log, 
                     phy = alien$phy[[1]], data = alien$data, Vy = "SD_gesta", n.intra = 30)
#> Warning in intra_phylm(gestaLen ~ adultMass, y.transf = log, x.transf = log, : distrib=normal: make sure that standard deviation 
#>                                  is provided for Vx and/or Vy
#> Warning in match_dataphy(formula, data, phy, ...): NA's in response or
#> predictor, rows with NA's were removed
#> Warning in match_dataphy(formula, data, phy, ...): Some phylo tips do not
#> match species in data (this can be due to NA removal) species were dropped
#> from phylogeny or data
#> Used dataset has  84  species that match data and phylogeny
#> ===========================================================================
#> Warning in intra_phylm(gestaLen ~ adultMass, y.transf = log, x.transf = log, : in 1 simulations, data transformations generated NAs, please consider using another function
#>   for x.transf or y.transf and check output$sp.pb

# To check summary results:
summary(intra)
#>                 mean CI_low CI_high
#> intercept      2.254  2.222   2.286
#> se.intercept   0.368  0.360   0.376
#> pval.intercept 0.000  0.000   0.000
#> estimate       0.162  0.158   0.166
#> se.estimate    0.024  0.023   0.024
#> pval.estimate  0.000  0.000   0.000
paternogbc commented 7 years ago

Found the problem in this bit of code:

 intra.clade[[j]] <- clade_phylm(formula, data=full.data, phy=phy, model=model,
                                    clade.col=clade.col, n.species=n.species, n.sim=n.sim, 
                                    track = FALSE, verbose = FALSE, ...)

formula should be re-set to respV ~ predV otherwise it would just run the model with the raw data.

Fix:

 intra.clade[[j]] <- clade_phylm(formula = respV ~ predV, data=full.data, phy=phy, model=model,
                                    clade.col=clade.col, n.species=n.species, n.sim=n.sim, 
                                    track = FALSE, verbose = FALSE, ...)

However, I would appreciate if you could double check this issue for me =)

caterinap commented 7 years ago

Wow!!! Well spotted! Sorry about that. Your solution is correct. I checked all the other intra functions and this is the only one with this problem.

paternogbc commented 7 years ago

Perfect @caterinap ! Thanks.