Closed pooranis closed 3 years ago
I noticed a discrepancy in the results of divnet when I use a phyloseq object vs just a plain matrix.
divnet
Using the data from your vignette:
library(phyloseq) library(breakaway) library(DivNet) library(magrittr) data(Lee) lee_phylum <- Lee %>% tax_glom(taxrank = "Phylum") %>% subset_samples(temp != "<NA>")
So, lee_phylum is a phyloseq object. If we use that as input to divnet with X = NULL (default):
X = NULL
divnet_phyloseq <- divnet(W= lee_phylum, base = 1) summary(divnet_phyloseq$shannon)
# A tibble: 14 x 7 estimate error lower upper sample_names name model <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> 1 1.82 0.0177 1.79 1.86 BW1 DivNet Aitchison 2 1.08 0.0328 1.02 1.15 BW2 DivNet Aitchison 3 1.12 0.0235 1.07 1.16 R10 DivNet Aitchison 4 1.01 0.0145 0.978 1.04 R11 DivNet Aitchison 5 0.435 0.0233 0.389 0.482 R11BF DivNet Aitchison 6 1.41 0.0171 1.38 1.44 R1A DivNet Aitchison 7 1.38 0.0142 1.35 1.41 R1B DivNet Aitchison 8 1.29 0.00983 1.27 1.31 R2 DivNet Aitchison 9 1.29 0.00654 1.27 1.30 R3 DivNet Aitchison 10 1.37 0.0124 1.34 1.39 R4 DivNet Aitchison 11 1.32 0.00981 1.31 1.34 R5 DivNet Aitchison 12 1.43 0.00805 1.41 1.45 R6 DivNet Aitchison 13 1.43 0.0152 1.40 1.46 R8 DivNet Aitchison 14 1.11 0.0155 1.08 1.14 R9 DivNet Aitchison
However, if we use the same otu table as a matrix:
mat <- t(otu_table(lee_phylum)@.Data) divnet_matrix <- divnet(W=mat, base = 1) summary(divnet_matrix$shannon)
# A tibble: 14 x 7 estimate error lower upper sample_names name model <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> 1 1.07 0.0761 0.913 1.22 BW1 DivNet Aitchison 2 1.07 0.0761 0.913 1.22 BW2 DivNet Aitchison 3 1.07 0.0761 0.913 1.22 R10 DivNet Aitchison 4 1.07 0.0761 0.913 1.22 R11 DivNet Aitchison 5 1.07 0.0761 0.913 1.22 R11BF DivNet Aitchison 6 1.07 0.0761 0.913 1.22 R1A DivNet Aitchison 7 1.07 0.0761 0.913 1.22 R1B DivNet Aitchison 8 1.07 0.0761 0.913 1.22 R2 DivNet Aitchison 9 1.07 0.0761 0.913 1.22 R3 DivNet Aitchison 10 1.07 0.0761 0.913 1.22 R4 DivNet Aitchison 11 1.07 0.0761 0.913 1.22 R5 DivNet Aitchison 12 1.07 0.0761 0.913 1.22 R6 DivNet Aitchison 13 1.07 0.0761 0.913 1.22 R8 DivNet Aitchison 14 1.07 0.0761 0.913 1.22 R9 DivNet Aitchison
^ This result appears to be incorrect, as all the estimates are the same (?). I traced the discrepancy back to how X is specified differently for a phyloseq object: https://github.com/adw96/DivNet/blob/master/R/divnet_main.R#L72 and how it is specified for a generic matrix: https://github.com/adw96/DivNet/blob/master/R/divnet_main.R#L91
X
If I use the phyloseq specification for X with the matrix, then I get the same result as with a phyloseq object:
xx <- row.names(mat) X <- model.matrix(~xx) divnet_matrix_X <- divnet(W=mat, X = X, base = 1) summary(divnet_matrix_X$shannon)
# A tibble: 14 x 7 estimate error lower upper sample_names name model <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> 1 1.82 0.0170 1.79 1.86 BW1 DivNet Aitchison 2 1.08 0.0466 0.988 1.17 BW2 DivNet Aitchison 3 1.12 0.00693 1.10 1.13 R10 DivNet Aitchison 4 1.01 0.0152 0.977 1.04 R11 DivNet Aitchison 5 0.435 0.00909 0.417 0.453 R11BF DivNet Aitchison 6 1.41 0.0197 1.37 1.45 R1A DivNet Aitchison 7 1.38 0.0147 1.35 1.41 R1B DivNet Aitchison 8 1.29 0.00968 1.27 1.31 R2 DivNet Aitchison 9 1.29 0.00927 1.27 1.31 R3 DivNet Aitchison 10 1.37 0.00703 1.36 1.38 R4 DivNet Aitchison 11 1.32 0.0103 1.30 1.35 R5 DivNet Aitchison 12 1.43 0.0205 1.39 1.47 R6 DivNet Aitchison 13 1.43 0.0129 1.40 1.45 R8 DivNet Aitchison 14 1.11 0.0202 1.07 1.15 R9 DivNet Aitchison
I assume, in practice, we would want to use the latter, and the current implementation is just a bug?
Thanks very much for your very useful research and R packages.
Hi @pooranis,
Thank you for pointing out this issue and the detailed bug report. This has been fixed in 1dbc3c2!
Bryan
I noticed a discrepancy in the results of
divnet
when I use a phyloseq object vs just a plain matrix.Using the data from your vignette:
So, lee_phylum is a phyloseq object. If we use that as input to
divnet
withX = NULL
(default):However, if we use the same otu table as a matrix:
^ This result appears to be incorrect, as all the estimates are the same (?). I traced the discrepancy back to how
X
is specified differently for a phyloseq object: https://github.com/adw96/DivNet/blob/master/R/divnet_main.R#L72 and how it is specified for a generic matrix: https://github.com/adw96/DivNet/blob/master/R/divnet_main.R#L91If I use the phyloseq specification for
X
with the matrix, then I get the same result as with a phyloseq object:I assume, in practice, we would want to use the latter, and the current implementation is just a bug?
Thanks very much for your very useful research and R packages.