FrederickHuangLin / ANCOMBC

Differential abundance (DA) and correlation analyses for microbial absolute abundance data
https://www.nature.com/articles/s41467-020-17041-7
107 stars 29 forks source link

Bug in data_core function #110

Closed TuomasBorman closed 2 years ago

TuomasBorman commented 2 years ago

https://github.com/FrederickHuangLin/ANCOMBC/blob/dbf4fb5c1434b551b087644836596a05f9660780/R/ancombc_prep.R#L98

Hello!

Here is a bug. altExp should be specified by altExp(tse, "something"). This line takes the first altExp available -->if tse already contains altExp, it gets the wrong one --> this usually leads to an error in line 115 (tax_keep do not match with taxa in altExp.)

This is fixed by adding parameter that specifies the altExp in line 98.

library(ANCOMBC)
data(atlas1006)
# subset to baseline
tse = atlas1006[, atlas1006$time == 0]

# Add subset to altExp
altExp(tse, "sub") <- tse[1:2, ]
# run ancombc function
set.seed(123)
out = ancombc2(data = tse, assay_name = "counts", tax_level = "Family",
              fix_formula = "age + nationality + bmi_group",
)

>
Error in feature_table[tax_keep, , drop = FALSE] : 
  subscript out of bounds
> 
Session info R version 4.2.1 (2022-06-23) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.04.4 LTS Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0 locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=fi_FI.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=fi_FI.UTF-8 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=fi_FI.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=fi_FI.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] grid stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] ANCOMBC_1.99.4 rmarkdown_2.17 viridis_0.6.2 viridisLite_0.4.1 [5] vegan_2.6-4 permute_0.9-7 scales_1.2.1 plotROC_2.3.0 [9] patchwork_1.1.2 ggrepel_0.9.1 ggplotify_0.1.0 ggord_1.1.7 [13] DT_0.26 dendextend_1.16.0 cutpointr_1.1.2 ComplexHeatmap_2.13.4 [17] circlize_0.4.15 caret_6.0-93 lattice_0.20-45 stringr_1.4.1 [21] sas7bdat_0.6 rlang_1.0.6 reshape2_1.4.4 readxl_1.4.1 [25] pheatmap_1.0.12 lubridate_1.8.0 knitr_1.40 gridExtra_2.3 [29] ggsignif_0.6.4 ggdendro_0.1.23 ggbeeswarm_0.6.0 dplyr_1.0.10 [33] BiocManager_1.30.19 scater_1.25.7 scuttle_1.7.4 microbiome_1.19.1 [37] phyloseq_1.41.0 bioDist_1.69.0 KernSmooth_2.23-20 miaViz_1.5.4 [41] ggraph_2.1.0 ggplot2_3.3.6 mia_1.5.17 TreeSummarizedExperiment_2.5.0 [45] Biostrings_2.65.6 XVector_0.37.1 SingleCellExperiment_1.19.1 MultiAssayExperiment_1.23.11 [49] SummarizedExperiment_1.27.3 Biobase_2.57.3 GenomicRanges_1.49.1 GenomeInfoDb_1.33.16 [53] IRanges_2.31.2 MatrixGenerics_1.9.1 matrixStats_0.62.0 S4Vectors_0.35.4 [57] BiocGenerics_0.43.4 loaded via a namespace (and not attached): [1] estimability_1.4.1 ModelMetrics_1.2.2.2 coda_0.19-4 tidyr_1.2.1 [5] bit64_4.0.5 multcomp_1.4-20 irlba_2.3.5.1 DelayedArray_0.23.2 [9] data.table_1.14.4 rpart_4.1.19 hardhat_1.2.0 RCurl_1.98-1.9 [13] doParallel_1.0.17 generics_0.1.3 ScaledMatrix_1.5.1 TH.data_1.1-1 [17] cowplot_1.1.1 RSQLite_2.2.18 proxy_0.4-27 future_1.28.0 [21] bit_4.0.4 assertthat_0.2.1 DirichletMultinomial_1.39.0 gower_1.0.0 [25] xfun_0.34 jquerylib_0.1.4 evaluate_0.17 fansi_1.0.3 [29] igraph_1.3.5 DBI_1.1.3 htmlwidgets_1.5.4 Rmpfr_0.8-9 [33] CVXR_1.0-11 purrr_0.3.5 ellipsis_0.3.2 crosstalk_1.2.0 [37] energy_1.7-10 backports_1.4.1 ggnewscale_0.4.8 deldir_1.0-6 [41] sparseMatrixStats_1.9.0 vctrs_0.5.0 Cairo_1.6-0 cachem_1.0.6 [45] withr_2.5.0 ggforce_0.4.1 emmeans_1.8.2 checkmate_2.1.0 [49] treeio_1.21.2 cluster_2.1.4 gsl_2.1-7.1 BiocBaseUtils_0.99.12 [53] ape_5.6-2 lazyeval_0.2.2 crayon_1.5.2 recipes_1.0.2 [57] pkgconfig_2.0.3 labeling_0.4.2 tweenr_2.0.2 nlme_3.1-160 [61] vipor_0.4.5 nnet_7.3-18 globals_0.16.1 lifecycle_1.0.3 [65] sandwich_3.0-2 rsvd_1.0.5 cellranger_1.1.0 polyclip_1.10-4 [69] rngtools_1.5.2 Matrix_1.5-1 aplot_0.1.8 zoo_1.8-11 [73] Rhdf5lib_1.19.2 boot_1.3-28 base64enc_0.1-3 beeswarm_0.4.0 [77] GlobalOptions_0.1.2 png_0.1-7 rjson_0.2.21 rootSolve_1.8.2.3 [81] bitops_1.0-7 rhdf5filters_1.9.0 pROC_1.18.0 blob_1.2.3 [85] DelayedMatrixStats_1.19.2 doRNG_1.8.2 shape_1.4.6 decontam_1.17.0 [89] parallelly_1.32.1 jpeg_0.1-9 gridGraphics_0.5-1 DECIPHER_2.25.4 [93] beachmat_2.13.4 memoise_2.0.1 magrittr_2.0.3 plyr_1.8.7 [97] zlibbioc_1.43.0 compiler_4.2.1 RColorBrewer_1.1-3 lme4_1.1-30 [101] clue_0.3-62 cli_3.4.1 ade4_1.7-19 lmerTest_3.1-3 [105] listenv_0.8.0 htmlTable_2.4.1 Formula_1.2-4 MASS_7.3-58.1 [109] mgcv_1.8-41 tidyselect_1.2.0 stringi_1.7.8 highr_0.9 [113] yaml_2.3.6 BiocSingular_1.13.1 latticeExtra_0.6-30 sass_0.4.2 [117] tools_4.2.1 lmom_2.9 future.apply_1.9.1 parallel_4.2.1 [121] rstudioapi_0.14 foreign_0.8-83 foreach_1.5.2 gld_2.6.6 [125] prodlim_2019.11.13 farver_2.1.1 Rtsne_0.16 digest_0.6.30 [129] lava_1.7.0 Rcpp_1.0.9 httr_1.4.4 Rdpack_2.4 [133] colorspace_2.0-3 splines_4.2.1 yulab.utils_0.0.5 tidytree_0.4.1 [137] expm_0.999-6 graphlayouts_0.8.3 xgboost_1.6.0.1 multtest_2.53.0 [141] Exact_3.2 xtable_1.8-4 gmp_0.6-7 nloptr_2.0.3 [145] jsonlite_1.8.3 ggtree_3.5.3 tidygraph_1.2.2 timeDate_4021.106 [149] ggfun_0.0.7 ipred_0.9-13 R6_2.5.1 Hmisc_4.7-1 [153] pillar_1.8.1 htmltools_0.5.3 minqa_1.2.5 glue_1.6.2 [157] fastmap_1.1.0 BiocParallel_1.31.15 BiocNeighbors_1.15.1 class_7.3-20 [161] codetools_0.2-18 mvtnorm_1.1-3 utf8_1.2.2 bslib_0.4.0 [165] tibble_3.1.8 numDeriv_2016.8-1.1 DescTools_0.99.47 interp_1.1-3 [169] survival_3.4-0 biomformat_1.25.0 munsell_0.5.0 e1071_1.7-12 [173] GetoptLong_1.0.5 rhdf5_2.41.2 GenomeInfoDbData_1.2.9 iterators_1.0.14 [177] gtable_0.3.1 rbibutils_2.2.9

BR, Tuomas

FrederickHuangLin commented 2 years ago

Hi @TuomasBorman,

Thanks for catching the BUG! I really appreciate it!

I just pushed a bug-fix commit. I hope the new version (devel version 2.1.1, release version 2.0.1 on Bioconductor 3.16, should be available after 2 business days) of ancom and ancombc2 could solve the problem.

Let me know how it works!

Best, Huang