FrederickHuangLin / ANCOMBC

Differential abundance (DA) and correlation analyses for microbial absolute abundance data
https://www.nature.com/articles/s41467-020-17041-7

Error in tutorial example #200

Closed ManuelaMiguel closed 1 year ago

ManuelaMiguel commented 1 year ago

Hi, I have been trying to run the tutorial example for ANCOMBC2 (website: https://bioconductor.org/packages/release/bioc/vignettes/ANCOMBC/inst/doc/ANCOMBC2.html); however, the following error is raised after calling the ancombc2 function:

"Error in model.matrix.formula(formula(paste0("~", fix_formula)), data = meta_data) : data must be a data.frame"

I have already tried to remove and reinstall the package again.

Thank you!

gabrielet commented 1 year ago

@ManuelaMiguel were you running the example on your own data or on the example data? It looks like you're using your own data, and I was wondering what meta_data is. Can you provide some more info?

saracg27 commented 1 year ago

Hi,

I got the same error. I tried to follow the tutorial mentioned by @ManuelaMiguel, but on my own data. I do not have any object in the environment called "meta_data". I am working with a tse created from a phyloseq object:

class: TreeSummarizedExperiment
dim: 2948 190
metadata(0):
assays(1): counts
rownames(2948): 7 9 ... 4162 4164
rowData names(6): Domain Phylum ... Family Genus
colnames(190): 1-Mesocosm-1-1-Exp1-D-0-Sed 10-Mesocosm-5-2-Exp1-D-0-Sed ... 99-Mesocosm-2-1-Exp1-TEST-Rhizo 99-Mesocosm-2-1-Exp1-TEST-Sed
colData names(8): alias Compartment ... Sampling_date treatment
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
rowLinks: NULL
rowTree: NULL
colLinks: NULL
colTree: NULL

the ancombc2 function parameters are as follows:

output <- ancombc2(data = tse,
                   assay_name = "counts",
                   fix_formula = " Water_type + Sample_type + Time", # Character
                   rand_formula = NULL,
                   p_adj_method = "holm",
                   prv_cut = 0.10,
                   lib_cut = 0,
                   s0_perc = 0.05,
                   group = "Time",
                   struc_zero = TRUE,
                   neg_lb = TRUE,
                   alpha = 0.05,
                   n_cl = 1)

error message:

tax_level is not speficified
No agglomeration will be performed
Otherwise, please speficy tax_level by one of the following:
Domain, Phylum, Class, Order, Family, Genus
Error in model.matrix.formula(formula(paste0("~", fix_formula)), data = meta_data) :
  data must be a data.frame

Same happens if I work with the phyloseq object instead, where my sam_data (metadata) is a data.frame.

Any ideas or help is welcome 🙏

saracg27 commented 1 year ago

**** EDIT ****

The example in Ancombc2 seems to work fine in a FRESH session. So I suppose it is a compatibility problem with other packages.

R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.8

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ANCOMBC_2.0.3 dplyr_1.1.2 phyloseq_1.42.0

loaded via a namespace (and not attached):
[1] readxl_1.4.3 backports_1.4.1 Hmisc_5.1-0
[4] plyr_1.8.8 igraph_1.5.1 lazyeval_0.2.2
[7] splines_4.2.2 gmp_0.7-2 BiocParallel_1.32.6
[10] TH.data_1.1-2 GenomeInfoDb_1.34.9 ggplot2_3.4.3
[13] scater_1.26.1 digest_0.6.33 foreach_1.5.2
[16] yulab.utils_0.0.8 htmltools_0.5.6 viridis_0.6.4
[19] lmerTest_3.1-3 fansi_1.0.4 magrittr_2.0.3
[22] checkmate_2.2.0 memoise_2.0.1 ScaledMatrix_1.6.0
[25] doParallel_1.0.17 cluster_2.1.4 DECIPHER_2.26.0
[28] Biostrings_2.66.0 matrixStats_1.0.0 sandwich_3.0-2
[31] colorspace_2.1-0 blob_1.2.4 ggrepel_0.9.3
[34] rbibutils_2.2.15 xfun_0.40 crayon_1.5.2
[37] RCurl_1.98-1.12 jsonlite_1.8.7 lme4_1.1-34
[40] Exact_3.2 zoo_1.8-12 survival_3.5-7
[43] iterators_1.0.14 ape_5.7-1 glue_1.6.2
[46] gtable_0.3.4 emmeans_1.8.8 zlibbioc_1.44.0
[49] XVector_0.38.0 DelayedArray_0.24.0 BiocSingular_1.14.0
[52] Rhdf5lib_1.20.0 SingleCellExperiment_1.20.1 Rmpfr_0.9-3
[55] BiocGenerics_0.44.0 scales_1.2.1 mvtnorm_1.2-2
[58] rngtools_1.5.2 DBI_1.1.3 Rcpp_1.0.11
[61] xtable_1.8-4 viridisLite_0.4.2 htmlTable_2.4.1
[64] decontam_1.18.0 tidytree_0.4.5 foreign_0.8-84
[67] bit_4.0.5 rsvd_1.0.5 proxy_0.4-27
[70] Formula_1.2-5 stats4_4.2.2 CVXR_1.0-11
[73] htmlwidgets_1.6.2 httr_1.4.7 pkgconfig_2.0.3
[76] scuttle_1.8.4 nnet_7.3-19 utf8_1.2.3
[79] tidyselect_1.2.0 rlang_1.1.1 reshape2_1.4.4
[82] munsell_0.5.0 cellranger_1.1.0 tools_4.2.2
[85] cachem_1.0.8 cli_3.6.1 DirichletMultinomial_1.40.0
[88] generics_0.1.3 RSQLite_2.3.1 mia_1.6.0
[91] ade4_1.7-22 evaluate_0.21 biomformat_1.26.0
[94] stringr_1.5.0 fastmap_1.1.1 knitr_1.43
[97] bit64_4.0.5 purrr_1.0.2 rootSolve_1.8.2.3
[100] doRNG_1.8.6 nlme_3.1-163 sparseMatrixStats_1.10.0
[103] compiler_4.2.2 rstudioapi_0.15.0 beeswarm_0.4.0
[106] e1071_1.7-13 treeio_1.23.0 tibble_3.2.1
[109] gsl_2.1-8 DescTools_0.99.49 stringi_1.7.12
[112] lattice_0.21-8 Matrix_1.5-4.1 nloptr_2.0.3
[115] vegan_2.6-4 permute_0.9-7 multtest_2.54.0
[118] vctrs_0.6.3 pillar_1.9.0 lifecycle_1.0.3
[121] rhdf5filters_1.10.1 Rdpack_2.5 estimability_1.4.1
[124] BiocNeighbors_1.16.0 data.table_1.14.8 bitops_1.0-7
[127] irlba_2.3.5.1 lmom_2.9 GenomicRanges_1.50.2
[130] R6_2.5.1 gridExtra_2.3 vipor_0.4.5
[133] IRanges_2.32.0 gld_2.6.6 codetools_0.2-19
[136] energy_1.7-11 boot_1.3-28.1 MASS_7.3-60
[139] TreeSummarizedExperiment_2.6.0 rhdf5_2.42.1 SummarizedExperiment_1.28.0
[142] withr_2.5.0 multcomp_1.4-25 S4Vectors_0.36.2
[145] GenomeInfoDbData_1.2.9 mgcv_1.9-0 expm_0.999-7
[148] parallel_4.2.2 MultiAssayExperiment_1.24.0 grid_4.2.2
[151] rpart_4.1.19 beachmat_2.14.2 minqa_1.2.5
[154] coda_0.19-4 tidyr_1.3.0 class_7.3-22
[157] rmarkdown_2.24 DelayedMatrixStats_1.20.0 MatrixGenerics_1.10.0
[160] numDeriv_2016.8-1.1 Biobase_2.58.0 base64enc_0.1-3
[163] ggbeeswarm_0.7.2

**** ORIGINAL MESSAGE ****

I tried to reproduce the example given in the tutorial (in my NOT FRESH session) and I get the same error: Error in model.matrix.formula(formula(paste0("~", fix_formula)), data = meta_data) : data must be a data.frame.

sessionInfo()

R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.8

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] ANCOMBC_2.0.3 vegan_2.6-4 lattice_0.21-8 permute_0.9-7
[5] roperators_1.3.14 lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0
[9] dplyr_1.1.2 purrr_1.0.2 readr_2.1.4 tidyr_1.3.0
[13] tibble_3.2.1 tidyverse_2.0.0 psych_2.3.6 MASS_7.3-60
[17] labdsv_2.1-0 mgcv_1.9-0 nlme_3.1-163 here_1.0.1
[21] ggtext_0.1.2 ggpubr_0.6.0 ggforce_0.4.1 ggplot2_3.4.3
[25] ggdist_3.3.0 glue_1.6.2 dunn.test_1.3.5 DESeq2_1.38.3
[29] SummarizedExperiment_1.28.0 Biobase_2.58.0 MatrixGenerics_1.10.0 matrixStats_1.0.0
[33] GenomicRanges_1.50.2 GenomeInfoDb_1.34.9 IRanges_2.32.0 S4Vectors_0.36.2
[37] BiocGenerics_0.44.0 cowplot_1.1.1 colorspace_2.1-0 agricolae_1.3-6
[41] pacman_0.5.1 phyloseq_1.42.0

loaded via a namespace (and not attached):
[1] estimability_1.4.1 coda_0.19-4 bit64_4.0.5
[4] knitr_1.43 multcomp_1.4-25 irlba_2.3.5.1
[7] DelayedArray_0.24.0 data.table_1.14.8 rpart_4.1.19
[10] doParallel_1.0.17 KEGGREST_1.38.0 RCurl_1.98-1.12
[13] generics_0.1.3 ScaledMatrix_1.6.0 TH.data_1.1-2
[16] RSQLite_2.3.1 combinat_0.0-8 proxy_0.4-27
[19] bit_4.0.5 tzdb_0.4.0 xml2_1.3.5
[22] httpuv_1.6.11 DirichletMultinomial_1.40.0 viridis_0.6.4
[25] xfun_0.40 hms_1.1.3 evaluate_0.21
[28] promises_1.2.1 fansi_1.0.4 readxl_1.4.3
[31] mia_1.6.0 igraph_1.5.1 DBI_1.1.3
[34] geneplotter_1.76.0 htmlwidgets_1.6.2 Rmpfr_0.9-3
[37] CVXR_1.0-11 ellipsis_0.3.2 energy_1.7-11
[40] backports_1.4.1 annotate_1.76.0 sparseMatrixStats_1.10.0
[43] vctrs_0.6.3 SingleCellExperiment_1.20.1 abind_1.4-5
[46] cachem_1.0.8 withr_2.5.0 emmeans_1.8.8
[49] checkmate_2.2.0 treeio_1.23.0 MultiAssayExperiment_1.24.0
[52] mnormt_2.1.1 cluster_2.1.4 gsl_2.1-8
[55] ape_5.7-1 lazyeval_0.2.2 crayon_1.5.2
[58] TreeSummarizedExperiment_2.6.0 pkgconfig_2.0.3 tweenr_2.0.2
[61] vipor_0.4.5 nnet_7.3-19 rlang_1.1.1
[64] questionr_0.7.8 lifecycle_1.0.3 miniUI_0.1.1.1
[67] sandwich_3.0-2 rsvd_1.0.5 cellranger_1.1.0
[70] distributional_0.3.2 rprojroot_2.0.3 polyclip_1.10-4
[73] rngtools_1.5.2 Matrix_1.5-4.1 carData_3.0-5
[76] zoo_1.8-12 Rhdf5lib_1.20.0 boot_1.3-28.1
[79] base64enc_0.1-3 beeswarm_0.4.0 png_0.1-8
[82] viridisLite_0.4.2 rootSolve_1.8.2.3 bitops_1.0-7
[85] rhdf5filters_1.10.1 Biostrings_2.66.0 blob_1.2.4
[88] DelayedMatrixStats_1.20.0 doRNG_1.8.6 decontam_1.18.0
[91] rstatix_0.7.2 DECIPHER_2.26.0 ggsignif_0.6.4
[94] klaR_1.7-2 beachmat_2.14.2 scales_1.2.1
[97] memoise_2.0.1 magrittr_2.0.3 plyr_1.8.8
[100] zlibbioc_1.44.0 compiler_4.2.2 RColorBrewer_1.1-3
[103] lme4_1.1-34 cli_3.6.1 ade4_1.7-22
[106] XVector_0.38.0 lmerTest_3.1-3 htmlTable_2.4.1
[109] Formula_1.2-5 tidyselect_1.2.0 stringi_1.7.12
[112] highr_0.10 BiocSingular_1.14.0 locfit_1.5-9.8
[115] ggrepel_0.9.3 grid_4.2.2 tools_4.2.2
[118] lmom_2.9 timechange_0.2.0 parallel_4.2.2
[121] rstudioapi_0.15.0 foreach_1.5.2 foreign_0.8-84
[124] gridExtra_2.3 gld_2.6.6 farver_2.1.1
[127] Rtsne_0.16 digest_0.6.33 shiny_1.7.5
[130] Rcpp_1.0.11 gridtext_0.1.5 car_3.1-2
[133] broom_1.0.5 scuttle_1.8.4 later_1.3.1
[136] httr_1.4.7 AnnotationDbi_1.60.2 Rdpack_2.5
[139] XML_3.99-0.14 splines_4.2.2 yulab.utils_0.0.8
[142] tidytree_0.4.5 expm_0.999-7 scater_1.26.1
[145] multtest_2.54.0 Exact_3.2 xtable_1.8-4
[148] gmp_0.7-2 jsonlite_1.8.7 nloptr_2.0.3
[151] AlgDesign_1.2.1 R6_2.5.1 Hmisc_5.1-0
[154] pillar_1.9.0 htmltools_0.5.6 mime_0.12
[157] fastmap_1.1.1 minqa_1.2.5 BiocParallel_1.32.6
[160] BiocNeighbors_1.16.0 class_7.3-22 codetools_0.2-19
[163] mvtnorm_1.2-2 utf8_1.2.3 numDeriv_2016.8-1.1
[166] ggbeeswarm_0.7.2 DescTools_0.99.49 survival_3.5-7
[169] rmarkdown_2.24 biomformat_1.26.0 munsell_0.5.0
[172] e1071_1.7-13 rhdf5_2.42.1 GenomeInfoDbData_1.2.9
[175] iterators_1.0.14 labelled_2.12.0 haven_2.5.3
[178] reshape2_1.4.4 gtable_0.3.4 rbibutils_2.2.15
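
Since the example works in the fresh session but fails in the other one, comparing the two package lists is a quick way to narrow down candidates. The following is only a minimal sketch in base R: `pkg_names` is a hypothetical helper, and the `not_fresh` vector is an abbreviated copy of the attached packages listed above, not the full list.

```r
# Hypothetical helper: strip the "_<version>" suffix that sessionInfo()
# prints after each package name.
pkg_names <- function(x) sub("_[^_]+$", "", x)

# Attached packages copied from the two sessionInfo() listings in this thread.
# The non-fresh vector is shortened to a few entries for illustration.
fresh <- c("ANCOMBC_2.0.3", "dplyr_1.1.2", "phyloseq_1.42.0")
not_fresh <- c("ANCOMBC_2.0.3", "vegan_2.6-4", "dplyr_1.1.2", "DESeq2_1.38.3",
               "agricolae_1.3-6", "phyloseq_1.42.0")

# Packages attached only in the failing session are the first suspects.
suspects <- setdiff(pkg_names(not_fresh), pkg_names(fresh))
print(suspects)
```

In a live session the same comparison can be made without copying text by saving `names(sessionInfo()$otherPkgs)` and `names(sessionInfo()$loadedOnly)` from each session. Packages that are only loaded via a namespace can matter too, because S3 methods are registered on load, not only on attach.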

FrederickHuangLin commented 1 year ago

Hi @ManuelaMiguel, @gabrielet, and @saracg27,

Thanks for your feedback! @ManuelaMiguel, could you confirm whether you used the tutorial data or your own data? @saracg27, I will take a look at the data import function in the next update.

Thank you all! Huang

ManuelaMiguel commented 1 year ago

Hi Huang, Thank you for your response. I was using the tutorial data when the error occurred.

I observed that this error occurs when I load the "agricolae" package together with "ANCOMBC2" during my microbiome analysis. If I do not load the two packages together, ANCOMBC2 runs well.

Thank you!
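
The error is reported from a function named `model.matrix.formula`, so one way to look into a conflict like this is to ask R which methods are registered for the `model.matrix()` generic and which namespace supplied each of them. This is a hedged sketch using base R only. The thread does not establish which package provides the `formula` method, so the code just reports what it finds in the current session.

```r
# List every S3 method registered for model.matrix() together with the
# namespace it comes from.
info <- attr(methods("model.matrix"), "info")
print(info[, c("generic", "from")])

# Methods that do not come from the stats package were added by some other
# loaded package and may change how model.matrix() dispatches.
extra <- rownames(info)[as.character(info$from) != "stats"]
print(extra)

# Look up a method for class "formula" directly. optional = TRUE makes
# getS3method() return NULL instead of raising an error when none exists.
m <- getS3method("model.matrix", "formula", optional = TRUE)
if (is.null(m)) {
  message("No model.matrix method for class 'formula' is registered.")
} else {
  message("model.matrix.formula is defined in: ",
          environmentName(environment(m)))
}
```

Running this once in a fresh session and once after loading the other packages shows whether a `formula` method appears and where it comes from, which may be useful to include when sharing the code that triggers the error.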

FrederickHuangLin commented 1 year ago

Thank you for your response, @ManuelaMiguel ! I'll certainly investigate this package compatibility issue. If it's not too much trouble, could you kindly share the code that triggered the error?

Best regards, Huang