Closed ivanllampy closed 2 years ago
Hi,
Thanks for your question. It's hard to say exactly what is happening in each of these instances without seeing what code you are attempting to run and having some information about the dataset you are using. With that said, here are a few thoughts:
I hope this is somewhat helpful – happy to follow up if you have more questions!
Best, David
Hi all,
Thank you for your reply, and sorry for my late response. I have been trying to manipulate the code these days with different parameters, and able to get rid of the error. Although not sure what happened at the beginning, it generally works now for breakaway() and betta()!
For chao_bunge(), another outcome occurs - not every sample has an estimation, for example:
$Sample LI04 Estimate of richness from method Chao-Bunge: Estimate is NA with standard error NA Confidence interval: (NA, NA) Cutoff: 10
Here is my workflow:
I create a phyloseq object with ASV table, metadata file, and taxonomy table.
Then, I subset the phyloseq with the groups of sample I interested.
Finally, I simply run chao_bunge() to the phyloseq object.
Here is the code I used:
physeq<-phyloseq(asv.table,taxa.table,sam.table) #Create an phyloseq object
physeq_rm900 <- phyloseq::subset_samples(physeq, phyloseq::sample_sums(physeq) > 900) #Remove samples with number of read count < 900
fg_physeq <- subset_samples(physeq_rm900, type %in% c("f","si","li")) #Subset three groups, namely "f", "si", and "li"
cb<-chao_bunge(fg_physeq, cutoff = 10, output = NULL, answers = NULL) #Run chao_bunge()
May I ask why would the estimate be "NA"? Should I switch to another type of estimates, or it is a problem of my dataset? Many thanks for your reply again, any help would be appreciated!
Hi,
Sorry for the long delay! chao_bunge
can return NA values if estimated values of diversity are negative or if they are undefined (which can happen if a computed denominator term is zero). In each of these cases, you should receive a warning message – have you seen any messages warning you about output of this function?
Thanks!
David
You may also have better luck with the breakaway
function than with chao_bunge
– breakaway
will fit a different model if the initial model it tries to fit is invalid or gives nonsensical results, so you are more likely to get usable output.
I will second the excellent answer of @ailurophilia here -- the Chao-Bunge estimator is based on a negative binomial model, which can be very unstable and produce negative estimates of the number of unobserved species (!), especially when latent diversity is high. This is part of the motivation for writing the function breakaway
, which sequentially steps through models as more complex models fail.
As this isn't a code bug, I'm going to close this issue, @ivanllampy. We will do our best to help if you have more questions.
Hello @adw96 and @ailurophilia, what cutoff values you would recommend, or how to know which to use?
I am going through the stamps2022 diversity lab and I am also getting warnings (50 of them) that my cutoff is too low. It seems the default value is NA, so I'm not sure where to start. I tried supplying Inf, but the errors remain. Of particular interest, the plot grid that compares the observed phyloseq with the ba look exactly the same (except for the y axis minimum) :
Here's the code I ran :
psObject <- psProvSM
observed_phyloseq <- sample_richness(psObject)
plot(observed_phyloseq, psObject, color = "compart")
ba <- breakaway(psObject, cutoff=Inf)
p1 <- plot(ba, psObject, color = "compart")
p2 <- plot(observed_phyloseq, psObject, color = "compart")
grid.arrange(p2, p1, nrow = 2)
Edit: sequencing depth are quite different between my two compartments, so I would expect there to be a difference when estimating using breakaway:
Thanks in advance for your help !
Sorry, first time here!
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] gridExtra_2.3 janitor_2.1.0 corncob_0.3.0
[4] wesanderson_0.3.6 RColorBrewer_1.1-3 DESeq2_1.36.0
[7] SummarizedExperiment_1.26.1 Biobase_2.56.0 MatrixGenerics_1.8.1
[10] matrixStats_0.62.0 GenomicRanges_1.48.0 GenomeInfoDb_1.32.4
[13] IRanges_2.30.1 S4Vectors_0.34.0 DivNet_0.4.0
[16] breakaway_4.7.9 BiocGenerics_0.42.0 vegan_2.6-2
[19] lattice_0.20-45 permute_0.9-7 lubridate_1.8.0
[22] trust_0.1-8 forcats_0.5.2 dplyr_1.0.10
[25] purrr_0.3.5 ggplot2_3.3.6 tidyverse_1.3.2
[28] tidyr_1.2.1.9000 stringr_1.4.1 readr_2.1.3
[31] tibble_3.1.8 optimr_2019-12.16 magrittr_2.0.3
[34] kableExtra_1.3.4 data.table_1.14.2 phyloseq_1.40.0
loaded via a namespace (and not attached):
[1] utf8_1.2.2 ROI.plugin.lpsolve_1.0-1 tidyselect_1.1.2 lme4_1.1-30
[5] htmlwidgets_1.5.4 RSQLite_2.2.18 AnnotationDbi_1.58.0 grid_4.2.1
[9] BiocParallel_1.30.3 Rtsne_0.16 munsell_0.5.0 codetools_0.2-18
[13] miniUI_0.1.1.1 withr_2.5.0 colorspace_2.0-3 lpSolveAPI_5.5.2.0-17.8
[17] knitr_1.40 rstudioapi_0.14 setRNG_2022.4-1 labeling_0.4.2
[21] slam_0.1-50 GenomeInfoDbData_1.2.8 bit64_4.0.5 farver_2.1.1
[25] rhdf5_2.40.0 rprojroot_2.0.3 vctrs_0.4.2 generics_0.1.3
[29] xfun_0.33 R6_2.5.1 doParallel_1.0.17 VGAM_1.1-7
[33] locfit_1.5-9.6 bitops_1.0-7 rhdf5filters_1.8.0 microbiome_1.18.0
[37] cachem_1.0.6 DelayedArray_0.22.0 assertthat_0.2.1 promises_1.2.0.1
[41] scales_1.2.1 vroom_1.6.0 googlesheets4_1.0.1 gtable_0.3.1
[45] processx_3.7.0 rlang_1.0.6 genefilter_1.78.0 systemfonts_1.0.4
[49] splines_4.2.1 gargle_1.2.1 Rcgmin_2022-4.30 broom_1.0.1
[53] BiocManager_1.30.18 reshape2_1.4.4 abind_1.4-5 modelr_0.1.9
[57] backports_1.4.1 httpuv_1.6.6 usethis_2.1.6 optextras_2019-12.4
[61] tools_4.2.1 ellipsis_0.3.2 biomformat_1.24.0 sessioninfo_1.2.2
[65] Rcpp_1.0.9 plyr_1.8.7 zlibbioc_1.42.0 RCurl_1.98-1.9
[69] prettyunits_1.1.1 ps_1.7.1 urlchecker_1.0.1 ROI_1.0-0
[73] haven_2.5.1 cluster_2.1.4 fs_1.5.2 Rvmmin_2018-4.17.1
[77] reprex_2.0.2 mvnfast_0.2.7 googledrive_2.0.0 pkgload_1.3.0
[81] mime_0.12 hms_1.1.2 evaluate_0.17 xtable_1.8-4
[85] XML_3.99-0.11 readxl_1.4.1 compiler_4.2.1 crayon_1.5.2
[89] minqa_1.2.4 htmltools_0.5.3 later_1.3.0 mgcv_1.8-40
[93] tzdb_0.3.0 geneplotter_1.74.0 DBI_1.1.3 dbplyr_2.2.1
[97] MASS_7.3-58.1 boot_1.3-28 Matrix_1.5-1 ade4_1.7-19
[101] cli_3.4.1 parallel_4.2.1 igraph_1.3.5 pkgconfig_2.0.3
[105] registry_0.5-1 numDeriv_2016.8-1.1 xml2_1.3.3 foreach_1.5.2
[109] svglite_2.1.0 annotate_1.74.0 multtest_2.52.0 webshot_0.5.4
[113] XVector_0.36.0 rvest_1.0.3 snakecase_0.11.0 callr_3.7.2
[117] detectseparation_0.3 digest_0.6.29 Biostrings_2.64.1 rmarkdown_2.16
[121] cellranger_1.1.0 curl_4.3.3 shiny_1.7.2 nloptr_2.0.3
[125] lifecycle_1.0.3 nlme_3.1-159 jsonlite_1.8.2 Rhdf5lib_1.18.2
[129] viridisLite_0.4.1 fansi_1.0.3 pillar_1.8.1 pkgbuild_1.3.1
[133] KEGGREST_1.36.3 fastmap_1.1.0 httr_1.4.4 survival_3.4-0
[137] remotes_2.4.2 glue_1.6.2 png_0.1-7 iterators_1.0.14
[141] bit_4.0.4 profvis_0.3.7 stringi_1.7.8 blob_1.2.3
[145] memoise_2.0.1 ape_5.6-2
Hi all,
I am working on comparing alpha diversity of several gut type samples, recently discover the breakaway package! I gone through several vignettes and tutorial (https://github.com/adw96/stamps2018/blob/master/estimation/diversity-lab.R), and I am on my way to try it with my dataset.
I encounter several difficulties when using breakaway. My R background is too weak to solve them, may I seek some advice here?
When running breakaway() on a phyloseq object, warning message below shown:
Warning messages: 1: In poisson_model(input_data, cutoff = cutoff) : Cut-off was too low: no data available for estimation
When using the function chao_bunge(), an error occurs:
Error in plot.alpha_estimates(., gut, color = "type") : There are no estimates in this alpha_estimates object!
Finally, when using the function betta(), I encounter the following error:
Error in solve.default(R) : system is computationally singular: reciprocal condition number = 1.48178e-173
It looks like there are multiple problems of my dataset, rather than error from the packages. Are there any ways to solve it? Any advice are highly appreciated!!!
Here attached the output of R as suggested in the instruction post:
Matrix products: default
locale: [1] LC_COLLATE=English_Hong Kong SAR.1252 LC_CTYPE=English_Hong Kong SAR.1252
[3] LC_MONETARY=English_Hong Kong SAR.1252 LC_NUMERIC=C
[5] LC_TIME=English_Hong Kong SAR.1252
system code page: 950
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] forcats_0.5.1 stringr_1.4.0 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0
[6] tidyverse_1.3.1 foreach_1.5.2 DivNet_0.4.0 dplyr_1.0.8 breakaway_4.7.6
[11] tibble_3.1.6 magrittr_2.0.2 ggplot2_3.3.5.9000 phyloseq_1.38.0
loaded via a namespace (and not attached): [1] minqa_1.2.4 colorspace_2.0-3 ellipsis_0.3.2 rprojroot_2.0.2
[5] XVector_0.34.0 fs_1.5.2 rstudioapi_0.13 farver_2.1.0
[9] remotes_2.4.2 fansi_1.0.2 lubridate_1.8.0 xml2_1.3.3
[13] codetools_0.2-18 splines_4.1.2 doParallel_1.0.17 cachem_1.0.6
[17] pkgload_1.2.4 ade4_1.7-18 jsonlite_1.8.0 nloptr_2.0.0
[21] broom_0.7.12 cluster_2.1.2 dbplyr_2.1.1 compiler_4.1.2
[25] httr_1.4.2 backports_1.4.1 assertthat_0.2.1 Matrix_1.3-4
[29] fastmap_1.1.0 cli_3.1.1 prettyunits_1.1.1 tools_4.1.2
[33] igraph_1.2.11 gtable_0.3.0 glue_1.6.1 GenomeInfoDbData_1.2.7 [37] reshape2_1.4.4 Rcpp_1.0.8 Biobase_2.54.0 cellranger_1.1.0
[41] vctrs_0.3.8 Biostrings_2.62.0 rhdf5filters_1.6.0 multtest_2.50.0
[45] ape_5.6-2 nlme_3.1-153 iterators_1.0.14 ps_1.6.0
[49] brio_1.1.3 rvest_1.0.2 openxlsx_4.2.5 testthat_3.1.2
[53] lme4_1.1-28 lifecycle_1.0.1 devtools_2.4.3 zlibbioc_1.40.0
[57] MASS_7.3-54 scales_1.1.1 hms_1.1.1 parallel_4.1.2
[61] biomformat_1.22.0 rhdf5_2.38.0 mvnfast_0.2.7 memoise_2.0.1
[65] stringi_1.7.6 S4Vectors_0.32.3 desc_1.4.1 permute_0.9-7
[69] BiocGenerics_0.40.0 boot_1.3-28 pkgbuild_1.3.1 zip_2.2.0
[73] GenomeInfoDb_1.30.1 rlang_1.0.1 pkgconfig_2.0.3 bitops_1.0-7
[77] lattice_0.20-45 Rhdf5lib_1.16.0 labeling_0.4.2 processx_3.5.2
[81] tidyselect_1.1.2 plyr_1.8.6 R6_2.5.1 IRanges_2.28.0
[85] generics_0.1.2 DBI_1.1.2 pillar_1.7.0 haven_2.4.3
[89] withr_2.5.0 mgcv_1.8-38 survival_3.2-13 abind_1.4-5
[93] RCurl_1.98-1.6 modelr_0.1.8 crayon_1.5.0 utf8_1.2.2
[97] tzdb_0.2.0 usethis_2.1.5 readxl_1.3.1 grid_4.1.2
[101] data.table_1.14.2 callr_3.7.0 vegan_2.5-7 reprex_2.0.1
[105] digest_0.6.29 stats4_4.1.2 munsell_0.5.0 sessioninfo_1.2.2