Closed jehaab closed 3 years ago
Thanks for your question and for sharing the data! Based on eyeballing your data, it looks like you have very few rare species -- breakaway is essentially estimating that you don't have any rare taxa. You can get around this "issue" by giving the cutoff = Inf
argument to breakaway
, I think. If you give that a shot and run into trouble let us know :)
Amy
Hello Amy,
Thank you for your answer, I feel like I am missing something, I think I am not running breakaway like I should be.
Could you tell me if I used breakaway the right way ?
First I generated a data frame of species counts, with columns as samples and rows as species as follows
sp_break = t(data_count_species)
Then, I generated a frequency count table as follows
It returns a list object (which I joined in previous message)
sp_freq = build_frequency_count_tables(sp_break)
Then I tried to run
b_sp = breakaway(sp_freq)
and I was never able to make it work
Because it returns "Error in convert(input_data) :
Unsupported input type to function convert
. You passed in an object of class list"
So I ran :
sp = convert(t(data_count_species))
Because in the breakaway documentation "input_data | An input type that can be processed by convert()"
which returns
Error in convert(t(data_count_species)) :
input_data is a data.frame or matrix but not a frequency count table.
So I tried to run
sp = convert(sp_freq)
Because sp_freq is a frequency count table
But I have this: "Error in convert(sp_freq) :
Unsupported input type to function convert
. You passed in an object of class list"
I ended up running:
vect_s <- c()
for (i in 1:length(colnames(sp_break))) {
vect_s <- c(vect_s, breakway(sp_freq[[colnames(sp_break)[i]]])$estimate)
}
Which returns "In poisson_model(input_data, cutoff = cutoff) : Cut-off was too low: no data available for estimation"
and I am not able to integrate cutoff = Inf in that for loop (I tried multiple things but I am pretty new at R) I
but if I run
vect_s2 <- c()
for (i in 1:length(colnames(sp_chao))) {
vect_s2 <- c(vect_s2, chao1(sp_freq[[colnames(sp_chao)[i]]])$estimate)
}
sp_chao is the same as sp_break
I works but as I said it returns the exact number that I get when I just manually count the number of species I have in my samples
I hope this is clear.
I really feel like I am missing something very important to run breakaway correctly
Edit: I have tried to run
vect_s <- c()
for (i in 1:length(colnames(sp_break))) {
vect_s <- c(vect_s, breakway(sp_freq[[colnames(sp_break)[i]]])$estimate), cutoff = Inf))
}
it works but returns
"In poisson_model(input_data, cutoff = cutoff) :
Cut-off was too low: no data available for estimation"
for 50 samples
for the rest of the 456 samples it returns also exactly the same number as when I manually count runing:
n_species = rowSums(df_species !=0)
Sorry if that's a lot
Hello Amy,
Sorry if I asked too much of you in my earlier post.
What you said seamed to have worked.
The only concern I have left is what I talked about earlier. I don't understand why breakaway() and chao1() returns exactly the same number as xhen I count my species manually n_species = rowSums(df_species !=0)
Thank you in advance and again sorry for the inconvenience
Jehane
Not a problem at all -- thanks for giving it a go and great work figuring it out! Given that you have so few rare taxa, any sensible method is going to return the number of observed taxa as its best estimate of total taxa -- it seems like there is not a lot of diversity that you didn't observe. That's the reason why breakaway
, chao1
and sample_richness
are all giving you basically the same number.
Thanks for using breakaway -- I hope that helps!
@adw96 That's such a relief thank you :)
Hello, I’m having problems using the breakaway function on my data. I had contacted you for a similar problem which you had resolved. https://github.com/adw96/breakaway/issues/92
At the time I was working on data at the family level. Now I am working at the species level and have a similar problem (I am joining the frequency count table that produces the error). Using the breakaway() function and it always stops at one sample and returns an error: "In poisson_model(input_data, cutoff = cutoff) : Cut-off was too low: no data available for estimation" I checked the model that was used by breakaway on my samples and it is always Poisson. Even with this error it returns an estimate and when I count the number of species detected in each of my samples it is the same as what is returned by breakaway(), do you have an idea as to why this happens?
Furthermore, I tried to use the chao1() function it works, but just like breakaway it returns the number of species detected in each of my samples . FrequencyCountTableError_species.xlsx
Thank you for your help
Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale: [1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] reshape2_1.4.4 ggpubr_0.4.0 vegan_2.5-7 lattice_0.20-41 permute_0.9-5 forcats_0.5.1 ggplot2_3.3.3 breakaway_4.7.3
loaded via a namespace (and not attached): [1] nlme_3.1-152 fs_1.5.0 usethis_2.0.1 phyloseq_1.30.0 devtools_2.3.2 progress_1.2.2
[7] httr_1.4.2 rprojroot_2.0.2 tools_3.6.3 backports_1.2.1 R6_2.5.0 DBI_1.1.1
[13] lazyeval_0.2.2 BiocGenerics_0.32.0 mgcv_1.8-33 colorspace_2.0-0 ade4_1.7-16 withr_2.4.1
[19] processx_3.4.5 tidyselect_1.1.0 prettyunits_1.1.1 curl_4.3 compiler_3.6.3 cli_2.3.1
[25] Biobase_2.46.0 desc_1.3.0 plotly_4.9.3 scales_1.1.1 callr_3.5.1 stringr_1.4.0
[31] digest_0.6.27 foreign_0.8-75 rio_0.5.16 XVector_0.26.0 pkgconfig_2.0.3 htmltools_0.5.1.1
[37] sessioninfo_1.1.1 fastmap_1.1.0 htmlwidgets_1.5.3 rlang_0.4.10 readxl_1.3.1 rstudioapi_0.13
[43] generics_0.1.0 jsonlite_1.7.2 dplyr_1.0.5 zip_2.1.1 car_3.0-10 magrittr_2.0.1
[49] biomformat_1.14.0 Matrix_1.3-2 Rcpp_1.0.6 munsell_0.5.0 S4Vectors_0.24.4 Rhdf5lib_1.8.0
[55] ape_5.4-1 abind_1.4-5 lifecycle_1.0.0 stringi_1.5.3 carData_3.0-4 MASS_7.3-53.1
[61] zlibbioc_1.32.0 rhdf5_2.30.1 pkgbuild_1.2.0 plyr_1.8.6 grid_3.6.3 parallel_3.6.3
[67] crayon_1.4.1 Biostrings_2.54.0 haven_2.3.1 cowplot_1.1.1 splines_3.6.3 multtest_2.42.0
[73] hms_1.0.0 ps_1.6.0 pillar_1.5.1 igraph_1.2.6 ggsignif_0.6.0 pkgload_1.2.0
[79] codetools_0.2-18 stats4_3.6.3 glue_1.4.2 remotes_2.2.0 data.table_1.13.6 vctrs_0.3.6
[85] foreach_1.5.1 testthat_3.0.2 cellranger_1.1.0 gtable_0.3.0 purrr_0.3.4 tidyr_1.1.2
[91] assertthat_0.2.1 cachem_1.0.4 openxlsx_4.2.3 broom_0.7.4 rstatix_0.7.0 survival_3.2-7
[97] viridisLite_0.3.0 tibble_3.1.0 iterators_1.0.13 memoise_2.0.0 IRanges_2.20.2 cluster_2.1.1
[103] ellipsis_0.3.1