adw96 / breakaway

Species richness with high diversity
68 stars 18 forks source link

why is breakaway returning NULL when trying to plot? #114

Closed gabrielet closed 3 years ago

gabrielet commented 3 years ago

As in the question, I am having this weird behaviour for several of my samples. Problem is that apparently breakaway() is estimating the parameters, but then no plot is returned. Also, there is that Cutoff: 10 after the confidence interval, which I see only for those samples that are returning NULL. For all the samples that are working I see no Cutoff field. Is that a suggestion that I am getting from breakaway to set the cutoff parameter at that specific threshold?

library(breakaway)
library(phyloseq)
library(magrittr)
library(tidyverse)

sample <- cbind.data.frame(Var1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20, 26, 36), Freq = c(1520, 350, 129, 68, 29, 14, 13, 15, 9, 6, 2, 2, 2, 3, 1, 1, 1, 1, 1, 1))

modelBA <- breakaway(sample); plot(modelBA); modelBA;
NULL
Estimate of richness from method breakaway:
  Estimate is 3349
 with standard error 65.16
  Confidence interval: (2273, 15456)
  Cutoff:  10

traceback() is not available.

sessionInfo()

> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=it_IT.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.7     purrr_0.3.4    
 [5] readr_2.0.1     tidyr_1.1.3     tibble_3.1.4    ggplot2_3.3.5  
 [9] tidyverse_1.3.1 magrittr_2.0.1  phyloseq_1.36.0 breakaway_4.7.3

loaded via a namespace (and not attached):
 [1] nlme_3.1-152           bitops_1.0-7           fs_1.5.0              
 [4] lubridate_1.7.10       httr_1.4.2             GenomeInfoDb_1.28.2   
 [7] tools_4.1.1            backports_1.2.1        utf8_1.2.2            
[10] R6_2.5.1               vegan_2.5-7            DBI_1.1.1             
[13] BiocGenerics_0.38.0    mgcv_1.8-36            colorspace_2.0-2      
[16] permute_0.9-5          rhdf5filters_1.4.0     ade4_1.7-17           
[19] withr_2.4.2            tidyselect_1.1.1       compiler_4.1.1        
[22] cli_3.0.1              rvest_1.0.1            Biobase_2.52.0        
[25] xml2_1.3.2             scales_1.1.1           minqa_1.2.4           
[28] XVector_0.32.0         pkgconfig_2.0.3        lme4_1.1-27.1         
[31] dbplyr_2.1.1           rlang_0.4.11           readxl_1.3.1          
[34] rstudioapi_0.13        generics_0.1.0         jsonlite_1.7.2        
[37] RCurl_1.98-1.4         GenomeInfoDbData_1.2.6 biomformat_1.20.0     
[40] Matrix_1.3-4           Rcpp_1.0.7             munsell_0.5.0         
[43] S4Vectors_0.30.0       Rhdf5lib_1.14.2        fansi_0.5.0           
[46] ape_5.5                lifecycle_1.0.0        stringi_1.7.4         
[49] MASS_7.3-54            zlibbioc_1.38.0        rhdf5_2.36.0          
[52] plyr_1.8.6             grid_4.1.1             parallel_4.1.1        
[55] crayon_1.4.1           lattice_0.20-44        Biostrings_2.60.2     
[58] haven_2.4.3            splines_4.1.1          multtest_2.48.0       
[61] hms_1.1.0              pillar_1.6.2           igraph_1.2.6          
[64] boot_1.3-28            reshape2_1.4.4         codetools_0.2-18      
[67] stats4_4.1.1           reprex_2.0.1           glue_1.4.2            
[70] data.table_1.14.0      modelr_0.1.8           vctrs_0.3.8           
[73] nloptr_1.2.2.2         tzdb_0.1.2             foreach_1.5.1         
[76] cellranger_1.1.0       gtable_0.3.0           assertthat_0.2.1      
[79] broom_0.7.9            survival_3.2-13        iterators_1.0.13      
[82] IRanges_2.26.0         cluster_2.1.2          ellipsis_0.3.2        

R.version

> R.version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          4                           
minor          1.1                         
year           2021                        
month          08                          
day            10                          
svn rev        80725                       
language       R                           
version.string R version 4.1.1 (2021-08-10)
nickname       Kick Things     
adw96 commented 3 years ago

Hi @gabrielet - great question. What's happened here is that the Poisson model was chosen (modelBA$model). Not all models have a corresponding plot object, including Poisson. Hence, no plot for this sample.

I don't think that this is an issue with breakaway, so I'm going to close this. Feel free to reopen if you disagree.

PS. I encourage great caution when using Poisson estimates. We can see here that the number of species observed once was 1520 and the number of species observed twice was 350. Any reasonable estimator is going to say there's an enormous amount of unknown diversity here -- almost too much to estimate with any accuracy, which is why the Kemp models aren't fitting and it's falling back on a Poisson estimate. Anyway, take care!

gabrielet commented 3 years ago

ok, perfect! thank you!