skembel / picante

R tools for integrating phylogenies and ecology
randomizeMatrix with null.model = "independentswap" fails on small communities #26

Closed joelnitta closed 3 years ago

joelnitta commented 3 years ago

Thanks for the great package! I discovered this while trying to write tests for a function that uses randomizeMatrix().

5 x 5 seems to be the minimum size for the community when null.model = "independentswap".

This works:

library(picante)
data(phylocom)
randomizeMatrix(phylocom$sample[1:5,1:5], null.model="independentswap")

But this doesn't:

library(picante)
data(phylocom)
randomizeMatrix(phylocom$sample[1:4,1:5], null.model="independentswap")

It just hangs without finishing or generating an error message (I have to manually force R to quit).

sessionInfo()

```
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] picante_1.8.2   nlme_3.1-152    vegan_2.5-7     lattice_0.20-44 permute_0.9-5   ape_5.5

loaded via a namespace (and not attached):
 [1] MASS_7.3-54    compiler_4.1.0 Matrix_1.3-4   parallel_4.1.0 tools_4.1.0    mgcv_1.8-36
 [7] Rcpp_1.0.7     splines_4.1.0  grid_4.1.0     cluster_2.1.2
```

skembel commented 3 years ago

Hi, what is likely happening is that the smaller matrix does not contain any checkerboard co-occurrences. An issue with the independent swap null model is that it keeps looping until it has performed the specified number of swaps, so if there are no checkerboard co-occurrences in the matrix, it will just go on forever. I suggest you use the trial swap null model (null.model = "trialswap") instead: this approach attempts a set number of swaps and then stops. A more general issue is that if your matrix is too small and there are no checkerboard co-occurrences to be swapped, the 'randomized' matrix will be identical to the original, and you'll probably get nonsensical output (standard deviation of zero, NA/Inf for the SES metrics).
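To make the stopping rule concrete, here is a minimal base-R sketch of a trial-swap style loop. It is a toy illustration only, not picante's actual implementation; the function name `trial_swap_sketch` and its `attempts` argument are hypothetical. The loop counts attempts rather than successful swaps, so it always terminates, and the returned `swaps` count shows whether anything was actually randomized.

```r
# Toy illustration only: NOT picante's implementation.
# Tries `attempts` random 2 x 2 submatrices of a sites-by-species matrix and
# swaps those that form a checkerboard. Returns the resulting matrix and the
# number of swaps that actually succeeded.
trial_swap_sketch <- function(m, attempts = 1000) {
  stopifnot(is.matrix(m), nrow(m) >= 2, ncol(m) >= 2)
  swaps <- 0
  for (a in seq_len(attempts)) {
    rows <- sample.int(nrow(m), 2)  # two distinct sites, in random order
    cols <- sample.int(ncol(m), 2)  # two distinct species, in random order
    sub <- m[rows, cols]
    # Checkerboard: one diagonal occupied, the other empty. The mirrored
    # pattern is covered because rows and cols are drawn in random order.
    if (sub[1, 1] > 0 && sub[2, 2] > 0 && sub[1, 2] == 0 && sub[2, 1] == 0) {
      m[rows, cols] <- sub[, 2:1]   # exchange the two columns of the block
      swaps <- swaps + 1
    }
  }
  list(matrix = m, swaps = swaps)
}

# A perfectly nested matrix has no checkerboards, so nothing can be swapped.
nested <- matrix(c(1, 1, 1,
                   1, 1, 0,
                   1, 0, 0), nrow = 3, byrow = TRUE)
res <- trial_swap_sketch(nested, attempts = 500)
res$swaps
```

A loop that instead counted successful swaps (`while (swaps < reps)`) would never exit on `nested`, which is the hang described above. For presence/absence data each swap leaves row and column totals unchanged.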

joelnitta commented 3 years ago

Thanks.

Although it's an edge case, I think it would be nice to preempt this behavior with an error. It would be simple to add a check for a minimum number of sites/taxa, but I suppose that doesn't get at the actual cause of the problem. If there is a way to check for checkerboard co-occurrences before running the randomization, then an informative error could be issued instead of hanging.
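One possible way to do such a check, sketched in base R. The helper names `has_checkerboard` and `check_swappable` are hypothetical and not part of picante, and the sketch assumes any cell greater than zero counts as a presence. A swappable 2 x 2 submatrix exists exactly when some pair of species each occurs at a site where the other is absent, which can be tested with one cross-product.

```r
# Hypothetical helper (not part of picante): TRUE if the sites-by-species
# matrix contains at least one swappable 2 x 2 checkerboard.
has_checkerboard <- function(samp) {
  pa <- (as.matrix(samp) > 0) * 1   # presence/absence, sites x species
  # only_k[k, l] = number of sites where species k is present and l is absent
  only_k <- crossprod(pa, 1 - pa)
  # A checkerboard needs a pair (k, l) with both only_k[k, l] and
  # only_k[l, k] greater than zero.
  any(only_k > 0 & t(only_k) > 0)
}

# Hypothetical guard that could be called before a swap-based randomization.
check_swappable <- function(samp) {
  if (!has_checkerboard(samp)) {
    stop("No checkerboard co-occurrences found; swap-based null models ",
         "cannot randomize this matrix.", call. = FALSE)
  }
  invisible(TRUE)
}

nested <- matrix(c(1, 1, 1,
                   1, 1, 0,
                   1, 0, 0), nrow = 3, byrow = TRUE)
has_checkerboard(nested)    # nested matrix: nothing to swap
has_checkerboard(diag(2))   # smallest possible checkerboard
```

Calling `check_swappable()` on the community matrix before `randomizeMatrix(..., null.model = "independentswap")` would then fail fast with a message instead of looping. Note that this only detects whether at least one swap is possible; a matrix with very few checkerboards may still randomize poorly.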