ropensci / software-review

rOpenSci Software Peer Review.

phyloregion #361

Closed KlausVigo closed 4 years ago

KlausVigo commented 4 years ago

Submitting Author: Klaus Schliep (@KlausVigo)
Repository: https://github.com/darunabas/phyloregion

Package: phyloregion
Type: Package
Title: Biogeographic Regionalization and Spatial Conservation
Version: 0.1.0
Authors@R: c(person("Barnabas H.", "Daru", email= "darunabas@gmail.com", role = c("aut", "cre", "cph"), comment = c(ORCID = "0000-0002-2115-0257")),
             person("Piyal", "Karunarathne", role = c("aut")),
             person("Klaus", "Schliep", email="klaus.schliep@gmail.com", role = c("aut"), comment = c(ORCID = "0000-0003-2941-0161")),
             person("Xiaobei", "Zhao", role=c("ctb")), 
             person("Albin", "Sandelin", role=c("ctb")))
Description: R package for biogeographic regionalization and spatial conservation.
Imports: ape, phangorn, Matrix, betapart, fastmatch, parallel, methods, raster, 
         data.table, colorspace, cluster, rgeos, vegan, sp
Suggests: tinytest, knitr
VignetteBuilder: knitr
URL: https://github.com/darunabas/phyloregion
BugReports: https://github.com/darunabas/phyloregion/issues
License: AGPL-3
Encoding: UTF-8
RoxygenNote: 7.0.2
NeedsCompilation: no
Packaged: 2019-11-26 02:36:40 UTC; barnabasdaru
Depends: R (>= 3.6.0)

Scope

The package contains (fast) functions for analysis in biogeography and evolutionary community ecology, plus some plot functions to visualize results on maps. Another novel application of this package is analysis in geospatial conservation. For instance, it has novel tools for mapping standard conservation measures at large scales, such as species richness, species endemism and species threat, as well as their phylogenetic variants: phylogenetic diversity, phylogenetic endemism, and evolutionary distinctiveness and global endangerment.
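To make two of these measures concrete, here is a minimal base R sketch of species richness and weighted endemism for a toy presence/absence matrix. It is not phyloregion's implementation. The matrix `comm` is invented, and weighted endemism is assumed to follow the common definition in which each species contributes the inverse of its range size (number of occupied cells) to every cell where it occurs.

```r
# Toy sites-by-species presence/absence matrix (hypothetical data).
comm <- matrix(
  c(1, 1, 0,
    0, 1, 0,
    1, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("sp_a", "sp_b", "sp_c"))
)

# Species richness: number of species present in each grid cell.
richness <- rowSums(comm)

# Range size: number of grid cells each species occupies.
range_size <- colSums(comm)

# Weighted endemism (assumed definition): for each cell, sum 1 / range size
# over the species present in that cell.
weighted_endemism <- as.vector(comm %*% (1 / range_size))
names(weighted_endemism) <- rownames(comm)
```

In this toy example the cell `g3` holds the only occurrence of `sp_c`, so it gets the highest weighted endemism even though its richness equals that of `g1`.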

The audience is biologists, biogeographers, conservationists and students with an interest in evolutionary community ecology, biogeography and spatial conservation. Another area in which this package will find application is microbiome analysis.

phyloregion adds the following novelties compared to available packages:

1) the ability to use sparse matrices and large-scale phylogenies for analysis of biogeographical regionalization and spatial conservation, so that the normal operations on a typical base R matrix can be done on the sparse matrix, which is more memory efficient and/or faster;
2) novel functions for speedy conversion of raw data to a sparse community matrix, as well as user-friendly analysis of biogeographical regionalization in completely reproducible R workflows;
3) although the functionality of the package has been developed with biogeographical regionalization in mind, it can accommodate analysis of spatial conservation at large scales, such as mapping biodiversity hotspots of species richness, endemism, or threat.
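As a rough illustration of points 1 and 2, the sketch below builds a sparse community matrix from a long-format occurrence table using the Matrix package (which phyloregion imports). It does not use phyloregion's own conversion functions; the data frame `occ` and its column names `grids` and `species` are made up for this example.

```r
library(Matrix)

# Hypothetical long-format occurrence records: one row per observation.
occ <- data.frame(
  grids   = c("g1", "g1", "g2", "g3", "g3", "g3"),
  species = c("sp_a", "sp_b", "sp_b", "sp_a", "sp_c", "sp_c"),
  stringsAsFactors = FALSE
)

# Drop repeated records so that each (cell, species) pair appears once.
occ <- unique(occ)

# Map grid cells and species to integer row / column indices.
sites <- sort(unique(occ$grids))
taxa  <- sort(unique(occ$species))
i <- match(occ$grids, sites)
j <- match(occ$species, taxa)

# Sites-by-species presence/absence matrix; only the non-zero entries are stored.
comm <- sparseMatrix(
  i = i, j = j, x = rep(1, length(i)),
  dims = c(length(sites), length(taxa)),
  dimnames = list(sites, taxa)
)

# Ordinary matrix operations still work on the sparse object.
n_stored   <- nnzero(comm)   # number of stored presences
range_size <- colSums(comm)  # number of cells occupied by each species
```

For a matrix this small the sparse representation saves nothing; the benefit is expected only when most cells of a large sites-by-species matrix are empty.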

For example, phylogenetic beta analysis using the packages currently available on CRAN is roughly 1000 times slower than with phyloregion and allocates roughly 1000 times more memory, even for a reasonably small data set. This is probably why this type of analysis is not performed for microbiome data: it quickly runs out of memory with current software (and there may not be good phylogenies / supertrees around).

Additionally, the packages PDcalc and GMD, from which we borrowed ideas/functions, are archived on CRAN.

Furthermore, alterations to popular packages like vegan (139 reverse dependencies) and picante (17) would be very complicated because of the high number of reverse dependencies and the danger of breaking them.
Maybe reverse dependencies (fewer dependencies and more robust, tiny packages) should also be mentioned in the best-in-category list.

There is a PDF copy of a vignette with benchmarks in the inst folder. The vignette can be generated locally with pkgdown::build_site(), but it is not built with the package because it takes too long to compute and would introduce many additional dependencies. The vignette also tests that the results are equivalent, so it can be seen as an addition to the unit tests.

maelle commented 4 years ago

Thanks @KlausVigo for your submission! However, we think it is out-of-scope, because it's foremost a methodological package despite the functions doing the "speedy conversion" of raw data.

We hope you'll find another venue for your work! (Maybe Methods in Ecology and Evolution?)