Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type `@ropensci-review-bot help` for help. :rocket:
The following problem was found in your submission template:
:wave:
git hash: 43a9f0a2
Package License: MIT + file LICENSE
rOpenSci Statistical Standards (`srr` package)

This package is in the following category:
:heavy_check_mark: All applicable standards [v0.1.0] have been documented in this package
Click here to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report()
function from within a local clone of the repository.
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has:

- code in R (100% in 11 files)
- 1 author
- 3 vignettes
- 8 internal data files
- 11 imported packages
- 4 exported functions (median 41 lines of code)
- 30 non-exported functions in R (median 7 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                  | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R                  |    11|       59.3|           |
|files_vignettes          |     2|       83.0|           |
|files_tests              |     8|       85.7|           |
|loc_R                    |   650|       52.9|           |
|loc_vignettes            |   162|       64.9|           |
|loc_tests                |   939|       83.5|           |
|num_vignettes            |     3|       93.1|           |
|data_size_total          | 66301|       81.2|           |
|data_size_median         |   509|       59.9|           |
|n_fns_r                  |    34|       35.6|           |
|n_fns_r_exported         |     4|       15.6|           |
|n_fns_r_not_exported     |    30|       42.6|           |
|n_fns_per_file_r         |     3|       42.7|           |
|num_params_per_fn        |     4|       67.6|           |
|loc_per_fn_r             |    10|       40.0|           |
|loc_per_fn_r_exp         |    41|       75.3|           |
|loc_per_fn_r_not_exp     |     7|       29.9|           |
|rel_whitespace_R         |    13|       43.9|           |
|rel_whitespace_vignettes |     0|        0.0|TRUE       |
|rel_whitespace_tests     |     7|       84.7|           |
|doclines_per_fn_exp      |    55|       68.2|           |
|doclines_per_fn_not_exp  |     0|        0.0|TRUE       |
|fn_call_network_size     |     9|       22.0|           |

---
Interactive network visualisation of calls between objects in package can be viewed by clicking here
`goodpractice` and other checks

#### 3a. Continuous Integration Badges

[![github](https://github.com/joelnitta/canaper/workflows/R-CMD-check/badge.svg)](https://github.com/joelnitta/canaper/actions)

**GitHub Workflow Results**

|name            |conclusion |sha    |date       |
|:---------------|:----------|:------|:----------|
|pkgdown         |success    |43a9f0 |2021-10-26 |
|R-CMD-check     |success    |43a9f0 |2021-10-26 |
|Render codemeta |failure    |43a9f0 |2021-10-26 |
|test-coverage   |success    |43a9f0 |2021-10-26 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

rcmdcheck found no errors, warnings, or notes

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 100

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
cpr_rand_test | 29
calc_biodiv_random | 16

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 353 potential issues:

message | number of times
--- | ---
Lines should not be more than 80 characters. | 353
|package  |version   |
|:--------|:---------|
|pkgstats |0.0.2.16  |
|pkgcheck |0.0.2.86  |
|srr      |0.0.1.120 |
This package is in top shape and may be passed on to a handling editor
@ropensci-review-bot seeking reviewers
I'm sorry @tdhock, I'm afraid I can't do that. That's something only editors are allowed to do.
@ropensci-review-bot assign @tdhock as editor
Assigned! @tdhock is now the editor
I forgot to mention, I plan to submit this as a paper to Methods in Ecology and Evolution after ROpenSci review (have also added this to the original description)
@ropensci-review-bot assign @KlausVigo as reviewer
@tdhock Only just realised the command there has the wrong syntax - it should be `add <@name> to reviewers`, like this example. Could you please try again? Nevertheless, good that you got this one wrong, because it made us realise we need the bot to tell you it didn't understand. Thanks!
@ropensci-review-bot add @KlausVigo to reviewers

By the way, it would be more user-friendly to have the add editor and add reviewer commands be consistent / analogous; I guess that is what you are doing in https://github.com/ropensci-org/buffy/pull/42?
sorry @tdhock, another bot glitch there - bot commands have to be kept as just that, with no additional comments. And yes, the point of the above-linked issue is to aid consistency between analogous commands, so that will hopefully improve soon. In the meantime, could you please try again, again, with just the command? With due apologies and gratitude! :bowing_man:
@ropensci-review-bot add @KlausVigo to reviewers
Can't assign reviewer because there is no editor assigned for this submission yet
@tdhock sorry for all the hiccups! The initial issue body had been edited after your username was added so your username was no longer in the string <!--editor--> @tdhock<!--end-editor-->
I edited it again so you can try again. Thanks for your patience!
@ropensci-review-bot add @KlausVigo to reviewers
@KlausVigo added to the reviewers list. Review due date is 2021-11-24. Thanks @KlausVigo for accepting to review! Please refer to our reviewer guide.
@KlausVigo: If you haven't done so, please fill this form for us to update our reviewers records.
@ropensci-review-bot add @luismurao to reviewers
@luismurao added to the reviewers list. Review due date is 2021-12-21. Thanks @luismurao for accepting to review! Please refer to our reviewer guide.
@luismurao: If you haven't done so, please fill this form for us to update our reviewers records.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation, including a DESCRIPTION with `URL`, `BugReports`, and `Maintainer` (which may be autogenerated via `Authors@R`).

Estimated hours spent reviewing: ~9
[ ] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
The `canaper` package provides functions to infer the evolutionary processes underlying neoendemism and paleoendemism. The software is an R implementation of the Categorical Analysis of Neo- and Paleo-endemism (CANAPE), which was previously only available in the Biodiverse software.
I enjoyed reviewing the package, but I found some things that might improve the package.
Parallelization currently has to be set up by the user outside the package functions (e.g. `future::plan(future::multisession)`). I think it would be better to have parameters for parallelization in the functions that allow running the analysis in parallel (so the calls are done inside them).

Parallelization calls as in the current version of the package:
library(canaper)   # This package :)
library(future)    # For parallel computing
library(tictoc)    # For timing things
library(patchwork) # For composing multi-part plots
library(tidyverse) # For data wrangling and plotting

data(acacia)

ncores <- 2 # Number of parallel workers (not defined in the original snippet)
plan(multisession, workers = ncores)

set.seed(071421)

acacia_rand_res <- cpr_rand_test(
  acacia$comm, acacia$phy,
  null_model = "curveball",
  n_reps = 100, n_iterations = 100000,
  tbl_out = TRUE
)

plan(sequential)
I think this could be more appropriate and easier to use:
acacia_rand_res <- cpr_rand_test(
  acacia$comm, acacia$phy,
  null_model = "curveball",
  n_reps = 100, n_iterations = 100000,
  tbl_out = TRUE,
  parallel = TRUE,      # Logical flag to run processes in parallel
  nworkers = 3,         # Number of threads
  random_seed = 071421  # Random seed
)
Instead of suggesting the `future` package, I would import it.

There is a potential conflict in package dependencies: `canaper` depends on R (>= 3.5.0), but it imports `phangorn` version 2.8.1, which depends on R (>= 4.1.0). Please change `Depends: R (>= 3.5.0)` to `Depends: R (>= 4.1.0)` in the DESCRIPTION file, or provide a way to install an earlier version of `phangorn` (version 2.5.5 should work).
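For readers on R < 4.1 who would take the second option, a hedged sketch of installing the earlier phangorn release (this assumes the remotes package; it is not part of the review itself):

# Sketch only: install the earlier phangorn release mentioned above
install.packages("remotes")
remotes::install_version("phangorn", version = "2.5.5")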
`devtools::check()` takes more than six minutes. Building the vignette outputs takes a lot of time (5m 5.2s). Consider using fewer replicates in examples (i.e. `n_reps` of `canaper::cpr_rand_test()`).
══ Warnings ════════════════════════════════════════════════════════════════════
── Warning (test-utils.R:143:3): Functions copied from phyloregion work ────────
rgeos: versions of GEOS runtime 3.10.1-CAPI-1.16.0
and GEOS at installation 3.9.1-CAPI-1.14.2 differ
Backtrace:
1. testthat::expect_equal(dense2sparse(biod_example$comm), phyloregion::dense2sparse(biod_example$comm)) test-utils.R:143:2
4. base::loadNamespace(x)
7. base::loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
8. base:::runHook(".onLoad", env, package.lib, package)
13. rgeos:::fun(libname, pkgname)
`devtools::test()` finished with no fails and one warning:
────────────────────────────────────────────────────────────────────
Warning (test-utils.R:143:3): Functions copied from phyloregion work
rgeos: versions of GEOS runtime 3.10.1-CAPI-1.16.0
and GEOS at installation 3.9.1-CAPI-1.14.2 differ
Backtrace:
1. testthat::expect_equal(dense2sparse(biod_example$comm), phyloregion::dense2sparse(biod_example$comm)) test-utils.R:143:2
4. base::loadNamespace(x)
7. base::loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
8. base:::runHook(".onLoad", env, package.lib, package)
13. rgeos:::fun(libname, pkgname)
────────────────────────────────────────────────────────────────────
══ Results ═════════════════════════════════════════════════════════
Duration: 71.8 s
[ FAIL 0 | WARN 1 | SKIP 0 | PASS 255 ]
`devtools::spell_check()`: change "signficance" to "significance" in line 132 of canape.Rmd (`signficance  canape.Rmd:132`).
When benchmarking, I found a strange behavior: when using more than four threads, the program takes more time to finish a randomization test. Here is the code I used (could you please check it? I might be missing something).
library(canaper)   # For cpr_rand_test() and the acacia data
library(tidyverse) # Includes purrr and ggplot2
library(tictoc)    # For timing
library(future)    # For parallel computing

data(acacia)
acom <- acacia$comm
acom[acom > 0] <- 1 # Convert abundances to presence/absence

n_reps_vec <- c(100, 500, 1000, 1500, 2000)
nworkers <- 1:7

rand_times <- seq_along(n_reps_vec) %>% purrr::map_dfr(function(nrep) {
  rand_t <- nworkers %>% purrr::map_dfr(function(nwork) {
    if (nwork > 1) {
      plan(multisession, workers = nwork)
    }
    set.seed(071421)
    tictoc::tic()
    # Run randomization test
    acacia_rand_res <- canaper::cpr_rand_test(
      acom, acacia$phy,
      null_model = "curveball",
      n_reps = n_reps_vec[nrep],
      n_iterations = 100000,
      tbl_out = TRUE
    )
    aa <- tictoc::toc()
    r1 <- data.frame(
      n_reps = n_reps_vec[nrep],
      nworkers = nwork,
      time_secs = aa$toc - aa$tic
    )
    plan(sequential)
    return(r1)
  })
  return(rand_t)
})

ggplot(rand_times, aes(x = nworkers, y = time_secs, color = as.factor(n_reps))) +
  geom_line() +
  theme_classic()
I was wondering whether `canaper` already has all the methods of the CANAPE version of Biodiverse. Could you write something about it in the README?

@joelnitta please do not hesitate to contact me if you have any questions. Thanks for developing `canaper`.
@luismurao thanks so much for your review! I will try to address your comments soon.
@KlausVigo it would be great to have your comments as well so I can make changes incorporating both sets of reviews.
@KlausVigo, do you still intend to review this package?
If not, @tdhock can you please assign a new reviewer, as it is now multiple months past the due date.
Thanks!
Hi @joelnitta! I will finish it today.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation, including a DESCRIPTION with `URL`, `BugReports`, and `Maintainer` (which may be autogenerated via `Authors@R`).

Estimated hours spent reviewing: 12
The canaper package provides functions to conduct categorical analysis of neo- and paleo-endemism (CANAPE). It is a reimplementation of the Biodiverse software, but there are some functions with a similar scope in the picante package (`ses.pd`) and in the PhyloMeasures package for Faith's phylogenetic diversity.
The canaper package builds on functions from the vegan package to resample (dense) community matrices and borrows biodiversity measures (Faith's phylogenetic diversity and phylogenetic endemism) from the phyloregion package. The picante package also has some functions for swapping (community) matrices similar to vegan. A comparison between the packages would be useful.
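To make the comparison concrete, here is a minimal standalone sketch (mine, not code from the review or from canaper) of the vegan null-model engine that canaper's randomizations build on; it assumes the acacia data shipped with canaper:

# Define a null model for a (binarized) community matrix and draw
# randomized matrices from it
library(canaper) # for the acacia example data
library(vegan)   # nullmodel() and commsim methods such as "curveball"

data(acacia)
comm <- as.matrix(acacia$comm)
comm[comm > 0] <- 1 # curveball is a binary (presence/absence) algorithm

nm <- nullmodel(comm, "curveball")
rand <- simulate(nm, nsim = 2, thin = 1) # array: sites x species x nsim
dim(rand)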
The functions from the phyloregion package that are used in canaper should be acknowledged properly. There is a reference in the README and in the source stating that the biodiversity measures are from the phyloregion package. However, tools like depsy, which trace back contributions, will likely check only the DESCRIPTION file for imported packages, contributions, etc. If functions are copied, the CRAN Repository Policy suggests acknowledging contributors (`ctb`) in the Author field. I opened an issue to improve the development guide (https://github.com/ropensci/dev_guide/issues/388) with some additional information. Copying functions can additionally cause copyright problems. An alternative is importing the functions from the package directly.
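As a purely illustrative sketch of the `ctb` suggestion, with placeholder names rather than canaper's actual author list, an `Authors@R` field could look like this:

# Placeholder names only; not a statement about canaper's DESCRIPTION
c(
  person("Package", "Author", role = c("aut", "cre"),
         email = "author@example.com"),
  person("Upstream", "Contributor", role = "ctb",
         comment = "author of functions adapted from phyloregion")
)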
In contrast to the other reviewer, I really like the use of the `future` package. Parallelization depends on the hardware, the operating system, and other things such as whether the functions run in a terminal or a GUI. To avoid crashes, this needs lots of internal checks and guessing, and often it will not lead to the best result. I prefer to give users the power to adapt the parallelization to their specific needs.
In my opinion, the main bottleneck for this kind of analysis is memory consumption, and only to a minor extent computing time. Dense community matrices are now frequently getting too large to hold in memory (RAM). Luckily, community matrices are usually extremely sparse, and this allows efficient coding. For example, the acacia data set, which comes with canaper, has less than 2% non-zero cells, and a sparse representation takes only 1/20 of the memory of the dense representation.
> library(canaper)
> library(Matrix)
> data("acacia")
> dim(acacia$comm)
[1] 3037 506
> comm <- as.matrix(acacia$comm)
> sum(comm > 0) / prod(dim(comm))
[1] 0.01898522
> object.size(comm)
12597112 bytes
> object.size(Matrix(comm, sparse=TRUE))
656576 bytes
The functions that compute the biodiversity measures make use of the sparsity, but the functions that sample the matrices do not. We developed phyloregion using sparse matrices, which resulted in far lower memory consumption and, as a benefit, turned out to be faster in many cases (https://phyloregion.com/articles/Benchmark.html). With larger data sets we get into a trade-off: we want to use more cores, but this means several copies of the community matrices in memory. Providing a swapping algorithm which can handle sparse matrices would be a huge improvement for the community.
I used the following code for profiling:
library(canaper)
data(acacia)

Rprof(tmp <- tempfile(), memory.profiling = TRUE)
acacia_rand_res <- cpr_rand_test(
  acacia$comm, acacia$phy,
  null_model = "curveball",
  n_reps = 100,
  n_iterations = 100000)
Rprof(NULL) # Turn profiling off
summaryRprof(tmp, memory = "both")
unlink(tmp)
I just highlight a few lines of the output I find interesting:
...
"calc_biodiv_random" 111.54 92.35 29477.9 0.02 0.02
"cpr_rand_comm" 99.56 82.43 22420.0 1.30 1.08
"simulate.nullmodel" 83.18 68.87 11655.6 1.24 1.03
...
"phylo_community" 5.72 4.74 2994.7 1.60 1.32
"phylo_endemism" 5.14 4.26 2876.1 0.02 0.02
"assertthat::assert_that" 4.66 3.86 1867.0 0.10 0.08
...
"PD" 2.56 2.12 1656.1 0.00 0.00
...
Not unexpectedly, the largest amount of time is spent re-sampling the community matrices (`simulate.nullmodel`).
The canaper package uses defensive programming extensively (calling `assertthat::assert_that`), which is generally a very good thing. However, there are some exceptions: the functions `calc_biodiv_random` and `cpr_rand_comm` are only called internally, and are called thousands of times inside the loop. It should be safe to remove all (or most) of the calls to `assertthat::assert_that` from these functions, as the parameters have been checked already and are safe to use! In fact, the algorithm spends a similar amount of time in `assertthat::assert_that` as in computing the biodiversity measures (`phylo_endemism`, `PD`).
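A schematic sketch of the pattern being suggested, using hypothetical function names rather than canaper's internals: validate once in the exported function and keep the hot loop free of assertions.

# Hypothetical functions, for illustration only
library(assertthat)

run_randomizations <- function(comm, n_reps) {
  # Checks run exactly once, up front
  assert_that(is.matrix(comm) || is.data.frame(comm))
  assert_that(is.count(n_reps))
  comm <- as.matrix(comm)
  # The hot loop calls an internal helper that skips re-checking its inputs
  lapply(seq_len(n_reps), function(i) calc_one_rep_unchecked(comm))
}

calc_one_rep_unchecked <- function(comm) {
  # No assert_that() calls here: this runs thousands of times per analysis
  colSums(comm > 0)
}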
When computing several biodiversity metrics at once, a lot of intermediate results are shared and could be reused. This is less important, though, and the current approach makes it easy to extend the `cpr_rand_test` function with additional biodiversity measures.
There are some instances where more than 2 cores are used, e.g. the vignette canape.Rmd, line 83:
plan(multisession, workers = 3)
This could fail on CRAN when submitting: there, the use of at most 2 cores is allowed, and I speak from experience, I got Ripleyed for this once! Again, the CRAN Repository Policy is a useful resource and gives some background information: "Checking the package should take as little CPU time as possible, as the CRAN check farm is a very limited resource and there are thousands of packages. Long-running tests and vignette code can be made optional for checking, but do ensure that the checks that are left do exercise all the features of the package. If running a package uses multiple threads/cores it must never use more than two simultaneously: the check farm is a shared resource and will typically be running many checks simultaneously. Examples should run for no more than a few seconds each: they are intended to exemplify to the would-be user how to use the functions in the package."
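One hedged way to respect that limit in vignette or example code (a sketch, not the current vignette code) is to cap the number of workers explicitly:

# Cap workers at 2 so vignette and example code stays within CRAN's limit
library(future)
n_workers <- min(2, availableCores())
plan(multisession, workers = n_workers)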
@KlausVigo thank you very much for the extensive and helpful review!
I just wanted to address one thing first: the reason I copied functions from phyloregion instead of importing them is because of a dependency of phyloregion (betapart), which has a package in `Suggests` (snow) that was resetting the random number generator, so `cpr_rand_test()` was failing to return identical results when using parallel computing, even when `set.seed()` was used. There is an issue describing the problem in canaper, and a related discussion in the future package.

In the end, I opted to copy the phyloregion functions under the AGPL-3 license, since that fixed the issue and resulted in fewer dependencies. As @HenrikBengtsson mentioned, though, it seems another way to fix this (and import phyloregion normally) would be to use `future.packages = "snow"` with `future.apply::future_lapply()`. Would you prefer that?
(By the way, it is possible that this behavior of snow may cause reproducibility issues with phyloregion as well. I suggest you look into it if you haven't already).
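For concreteness, a hedged sketch of that alternative (the `mean(rnorm(10))` computation is a stand-in, not a canaper internal): pre-load snow on each worker via `future.packages` so its startup code cannot disturb the RNG mid-analysis, and use `future.seed` for parallel-safe, reproducible random numbers.

library(future.apply)
future::plan(future::multisession, workers = 2)

res <- future_lapply(
  seq_len(100),
  function(i) mean(rnorm(10)),
  future.seed = TRUE,       # parallel-safe, reproducible RNG streams
  future.packages = "snow"  # load snow up front on each worker
)

future::plan(future::sequential)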
hi @KlausVigo thanks for your detailed review. can you please tell us approximately how many hours you spent on reviewing? (you left that field blank in your review comment)
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/475#issuecomment-1002754119 time 9
Logged review for luismurao (hours: 9)
Hi @joelnitta,
I just made a commit to phyloregion, cb819fb. This might solve the problem; at least `sessionInfo()` suggests that `snow` is not loaded via a namespace any more.
Regards, Klaus
@KlausVigo thanks for making that change: I switched back to importing phyloregion, and it passes the tests for consistent results using the same seed in parallel.
I just have one request: would you consider bumping the phyloregion version so that I can require the version with this fix in `Imports`? That will prevent this bug from popping up for others.
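Once such a release exists, the minimum version could be recorded with, for example (the version number below is a placeholder for whichever release contains the fix):

# Records phyloregion as an Import with a minimum version in DESCRIPTION
usethis::use_package("phyloregion", min_version = "1.0.8")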
@joelnitta nice that this works and it seems to simplify things.
I wasn't aware that `importFrom(betapart, beta.pair)` and `betapart::beta.pair` made such a large difference with the NAMESPACE. I naively assumed they were about the same, and the documentation about these issues does not tell you otherwise. @darunabas is the maintainer of phyloregion and has to submit it to CRAN. We probably need a week or so to clean up all changes and get the error messages sorted out, as we haven't updated the package for a while.
@KlausVigo no problem: it is a very subtle error, and one that I would never have resolved without the help of @HenrikBengtsson. As he has mentioned, this is likely a much wider problem across many packages on CRAN.

If you and @darunabas are able to release a new version of phyloregion on CRAN in a reasonable amount of time, that would be great.
@KlausVigo - if you are updating phyloregion, then it would be good to also address https://github.com/darunabas/phyloregion/issues/3
I am wondering if we can move forward on this submission? @KlausVigo @luismurao have the comments/suggestions in your reviews been addressed?
@tdhock Sorry, I am the cause for the delay! Since the reviews came back I've gotten busy with other projects, and haven't been able to get back to this yet. I anticipate being occupied by other projects for another week or two. Please remind me again if you don't hear something soon after that.
Hi @joelnitta and @tdhock ! I'm currently the rotating editor in chief for rOpenSci submissions. I wanted to check in and see if you expect to have time to revisit the review feedback soon? Otherwise, we may want to put this review on hold status
Thanks for checking in @emilyriederer. Sorry for falling behind on this. I am still occupied with other things at the moment, but I should be able to respond to the reviews by the end of August if not sooner.
I am sorry for taking so long to respond to the reviewers. Thanks again to both reviewers for their helpful comments. Below I respond to each comment as appropriate.
[x] Concern about parallelization. I think it would be better to have parameters for parallelization in functions that allow running analysis in parallel (so calls are done inside them).
Response: The second reviewer, @KlausVigo disagreed with this point and was in favor of leaving the current approach to parallelization, so I am making no changes.
[x] Instead of suggesting the `future` package I would import it.
Response: I disagree with this. Parallelization is not required; it is an option to speed up analyses if desired. So I think it makes sense as `Suggests` instead of `Imports`. No changes made.
[x] Community guidelines. Response: These are located at `/.github/CONTRIBUTING.md`. To make it easier to find though, I have added a link to this document from the README. I also added a link to the rOpenSci code of conduct using the suggested wording in the developer guide (d8aaf8de5193166d52ab890d8267090ea2f8ef6a).

[x] Potential conflict in package dependencies (R >= 3.5.0 vs. phangorn requiring R >= 4.1.0). Response: Changed to `Depends: R (>= 4.1.0)` (0f334965fb5a3fe5034eb1bc682b5051548eddb1).

[x] `devtools::check()` takes more than six minutes. Building of vignette outputs takes a lot of time (5m 5.2s). Consider using fewer replicates in examples (i.e. `n_reps` of `canaper::cpr_rand_test`).
Response: I had to include a non-trivial number of replicates to produce reasonable results (e.g., in comparison with Mishler et al. 2014 for the CANAPE example vignette) and to demonstrate performance gains from parallelization. But I was able to decrease the number of replicates in the vignettes (f3af4671d679020aeaa97b1a038b3b5a96633f42). The vignettes now take 3m 40s on my computer to compile.
[x] I got one error after upgrading from GEOS 3.9 to 3.10 in Linux. The issue is related to the phyloregion dependency.

Response: Yes: phyloregion imports `rgeos`, which is going to be retired soon and needs fixing. Correct installation of external dependencies of phyloregion is outside the scope of `canaper`. No change made (for now).
[x] `devtools::test()` finished with no fails and one warning.

Response: On my computer, `devtools::test()` now finishes with no fails or warnings (at 93543d399fd34dd6546e56368986a8afa133f190).
[x] `devtools::spell_check()`: change "signficance" to "significance" in line 132.
Response: Fixed (b04e2e0a8397568bcc615e4aaeb5a0d9b02ad85e)
[x] I was wondering if canaper already has all methods of the CANAPE version of Biodiverse. Could you write something about it in the README?

Response: I have added a section in the README comparing `canaper` with other software (d8aaf8de5193166d52ab890d8267090ea2f8ef6a).
[x] Although the vignettes are very useful and well documented, it would be nice to have methods for plotting spatial data results.
Response: Thanks for the suggestion. I have already included color palettes for plotting results that should be accessible regardless of color vision type. IMHO there are too many customizations that users would want to make to try and implement a one-size-fits-all plotting method. Rather, the results are structured in a way that they should be easily plotted with e.g. ggplot2, and I demonstrate this in the vignettes.
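As a hedged illustration of that point (my sketch, not a vignette excerpt), the classified output can be summarised directly with ggplot2; this assumes `cpr_classify_endem()` adds an `endem_type` column to the `cpr_rand_test()` output:

# Run a small CANAPE analysis and plot counts of endemism categories
library(canaper)
library(dplyr)
library(ggplot2)

data(acacia)
set.seed(12345)

rand_res <- cpr_rand_test(
  acacia$comm, acacia$phy,
  null_model = "curveball",
  n_reps = 20, n_iterations = 1000,
  tbl_out = TRUE
)
canape_res <- cpr_classify_endem(rand_res)

canape_res %>%
  count(endem_type) %>%
  ggplot(aes(x = endem_type, y = n)) +
  geom_col() +
  labs(x = "CANAPE category", y = "Number of sites") +
  theme_minimal()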
[x] There are some functions in the picante package (`ses.pd`) and in the PhyloMeasures package for Faith's phylogenetic diversity with a similar scope.

Response: There are many R packages that calculate community diversity metrics, and many of them overlap in common metrics like Faith's PD. In the case of `canaper`, I need to calculate Faith's PD as part of the CANAPE calculation. I use `phyloregion::PD()` to do this as it is the fastest AFAIK. No changes needed.
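A hedged sketch of the kind of timing comparison behind that claim (not a benchmark reported in this review); it assumes `phyloregion::PD()` takes a sparse matrix made with `dense2sparse()` and that `picante::pd()` runs on the dense matrix:

library(canaper)        # acacia example data
library(phyloregion)    # PD(), dense2sparse()
library(picante)        # pd()
library(microbenchmark)

data(acacia)
comm_dense <- as.matrix(acacia$comm)
comm_sparse <- dense2sparse(comm_dense)

microbenchmark(
  phyloregion = PD(comm_sparse, acacia$phy),
  picante     = pd(comm_dense, acacia$phy),
  times = 5
)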
[x] The canaper package builds on functions from the vegan package to resample (dense) community matrices and borrows biodiversity measures (Faith's phylogenetic diversity and phylogenetic endemism) from the phyloregion package. The picante package also has some functions for swapping (community) matrices similar to vegan. A comparison between the packages would be useful.
Response: I have added a section to the README comparing `canaper` with other software (d8aaf8de5193166d52ab890d8267090ea2f8ef6a).
[x] The functions from the phyloregion package used in the package should be acknowledged properly... An alternative is importing the functions from the package directly.
Response: Please see the response above. Functions from the phyloregion package are now imported, not copied (7e8b24950912bfa2a5e10cf4dd9cfa8910d74e76). Please note that phyloregion is still at v1.0.7 as of writing (the same version as before @KlausVigo implemented the change that fixed the loading of `snow`), so I can't yet specify the version of phyloregion that would avoid canaper issue #2.
[x] In contrast to the other reviewer I really like the use of the `future` package.
Response: I agree, and have left parallelization as it is.
[x] Providing a swapping algorithm which can handle sparse matrices would be a huge improvement for the community.
Response: I agree implementing sparse matrix encoding for generating random matrices would be a huge benefit and will consider it as a feature to be added possibly in the future (any help would be appreciated!).
[x] The functions `calc_biodiv_random` and `cpr_rand_comm` are only called internally and are called thousands of times inside the loop. It should be safe to remove all (most) of the calls to `assertthat::assert_that` from these functions as the parameters have been checked already and are safe to use!

Response: Actually, `cpr_rand_comm()` is exported, but point well taken. I have deleted the redundant checks (bccd0278dd3de5853680d9979ca26c1aa38a7fb3).
Both reviewers left "Performance: Any performance claims of the software been confirmed" unchecked. I am not sure why this is the case, as I include tests of the increase in performance using parallel computing and demonstrate this feature in a vignette.
I have added @KlausVigo as a reviewer, but not @luismurao, according to their responses to "Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer" (6e9b3d55202e2294a6436280ca6b01db13d92974).
The longest running test is "Parallelization decreases calculation time". To make this test robust, I have to include a relatively high number of replicates; otherwise, by random chance, running in parallel sometimes won't be faster than running sequentially.
I have decided to follow 6.1.5.4 Extended tests of rOpenSci Statistical Software Peer Review and treat this as an extended test that can be optionally skipped (it is now skipped by default).
I have documented this in a comment as follows:
# This test is skipped by default because of its long run time.
# To include the test locally, first run
# withr::local_envvar(CANAPER_EXTENDED_TESTS = "true") # nolint
# To include the test in CI (github actions), include the phrase
# 'run-extended' in the commit
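For reference, a sketch of what that skip pattern can look like in testthat (the test body is omitted; only the environment variable name comes from the comment above):

test_that("Parallelization decreases calculation time", {
  skip_if_not(
    identical(Sys.getenv("CANAPER_EXTENDED_TESTS"), "true"),
    "Extended tests not enabled (set CANAPER_EXTENDED_TESTS=true to run)"
  )
  # ... long-running timing comparison ...
})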
`devtools::test()` on my computer now finishes in 36 s when skipping the extended test, and 181 s when including it.
@tdhock, @emilyriederer sorry again for the long delay, but this should be ready to move forward. Please let me know if there is anything you need from my end.
Also tagging @KlausVigo @luismurao . I know this review was a bit delayed and your availability may have changed. At your convenience, if you could please review @joelnitta 's responses and share whether these address your feedback (template available here) , that would be fantastic.
@emilyriederer If the reviewers have no further comments, can we proceed with this at your discretion as an editor?
Hi @joelnitta - let me follow up with @tdhock since he's the managing editor on this one
hi! Since the reviewers were tagged two weeks ago and gave no feedback, I think it is OK to move ahead with this submission (the reviewer response is very complete/convincing).
@ropensci-review-bot approve canaper
Approved! Thanks @joelnitta for submitting and @KlausVigo, @luismurao for your reviews! :grin:
To-dos:

- If the invitation to transfer the repo to the ropensci organization expires, write a comment `@ropensci-review-bot invite me to ropensci/<package-name>`, which will re-send an invitation.
- After transfer, write a comment `@ropensci-review-bot finalize transfer of <package-name>`, where `<package-name>` is the repo/package name. This will give you admin access back.
- If you already had a `pkgdown` website and are ok relying only on rOpenSci central docs building and branding, replace the current `pkgdown` website with a redirecting page to `https://docs.ropensci.org/package_name`, and include that docs link in the DESCRIPTION `URL` field alongside the link to the GitHub repository, e.g.: `URL: https://docs.ropensci.org/foobar, https://github.com/ropensci/foobar`.
- Run `codemetar::write_codemeta()` in the root of your package.
- Your package can now be installed with `install.packages("<package-name>", repos = "https://ropensci.r-universe.dev")` thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them `"rev"`-type contributors in the `Authors@R` field (with their consent).
Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.
We maintain an online book with our best practices and tips; this chapter starts the third section, which gives guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved.
Last but not least, you can volunteer as a reviewer via filling a short form.
Thanks @tdhock!
@ropensci-review-bot finalize transfer of canaper
Date accepted: 2022-09-14
Submitting Author: @joelnitta
Other Package Authors: Shawn Laffan (@shawnlaffan)
Repository: https://github.com/joelnitta/canaper
Version submitted: 0.0.1
Submission type: Stats
Badge grade: silver
Editor: @tdhock
Reviewers: @KlausVigo, @luismurao
Due date for @KlausVigo: 2021-11-24
Due date for @luismurao: 2021-12-21
Archive: TBD
Version accepted: TBD
Pre-submission Inquiry
General Information
Who is the target audience and what are scientific applications of this package?
Paste your responses to our General Standard G1.1 here, describing whether your software is:
The first implementation within R of an algorithm which has previously been implemented in other languages or contexts
Please include hyperlinked references to all other relevant software: https://github.com/shawnlaffan/biodiverse
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
Note also there is a package website available: https://joelnitta.github.io/canaper/
Badging
What grade of badge are you aiming for? (bronze, silver, gold)
If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold)
Technical checks
Confirm each of the following by checking the box.
[x] I/we have read the guide for authors and rOpenSci packaging guide.
[x] I/we have read the Statistical Software Peer Review Guide for Authors.
[ ] I/we have run `autotest` checks on the package, and ensured no tests fail. (`autotest` hangs (doesn't produce any output after waiting ca. 20 minutes) on most functions. I was only able to use it on `cpr_classify_endem()`.)

[x] The `srr_stats_pre_submit()` function confirms this package may be submitted.

This package:
Publication options
[ ] Do you intend for this package to go on Bioconductor?
Code of conduct