Closed noamross closed 5 years ago
Thank you for submitting, @noamross! I will be your editor, so please hold on while I perform editor checks.
Thanks for your submission @noamross!
As the output from goodpractice
is perfect I'll start finding reviewers for this.
♥ Ho-ho! Slick package! Keep up the stunning work!
Reviewers: @pachamaltese @xavi-rp Due date: TBD
Hello @noamross! @pachamaltese and @xavi-rp have been assigned as reviewers. They will have until May 6th to review the package.
Hi @noamross, @melvidoni and @xavi-rp I started to install the package. One immediate thing to fix is to add the ROpenSci badge so please paste this code next to CI badge and regenerate the README:
[![](https://badges.ropensci.org/292_status.svg)](https://github.com/ropensci/onboarding/issues/292)
Hi @pachamaltese, that should be taken care of now.
The package includes all the following forms of documentation:
[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s) demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all exported functions in R help
[x] Examples for all exported functions in R Help that run successfully locally
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).
[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed. Checked yes but please check the vignettes examples!
[ ] Performance: Any performance claims of the software been confirmed.
[ ] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[ ] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines _I
Estimated hours spent reviewing: 4 hours
I enjoyed reviewing the citesdb
package. I found it easy to use.
Below I've highlighted some observations that might make this package even better.
Package Metadata
print(citation("citesdb"), bibtex = TRUE)
works ok and returns a proper BibTeX entry with no warnings.Package Documentation
Vignettes
devtools::install_github()
, and also install.packages()
if you plan to submit to CRAN later.collect()
on all of cites_shipments()
will load a >3 GB data frame into memory!" is something that I would prefer to re-state the example than to include that kind of warning. For example, and given that the package can be used with dplyr, I would obtain an additional benefit from the SQL connections and use dplyr functions to return just what I want instead of loading 3GB in RAM and then filtering (i.e. see https://db.rstudio.com/dplyr/)In the same line with performant examples, I'd run something like this
library(citesdb)
library(dplyr)
con <- cites_db()
DBI::dbListTables(con)
cites_shipments_db <- tbl(con, "cites_shipments")
cites_shipments_db
example_query <- cites_shipments_db %>%
select(Year, Appendix, Taxon, Exporter) %>%
filter(Appendix == "II" & Exporter == "DE")
example_query
example_query_tibble <- as_tibble(example_query)
example_query_tibble
Function Documentation
External Dependencies
Section 1.11 of rOpenSci's packaging guides reinforces the good practice of trying to limit external dependencies where possible. This is a good practice not only for the health and longevity of your package but also for marketing purposes; sometimes people working on servers or behind proxies will have limited ability to install packages so fewer dependencies may enable more users. I think this aspect is ok and I wouldn't touch the dependencies.
Readability / Modularity
glue
package in use here. It can make the code more readable. It could be better than using paste0
and sprintf
in some cases.Technical Implementation Details
check_status
should be renamed since it's too generic and cites_check_status
or similar should be used.Testing
devtools::check()
that I reported at https://github.com/ecohealthalliance/citesdb/issues/1devtools::test()
One test fails. Please check my report here: https://github.com/ecohealthalliance/citesdb/issues/2 goodpractice::gp()
. Please check the results here: https://github.com/ecohealthalliance/citesdb/issues/4. Test coverage should be expanded perhaps with some trivial sanity checks.Error Messages
Like I stated above, there are some fixes that are needed and may improve the package a lot.
The code is easy to read, without strange lines, and my overall impression of it is positive.
65% of test coverage seems to be very low. By adding some elementary checks, this number can be increased a lot.
Unlike my 1st submission to rOpenSci, most of the functions in citedb have adecuate names and can be easily distinguished as those names are not generic (i.e. "filter" or "download").
Note to reviewers: The MonetDBLite package has been (hopefully temporarily) archived on CRAN, so I have switched the citesdb to using a stable GitHub version of MonetDBLite via the Remotes:
field in the DESCRIPTION file.
Hi @noamross and @melvidoni, Thanks for giving me the opportunity to review citesdb. I think it's a very nice and useful package. I've enjoyed (and learned!) a lot reviewing it. Please take my comments only as suggestions as I think it's already very well written.
The package includes all the following forms of documentation:
[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[ ] Vignette(s) demonstrating major functionality that runs successfully locally
devtools::check()
(see comments below). However, reading the Rmd file and running the examples in it manually, it can be seen that they work perfectly.[x] Function Documentation: for all exported functions in R help
[x] Examples for all exported functions in R Help that run successfully locally
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).
Estimated hours spent reviewing: 8 (a lot of time spent to try to understand the problem with the vignettes)
I find this package easy to install/use and very useful to explore and use the data from the CITES Trade DB, so congrats to the authors!
It's very useful the inclusion of the metadata functions. And also the gif!!!
The main problem that I found is in the vignette building while running devtools::check()
. See https://github.com/ecohealthalliance/citesdb/issues/1#issuecomment-489351239
In devtools::test()
, the lints test fails. Already reported by @pachamaltese here: https://github.com/ecohealthalliance/citesdb/issues/2
These are the results of checking the lints with lintr::lint_package()
(most of them are probably irrelevant; except maybe #13)
> library(lintr)
> as.data.frame(lint_package())
filename line_number column_number type
1 R/connect.R 114 13 style
2 R/connect.R 121 13 style
3 R/connect.R 128 13 style
4 R/connect.R 149 14 style
5 R/connection-pane.R 34 14 style
6 R/connection-pane.R 65 64 style
7 R/connection-pane.R 85 14 style
8 R/download.R 41 7 style
9 R/download.R 42 5 style
10 R/download.R 46 3 style
11 R/download.R 49 5 style
12 R/download.R 59 3 style
13 R/download.R 117 52 warning
14 R/onattach.R 24 13 style
15 R/onattach.R 25 5 style
16 R/onattach.R 42 7 style
17 R/onattach.R 43 5 style
18 R/onattach.R 70 22 style
message
1 Variable and function names should be all lowercase.
2 Variable and function names should be all lowercase.
3 Variable and function names should be all lowercase.
4 Variable and function names should be all lowercase.
5 Variable and function names should be all lowercase.
6 Variable and function names should be all lowercase.
7 Variable and function names should be all lowercase.
8 Variable and function names should be all lowercase.
9 Variable and function names should be all lowercase.
10 Variable and function names should be all lowercase.
11 Variable and function names should be all lowercase.
12 Variable and function names should be all lowercase.
13 Do not use absolute paths.
14 Variable and function names should be all lowercase.
15 Variable and function names should be all lowercase.
16 Variable and function names should be all lowercase.
17 Variable and function names should be all lowercase.
18 Variable and function names should be all lowercase.
line
1 as_tibble(dbReadTable(cites_db(), "cites_metadata"))
2 as_tibble(dbReadTable(cites_db(), "cites_codes"))
3 as_tibble(dbReadTable(cites_db(), "cites_parties"))
4 observer$connectionClosed("CITESDB", "citesdb")
5 observer$connectionOpened(
6 paste("SELECT * FROM", table, "LIMIT", rowLimit))
7 observer$connectionUpdated("CITESDB", "citesdb", "")
8 if (dbExistsTable(cites_db(), tblname)) {
9 dbRemoveTable(cites_db(), tblname)
10 dbCreateTable(cites_db(), tblname, fields = cites_field_types)
11 dbExecute(
12 dbWriteTable(cites_db(), "cites_status", make_status_table(version = ver),
13 paste0("https://api.github.com/repos/", repo, "/releases")
14 for (t in dbListTables(cites_db())) {
15 dbRemoveTable(cites_db(), t)
16 if (dbExistsTable(cites_db(), "cites_shipments") &&
17 dbExistsTable(cites_db(), "cites_status")) {
18 suppressMessages(dbWriteTable(cites_db(), tblnames[i],
linter
1 camel_case_linter
2 camel_case_linter
3 camel_case_linter
4 camel_case_linter
5 camel_case_linter
6 camel_case_linter
7 camel_case_linter
8 camel_case_linter
9 camel_case_linter
10 camel_case_linter
11 camel_case_linter
12 camel_case_linter
13 absolute_paths_linter
14 camel_case_linter
15 camel_case_linter
16 camel_case_linter
17 camel_case_linter
18 camel_case_linter
The spelling errors found after running spelling::spell_check_package()
and spelling::spell_check_files("README.Rmd")
are irrelevant.
I would like to know more easily where the DB is stored; for instance when calling cites_status()
If you do cites_db_delete()
and then again cites_db_download()
, the Size on Disk becomes 2.3 Gb instead of 1.1Gb. I don't know why this happens, but it's not nice...
cites_pane()
@examples
L29-30. It should be in the same line (or using {}).
I would prefer single examples for cites_metadata()
, cites_codes()
and cites_parties()
Use Title Case in the package title
Congrats again to the authors!!
Thank you for your careful reviews, @pachamaltese and @xavi-rp! We'll get to work on these.
Thanks to both @pachamaltese and @xavi-rp for their comments and issues that have helped us to make a more user-friendly package.
We have changed the installation method in the README
to use devtools.
We have added information to the cites_status
table indicating the location
of the database files on disk.
We now clear out and disconnect the database tables after updating, which should trigger disk cleanup and avoid doubling database size.
Thanks for the reviewers for sticking with us through a long and merry chase of a bug that was preventing package-building in https://github.com/ecohealthalliance/citesdb/issues/1, which had its origins in a missing token for use with the rcites package as well as a low-level database lock issue. We now cache this the rcites information to avoid having to make remote calls in vignette-building, and also have resolved the DB locking conflict.
We have modified our linter tests to avoid the false positives shown in https://github.com/ecohealthalliance/citesdb/issues/2.
We have renamed the internal function check_status()
to cites_check_status()
to be less generic.
We have made more elaborate help and examples for the cites_db()
,
cites_shipments()
, and metadata functions to illustrate their use and
distinguish between dplyr- and DBI-based workflows.
We've made some small changes to the vignette text and references.
We have opted not to use the glue package and instead stick with the base
R paste()
functions in the interest of limiting dependencies. We believe
this is a minor trade-off.
The low test coverage shown in https://github.com/ecohealthalliance/citesdb/issues/4
is due to tests skipped on CRAN. Setting the environment variable to
NOT_CRAN=true
shows that our test coverage is ~61%. This is lower than
typical, but as we note in the issue above, this is largely due to the
verbose code for interacting with the RStudio connection pane, which cannot be
tested except in an interactive session. Other code coverage is
85%.
We believe that all core functionality is tested, including important
edge-cases and conditions not reflected in the coverage statistic, such as
error handling in multiple sessions and changing up upstream data sources.
A note: we have learned from the maintainers of MonetDBLite that it will not be returning to CRAN (its current iteration fails on R-devel), but they are working on a successor embedded database package that will replace it and go to CRAN later this year. So, for now, we will host this package on GitHub and replace the database back-end and send to CRAN when the successor package is ready. We've added a note to the README and vignette showing that users need package build tools for the current version.
Hi,
I think that the authors have adressed all my review comments and am happy with their changes. Therefore, I recomend citesdb
for approving.
Again, thank you so much to the editor for giving me the opportunity to review such a nice package and congratulations to the authors!!
Thank you @xavi-rp, noted. @pachamaltese what are your thoughts?
@melvidoni I totally agree with this:
Hi, I think that the authors have adressed all my review comments and am happy with their changes. Therefore, I recomend
citesdb
for approving. Again, thank you so much to the editor for giving me the opportunity to review such a nice package and congratulations to the authors!!
Approved! Thanks @noamross for submitting and @pachamaltese and @xavi-rp for your reviews!
To-dos:
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
"[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname)
.Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"
-type contributors in the Authors@R
field (with their consent). More info on this here.
Welcome aboard! We'd love to host a blog post about your package - either a short introduction to it with one example or a longer post with some narrative about its development or something you learned, and an example of its use. If you are interested, review the instructions, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.
We've started putting together a gitbook with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.
@melvidoni IMO there's an additional pending: to apply rOpenSci themes to their vignettes
@pachamaltese You just told me before that it was ready to be approved...
Thanks all very much, I believe the RO theme will be applied automatically upon repository transfer and building on docs.ropensci.org
.
@pachamaltese and @xavi-rp, we'd like to acknowledge you as rev
authors in our DESCRIPTION if that's all right. I believe your ORCiDs are https://orcid.org/0000-0003-2046-0621 (Xavier) and 0000-0003-1017-7574 (Pachá), correct?
Yes, that's my orcid! Thanks!!
Correct id!
You have to download the ropensci theme and edit _pkgdown.yml to create an ropensci-themed pkgdown site
Check github.com/ropensci/tradestatistics for the details
On Wed, May 29, 2019, 9:45 AM Noam Ross notifications@github.com wrote:
Thanks all very much, I believe the RO theme will be applied automatically upon repository transfer and building on docs.ropensci.org.
@pachamaltese https://github.com/pachamaltese and @xavi-rp https://github.com/xavi-rp, we'd like to acknowledge you as rev authors in our DESCRIPTION if that's all right. I believe your ORCiDs are https://orcid.org/0000-0003-2046-0621 (Xavier) and 0000-0003-1017-7574 (Pachá), correct?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/292?email_source=notifications&email_token=ACM7UOOFOQSEYV75YX25KMTPX2CFVA5CNFSM4HD65KAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWPL6GI#issuecomment-496942873, or mute the thread https://github.com/notifications/unsubscribe-auth/ACM7UOMWQ7EEWY4XI5MB2STPX2CFVANCNFSM4HD65KAA .
I've checked off everything except the footer, which I'm leaving off in anticipation of a change in workflow: https://github.com/ropensci/software-review-meta/issues/79, and will be in touch with Stefanie about a blog post. Thanks all!
@noamross I saw ropensci.github.io/citesdb It just needs the rOpenSci theme and that's it! Just 2 minutes away from the best result
On Wed, May 29, 2019, 1:03 PM Noam Ross notifications@github.com wrote:
I've checked off everything except the footer, which I'm leaving off in anticipation of a change in workflow: ropensci/software-review-meta#79 https://github.com/ropensci/software-review-meta/issues/79, and will be in touch with Stefanie about a blog post. Thanks all!
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/292?email_source=notifications&email_token=ACM7UOIIUFUDQ65VMSMLMATPX2ZOHA5CNFSM4HD65KAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWP7YDY#issuecomment-497024015, or mute the thread https://github.com/notifications/unsubscribe-auth/ACM7UOOSONYFUFXBC47LWT3PX2ZOHANCNFSM4HD65KAA .
@pachamaltese we are about to change the guidance actually because @jeroen has been working on centrally building all docs websites so authors do not need to worry about this. The pkgdown config is used but not for styling since rotemplate is used for all. See e.g. docs.ropensci.org/magick.
Thanks @maelle!! It was different when I submitted
On Wed, May 29, 2019, 2:15 PM Maëlle Salmon notifications@github.com wrote:
@pachamaltese https://github.com/pachamaltese we are about to change the guidance actually because @jeroen https://github.com/jeroen has been working on centrally building all docs websites so authors do not need to worry about this. The pkgdown config is used but not for styling since rotemplate is used for all. See e.g. docs.ropensci.org/magick.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/292?email_source=notifications&email_token=ACM7UOM6RIYDJMV4IU3PEH3PX3B2XA5CNFSM4HD65KAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWQGHAA#issuecomment-497050496, or mute the thread https://github.com/notifications/unsubscribe-auth/ACM7UON4LFIHCSZL6Z4RM5DPX3B2XANCNFSM4HD65KAA .
Indeed, very recent change!
Submitting Author: Noam Ross (@noamross)
Repository: https://github.com/ecohealthalliance/citesdb Version submitted: 0.1.0 Editor: @melvidoni Reviewer 1: @xavi-rp Reviewer 2: @pachamaltese Archive: https://zenodo.org/record/3234871 Version accepted: 0.2.0
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
citesdb gives researchers access to the large CITES wildlife trade dataset by syncing it to a local out-of-memory trade database and provides an interface to that local database.
Researchers and conservationists studying the wildlife trade.
Our deprecated cites package used a different framework to access a smaller subset of the data. The rcites package provides complementary data.
Technical checks
Confirm each of the following by checking the box. This package:
I note the relatively low test coverage (~65%) is due to a verbose section of code configuring the RStudio interactive interface. Other than this, the remaining code has >90% coverage.
Publication options
JOSS Options
- [x] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [x] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [x] The package is deposited in a long-term repository with the DOI: 10.5281/zenodo.2630836 - (*Do not submit your package separately to JOSS*)MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)Code of conduct
Recommended reviewers: karinorman will be familiar with the database used here via her work on taxadb, JonasGeschke and KevCaz will know about the data and their application through their work on rcites. Feedback from someone involved in metadata projects - e.g.: Bryce Mecum, Auriel Fournier, Kelly Hondula - would also be valuable.