ropensci / software-review

rOpenSci Software Peer Review.
292 stars 104 forks source link

citesdb: A high-performance database of shipment-level CITES trade data #292

Closed noamross closed 5 years ago

noamross commented 5 years ago

Submitting Author: Noam Ross (@noamross)
Repository: https://github.com/ecohealthalliance/citesdb Version submitted: 0.1.0 Editor: @melvidoni Reviewer 1: @xavi-rp Reviewer 2: @pachamaltese Archive: https://zenodo.org/record/3234871 Version accepted: 0.2.0


Package: citesdb
Version: 0.1.0
Title: A high-performance database of shipment-level CITES trade data
Description: Provides convenient access to over 40 years and 20 million records of
    endangered wildlife trade data from the Convention on International Trade
    in Endangered Species of Wild Fauna and Flora, stored on a local on-disk,
    out-of memory 'MonetDB' database for bulk analysis.
Authors@R: c(
    person("Noam", "Ross", , "ross@ecohealthalliance.org", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-2136-0000")),
    person("Evan A.", "Eskew", , "eskew@ecohealthalliance.org", role = c("aut"), comment = c(ORCID = "0000-0002-1153-5356")),
    person("EcoHealth Alliance", role = "cph"))
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
ByteCompile: true
URL: https://github.com/ecohealthalliance/citesdb, https://www.cites.org/
BugReports: https://github.com/ecohealthalliance/citesdb/issues
Imports:
  MonetDBLite,
  rappdirs,
  DBI,
  httr,
  R.utils,
  purrr,
  tools,
  dplyr,
  dbplyr
Suggests:
  spelling,
  testthat,
  roxygen2,
  knitr,
  rmarkdown,
  rstudioapi,
  lintr,
  callr,
  here,
  ggplot2,
  rcites
RoxygenNote: 6.1.1
Roxygen: list(markdown = TRUE)
Language: en-US
VignetteBuilder: knitr

Scope

citesdb gives researchers access to the large CITES wildlife trade dataset by syncing it to a local out-of-memory trade database and provides an interface to that local database.

Researchers and conservationists studying the wildlife trade.

Our deprecated cites package used a different framework to access a smaller subset of the data. The rcites package provides complementary data.

Technical checks

Confirm each of the following by checking the box. This package:

I note the relatively low test coverage (~65%) is due to a verbose section of code configuring the RStudio interactive interface. Other than this, the remaining code has >90% coverage.

Publication options

JOSS Options - [x] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [x] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [x] The package is deposited in a long-term repository with the DOI: 10.5281/zenodo.2630836 - (*Do not submit your package separately to JOSS*)
MEE Options - [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

Recommended reviewers: karinorman will be familiar with the database used here via her work on taxadb, JonasGeschke and KevCaz will know about the data and their application through their work on rcites. Feedback from someone involved in metadata projects - e.g.: Bryce Mecum, Auriel Fournier, Kelly Hondula - would also be valuable.

melvidoni commented 5 years ago

Thank you for submitting, @noamross! I will be your editor, so please hold on while I perform editor checks.

melvidoni commented 5 years ago

Editor checks:


Editor comments

Thanks for your submission @noamross!

As the output from goodpractice is perfect I'll start finding reviewers for this.

♥ Ho-ho! Slick package! Keep up the stunning work!

Reviewers: @pachamaltese @xavi-rp Due date: TBD

melvidoni commented 5 years ago

Hello @noamross! @pachamaltese and @xavi-rp have been assigned as reviewers. They will have until May 6th to review the package.

pachadotdev commented 5 years ago

Hi @noamross, @melvidoni and @xavi-rp I started to install the package. One immediate thing to fix is to add the ROpenSci badge so please paste this code next to CI badge and regenerate the README:

[![](https://badges.ropensci.org/292_status.svg)](https://github.com/ropensci/onboarding/issues/292)
eveskew commented 5 years ago

Hi @pachamaltese, that should be taken care of now.

pachadotdev commented 5 years ago

Package Review

Documentation

The package includes all the following forms of documentation:

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 4 hours

Review Comments

I enjoyed reviewing the citesdb package. I found it easy to use.

Below I've highlighted some observations that might make this package even better.

Documentation

Package Metadata

Package Documentation

Vignettes

Function Documentation

Functionality / Implementation

External Dependencies

Section 1.11 of rOpenSci's packaging guides reinforces the good practice of trying to limit external dependencies where possible. This is a good practice not only for the health and longevity of your package but also for marketing purposes; sometimes people working on servers or behind proxies will have limited ability to install packages so fewer dependencies may enable more users. I think this aspect is ok and I wouldn't touch the dependencies.

Readability / Modularity

Technical Implementation Details

Testing

Error Messages

General comments

Like I stated above, there are some fixes that are needed and may improve the package a lot.

The code is easy to read, without strange lines, and my overall impression of it is positive.

65% of test coverage seems to be very low. By adding some elementary checks, this number can be increased a lot.

Unlike my 1st submission to rOpenSci, most of the functions in citedb have adecuate names and can be easily distinguished as those names are not generic (i.e. "filter" or "download").

noamross commented 5 years ago

Note to reviewers: The MonetDBLite package has been (hopefully temporarily) archived on CRAN, so I have switched the citesdb to using a stable GitHub version of MonetDBLite via the Remotes: field in the DESCRIPTION file.

xavi-rp commented 5 years ago

Hi @noamross and @melvidoni, Thanks for giving me the opportunity to review citesdb. I think it's a very nice and useful package. I've enjoyed (and learned!) a lot reviewing it. Please take my comments only as suggestions as I think it's already very well written.

Package Review

Documentation

The package includes all the following forms of documentation:

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 8 (a lot of time spent to try to understand the problem with the vignettes)


Review Comments

I find this package easy to install/use and very useful to explore and use the data from the CITES Trade DB, so congrats to the authors!

It's very useful the inclusion of the metadata functions. And also the gif!!!

The main problem that I found is in the vignette building while running devtools::check(). See https://github.com/ecohealthalliance/citesdb/issues/1#issuecomment-489351239

In devtools::test(), the lints test fails. Already reported by @pachamaltese here: https://github.com/ecohealthalliance/citesdb/issues/2 These are the results of checking the lints with lintr::lint_package() (most of them are probably irrelevant; except maybe #13)

> library(lintr)
> as.data.frame(lint_package())

              filename line_number column_number    type
1          R/connect.R         114            13   style
2          R/connect.R         121            13   style
3          R/connect.R         128            13   style
4          R/connect.R         149            14   style
5  R/connection-pane.R          34            14   style
6  R/connection-pane.R          65            64   style
7  R/connection-pane.R          85            14   style
8         R/download.R          41             7   style
9         R/download.R          42             5   style
10        R/download.R          46             3   style
11        R/download.R          49             5   style
12        R/download.R          59             3   style
13        R/download.R         117            52 warning
14        R/onattach.R          24            13   style
15        R/onattach.R          25             5   style
16        R/onattach.R          42             7   style
17        R/onattach.R          43             5   style
18        R/onattach.R          70            22   style
                                                message
1  Variable and function names should be all lowercase.
2  Variable and function names should be all lowercase.
3  Variable and function names should be all lowercase.
4  Variable and function names should be all lowercase.
5  Variable and function names should be all lowercase.
6  Variable and function names should be all lowercase.
7  Variable and function names should be all lowercase.
8  Variable and function names should be all lowercase.
9  Variable and function names should be all lowercase.
10 Variable and function names should be all lowercase.
11 Variable and function names should be all lowercase.
12 Variable and function names should be all lowercase.
13                           Do not use absolute paths.
14 Variable and function names should be all lowercase.
15 Variable and function names should be all lowercase.
16 Variable and function names should be all lowercase.
17 Variable and function names should be all lowercase.
18 Variable and function names should be all lowercase.
                                                                           line
1                          as_tibble(dbReadTable(cites_db(), "cites_metadata"))
2                             as_tibble(dbReadTable(cites_db(), "cites_codes"))
3                           as_tibble(dbReadTable(cites_db(), "cites_parties"))
4                               observer$connectionClosed("CITESDB", "citesdb")
5                                                    observer$connectionOpened(
6                             paste("SELECT * FROM", table, "LIMIT", rowLimit))
7                          observer$connectionUpdated("CITESDB", "citesdb", "")
8                                     if (dbExistsTable(cites_db(), tblname)) {
9                                            dbRemoveTable(cites_db(), tblname)
10               dbCreateTable(cites_db(), tblname, fields = cites_field_types)
11                                                                   dbExecute(
12   dbWriteTable(cites_db(), "cites_status", make_status_table(version = ver),
13                   paste0("https://api.github.com/repos/", repo, "/releases")
14                                        for (t in dbListTables(cites_db())) {
15                                                 dbRemoveTable(cites_db(), t)
16                          if (dbExistsTable(cites_db(), "cites_shipments") &&
17                                 dbExistsTable(cites_db(), "cites_status")) {
18                       suppressMessages(dbWriteTable(cites_db(), tblnames[i],
                  linter
1      camel_case_linter
2      camel_case_linter
3      camel_case_linter
4      camel_case_linter
5      camel_case_linter
6      camel_case_linter
7      camel_case_linter
8      camel_case_linter
9      camel_case_linter
10     camel_case_linter
11     camel_case_linter
12     camel_case_linter
13 absolute_paths_linter
14     camel_case_linter
15     camel_case_linter
16     camel_case_linter
17     camel_case_linter
18     camel_case_linter

The spelling errors found after running spelling::spell_check_package() and spelling::spell_check_files("README.Rmd") are irrelevant.

Other minor comments:

Congrats again to the authors!!

noamross commented 5 years ago

Thank you for your careful reviews, @pachamaltese and @xavi-rp! We'll get to work on these.

noamross commented 5 years ago

Response to reviewers

Thanks to both @pachamaltese and @xavi-rp for their comments and issues that have helped us to make a more user-friendly package.

Changes implemented

Changes not implemented / justifications

A note: we have learned from the maintainers of MonetDBLite that it will not be returning to CRAN (its current iteration fails on R-devel), but they are working on a successor embedded database package that will replace it and go to CRAN later this year. So, for now, we will host this package on GitHub and replace the database back-end and send to CRAN when the successor package is ready. We've added a note to the README and vignette showing that users need package build tools for the current version.

xavi-rp commented 5 years ago

Hi, I think that the authors have adressed all my review comments and am happy with their changes. Therefore, I recomend citesdb for approving. Again, thank you so much to the editor for giving me the opportunity to review such a nice package and congratulations to the authors!!

melvidoni commented 5 years ago

Thank you @xavi-rp, noted. @pachamaltese what are your thoughts?

pachadotdev commented 5 years ago

@melvidoni I totally agree with this:

Hi, I think that the authors have adressed all my review comments and am happy with their changes. Therefore, I recomend citesdb for approving. Again, thank you so much to the editor for giving me the opportunity to review such a nice package and congratulations to the authors!!

melvidoni commented 5 years ago

Approved! Thanks @noamross for submitting and @pachamaltese and @xavi-rp for your reviews!

To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd love to host a blog post about your package - either a short introduction to it with one example or a longer post with some narrative about its development or something you learned, and an example of its use. If you are interested, review the instructions, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We've started putting together a gitbook with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

pachadotdev commented 5 years ago

@melvidoni IMO there's an additional pending: to apply rOpenSci themes to their vignettes

melvidoni commented 5 years ago

@pachamaltese You just told me before that it was ready to be approved...

noamross commented 5 years ago

Thanks all very much, I believe the RO theme will be applied automatically upon repository transfer and building on docs.ropensci.org.

@pachamaltese and @xavi-rp, we'd like to acknowledge you as rev authors in our DESCRIPTION if that's all right. I believe your ORCiDs are https://orcid.org/0000-0003-2046-0621 (Xavier) and 0000-0003-1017-7574 (Pachá), correct?

xavi-rp commented 5 years ago

Yes, that's my orcid! Thanks!!

pachadotdev commented 5 years ago

Correct id!

You have to download the ropensci theme and edit _pkgdown.yml to create an ropensci-themed pkgdown site

Check github.com/ropensci/tradestatistics for the details

On Wed, May 29, 2019, 9:45 AM Noam Ross notifications@github.com wrote:

Thanks all very much, I believe the RO theme will be applied automatically upon repository transfer and building on docs.ropensci.org.

@pachamaltese https://github.com/pachamaltese and @xavi-rp https://github.com/xavi-rp, we'd like to acknowledge you as rev authors in our DESCRIPTION if that's all right. I believe your ORCiDs are https://orcid.org/0000-0003-2046-0621 (Xavier) and 0000-0003-1017-7574 (Pachá), correct?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/292?email_source=notifications&email_token=ACM7UOOFOQSEYV75YX25KMTPX2CFVA5CNFSM4HD65KAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWPL6GI#issuecomment-496942873, or mute the thread https://github.com/notifications/unsubscribe-auth/ACM7UOMWQ7EEWY4XI5MB2STPX2CFVANCNFSM4HD65KAA .

noamross commented 5 years ago

I've checked off everything except the footer, which I'm leaving off in anticipation of a change in workflow: https://github.com/ropensci/software-review-meta/issues/79, and will be in touch with Stefanie about a blog post. Thanks all!

pachadotdev commented 5 years ago

@noamross I saw ropensci.github.io/citesdb It just needs the rOpenSci theme and that's it! Just 2 minutes away from the best result

On Wed, May 29, 2019, 1:03 PM Noam Ross notifications@github.com wrote:

I've checked off everything except the footer, which I'm leaving off in anticipation of a change in workflow: ropensci/software-review-meta#79 https://github.com/ropensci/software-review-meta/issues/79, and will be in touch with Stefanie about a blog post. Thanks all!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/292?email_source=notifications&email_token=ACM7UOIIUFUDQ65VMSMLMATPX2ZOHA5CNFSM4HD65KAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWP7YDY#issuecomment-497024015, or mute the thread https://github.com/notifications/unsubscribe-auth/ACM7UOOSONYFUFXBC47LWT3PX2ZOHANCNFSM4HD65KAA .

maelle commented 5 years ago

@pachamaltese we are about to change the guidance actually because @jeroen has been working on centrally building all docs websites so authors do not need to worry about this. The pkgdown config is used but not for styling since rotemplate is used for all. See e.g. docs.ropensci.org/magick.

maelle commented 5 years ago

@pachamaltese https://docs.ropensci.org/tradestatistics/ and https://dev.ropensci.org/job/tradestatistics/ 😉

pachadotdev commented 5 years ago

Thanks @maelle!! It was different when I submitted

On Wed, May 29, 2019, 2:15 PM Maëlle Salmon notifications@github.com wrote:

@pachamaltese https://github.com/pachamaltese we are about to change the guidance actually because @jeroen https://github.com/jeroen has been working on centrally building all docs websites so authors do not need to worry about this. The pkgdown config is used but not for styling since rotemplate is used for all. See e.g. docs.ropensci.org/magick.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/292?email_source=notifications&email_token=ACM7UOM6RIYDJMV4IU3PEH3PX3B2XA5CNFSM4HD65KAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWQGHAA#issuecomment-497050496, or mute the thread https://github.com/notifications/unsubscribe-auth/ACM7UON4LFIHCSZL6Z4RM5DPX3B2XANCNFSM4HD65KAA .

maelle commented 5 years ago

Indeed, very recent change!