ropensci / software-review

rOpenSci Software Peer Review.
295 stars 104 forks source link

`dataaimsr`: Australian Institute of Marine Science (AIMS) Data Platform API Client which provides easy access to AIMS Data Platform scientific data and information #428

Closed dbarneche closed 3 years ago

dbarneche commented 3 years ago

Submitting Author: Diego R. Barneche (@dbarneche)
Other Authors: AIMS Datacentre team: (Jeffrey L. Sheehan, Greg Coleman, Mark Rehbein) Repository: https://github.com/open-AIMS/dataaimsr Version submitted: 1.0.2 Editor: @ldecicco-USGS Reviewers: @boshek Reviewers: @techisdead Archive: TBD
Version accepted: TBD


Package: dataaimsr
Type: Package
Title: AIMS Data Platform API Client
Version: 1.0.2
Authors@R: c(person("Diego R.", "Barneche", email = "d.barneche@aims.gov.au", role = c("aut", "cre")), person("AIMS Datacentre", role = "aut", email = "adc@aims.gov.au"))
Description: AIMS Data Platform API Client which provides easy access to AIMS Data Platform scientific data and information.
Depends:
    R (>= 3.6.0)
Imports:
    httr,
    jsonlite,
    parsedate
Suggests:
    knitr,
    rmarkdown,
    plyr,
    tidyverse,
    ggmap,
    testthat
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
VignetteBuilder: knitr
URL: https://open-aims.github.io/dataaimsr
BugReports: https://github.com/open-aims/dataaimsr/issues
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.1

Scope

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options - [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

ldecicco-USGS commented 3 years ago

Hi @dbarneche ! Thanks for your submission. I'll be the editor for this process. The "fit" for the package and rOpenSci looks great, a very suitable package.

The first step is the "editor checks". Once this is completed, I'll be asking around for reviewers.

Editor checks:


Editor comments

The standard devtools::check() resulted in:

> checking package dependencies ... NOTE
  Suggests orphaned package: 'ggmap'

> checking for unstated dependencies in vignettes ... NOTE
  '::' or ':::' imports not declared from:
    'dplyr' 'tidyr'

0 errors v | 0 warnings v | 2 notes x

A standard tool we use is goodpractice::gp(), which resulted in:

Preparing: rcmdcheck
── GP dataaimsr ────────────────────────────────────────────────────

It is good practice to

  ✖ write unit tests for all functions, and all
    package code in general. 80% of code lines are covered
    by test cases.

    R/aims_data.R:169:NA
    R/aims_data.R:170:NA
    R/aims_data.R:171:NA
    R/aims_data.R:176:NA
    R/aims_data.R:177:NA
    ... and 22 more lines

  ✖ avoid long code lines, it is bad for
    readability. Also, many people prefer editor windows
    that are about 80 characters wide. Try make your lines
    shorter than 80 characters

    R\aims_data.R:29:1

  ✖ fix this R CMD check NOTE: '::' or ':::'
    imports not declared from: 'dplyr' 'tidyr'

Coverage looks pretty good:

covr::package_coverage()
dataaimsr Coverage: 80.00%
R/zzz.R: 0.00%
R/aims_data.R: 80.00%
R/page_data.R: 88.24%
R/helpers.R: 90.32%
R/filter_values.R: 90.91%
R/aims_data_doi.R: 100.00%
R/expose_attributes.R: 100.00%

You suggest tidyverse, which is typically not recommended: https://www.tidyverse.org/blog/2018/06/tidyverse-not-for-packages/

Because some package installations will install all suggested packages, suggesting tidyverse could be a major undertaking for your users. Glancing though your vignettes, it looks like you could suggest just dplyr and tidyr, and probably purrr if you consider dropping the plyr::llply suggestion and replace it with purrr:mapdf.

On that note, plyr is a package you'll probably want to avoid if possible. If you are already steering your users towards tidyverse, I think it's not a stretch to drop the plyr suggestion and move to a purrr alternative.

Many of the help file (.Rd generated from roxygen) don't have working examples, which is always helpful for users.

Due to the nature of the package, you may want to have a look here for additional testing strategies: https://books.ropensci.org/http-testing/index.html

So with that being said, I'll start the process for finding reviewers, but I'd encourage you to consider the above comments in the meantime.

You can add the rOpenSci 'under review' badge to your README:

You may use rodev::use_review_badge(), rodev::use_review_badge(428). Badge URL is https://badges.ropensci.org/428_status.svg.

Full link should be:

[![](https://badges.ropensci.org/428_status.svg)](https://github.com/ropensci/software-review/issues/428)
dbarneche commented 3 years ago

Hi @ldecicco-USGS ! Thank you very much for such a prompt and thorough feedback. I'll start working on your suggestions today. [Edit:]: all incorporated, thanks!

I have a parallel question in the meantime as well, if you don't mind. We are hoping to submit a paper to JOSS about our package once (and if) it is properly approved by rOpenSci. Is it OK for us to keep working on the paper while the package is being reviewed here?

Also, the data team at AIMS is also developing a web-based data explorer for the same datasets. It serves as a complementary tool for users who prefer a more interactive form of navigation/exploration. One of the features of the tool is to export an R script based on dataaimsr to download the data type/window chosen by the user. There are other features such as downloading the data in csv or a printing the time series as png.

This is my first time submitting a package to rOpenSci and a paper to JOSS. In your opinion, should we add a section about this complementary tool to the paper, or should it be another standalone article?

karthik commented 3 years ago

Is it OK for us to keep working on the paper while the package is being reviewed here?

Yes very much so. The paper will evolve further in the JOSS review.

In your opinion, should we add a section about this complementary tool to the paper, or should it be another standalone article?

You can certainly mention it. A web based explorer for data does not qualify as research software under JOSS so you can't submit that as a separate paper anyway.

ldecicco-USGS commented 3 years ago

Hey @dbarneche, just letting you know the status. @boshek has graciously offered to be a reviewer (🙏thanks!) and I've been reaching out to others a bit more local to Australia. My favorite excuse so far is "I'd love to but I'm on a voyage around Antarctica at the moment." So, I'm still actively looking for a second reviewer, and just wanted to let you know.

ldecicco-USGS commented 3 years ago

whoops! Certainly didn't mean to close this!

dbarneche commented 3 years ago

Hello @ldecicco-USGS, thanks very much for letting me know.

boshek commented 3 years ago

@ldecicco-USGS here is my review of the dataaimsr. Overall a really great package that accesses a valuable data source. I have tried to summarize my key topics below. I think the biggest two point for me are:

  1. Spend some additional time to make sure the API keys and the presence of an internet connections are checked for both users and for CI/CRAN services. This will help you debug future issues and provides some nice guardrails for users.
  2. I'd advocate strongly that aims_data reliably returns a data.frame and exposes those other elements (citation etc) as attributes.

Great job with this package!

Package Review

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [X] A short summary describing the high-level functionality of the software
  • [X] Authors: A list of authors with their affiliations
  • [X] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [x] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Estimated hours spent reviewing: 8.


Review Comments

Documentation

Paper

library(dataaimsr)
library(sf)
library(rnaturalearth)
library(tmap)

# see ?aims_data_doi for argument names
ssts_doi <- aims_data_doi("temp_loggers")
sdata <- aims_data(ssts_doi, api_key = NULL,
                   summary = "summary-by-series")

oceania_boundary <- ne_countries(continent = "oceania", returnclass = "sf") %>% 
  st_transform(3112)

sdata_sf <- sdata %>%
  dplyr::mutate(cols = cal_obs * 1e-3) %>%
  tidyr::drop_na(lon, lat) %>% 
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>% 
  st_transform(3112)

tm_shape(oceania_boundary) +
  tm_polygons() +
  tm_shape(sdata_sf) +
  tm_dots(col = 'cols', size = 0.1)

General

API keys

ldecicco-USGS commented 3 years ago

Thanks so much @boshek 🎉!

@dbarneche, feel free to begin your response to this review. I've sent another request out today for reviewer number 2. If I don't hear back soon, I'll do a review myself.

techisdead commented 3 years ago

Hey - just jumping in here to say I'd be happy to review. I'm not a regular ROpenSci contributor but an everyday programmer/builder/reviewer of R packages at @SymbolixAU and in Australia (and very interested in ecological open data).

I can't look at it for a couple of days but will put some time aside on the weekend, if that's cool.

dbarneche commented 3 years ago

Hi @boshek, @ldecicco-USGS and @techisdead, your time and contribution is very much appreciated. I apologise because I have many projects to deal with this week, but I will start working on my response early next week. Best wishes, D

ldecicco-USGS commented 3 years ago

@techisdead , thanks for volunteering!

Our reviewers guide details what we look for in a package review and includes links to example reviews. Our standards are detailed in our packaging guide, and we provide reviewer template for you to use. Please make sure you do not have a conflict of interest preventing you from reviewing this package. If you have questions or feedback, feel free to let me know.

We ask reviewers to complete reviews in three weeks, so if you were to finish a review this weekend, that would be fantastic.

techisdead commented 3 years ago

Great - thanks!

ldecicco-USGS commented 3 years ago

@techisdead , let me know if you have any questions on the review. Feel free to email me directly as well (usually that is how I would initiate the conversation, but since you volunteered - I don't have your email).

techisdead commented 3 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [x] A short summary describing the high-level functionality of the software
  • [x] Authors: A list of authors with their affiliations
  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Estimated hours spent reviewing: 4


Review Comments

Great package and data sets. I'm excited to see how the API evolves and think it's great this package is being developed alongside.

Overall I agree with the observations made by @ldecicco-USGS and @boshek (exept I liked the long README :-) )

I found the package well documented and easy to use on the whole. I ran into an issue with tests that hadn't been noted previously - perhaps something has changed on the server side ? Happy to help figure it out.

More details follow:

Also devtools::test() failed for me.

> devtools::test()
Loading dataaimsr
Testing dataaimsr
✓ |  OK F W S | Context
x |  30 4     | aims_data [6.1 s]                                                                                           
─────────────────────────────────────────────────────────
Failure (test-aims_data.R:16:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-aims_data.R:22:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-aims_data.R:41:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sc)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-aims_data.R:45:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sd)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

This is because the structure returnes from the two endpoints is different

aims_data(weather_doi, ...) returns data of the form

> str(check_wa)
List of 3
 $ metadata: chr "Metadata record https://doi.org/10.25845/5c09bf93f315d"
 $ citation: chr "Australian Institute of Marine Science (AIMS). 2009, Australian Institute of Marine Science Automatic Weather S"| __truncated__
 $ data    :'data.frame':   1297 obs. of  14 variables:

 ...
> names(check_wa)
[1] "metadata" "citation" "data"    

Whereas citation is nested under $metadata for aims_data(ssts_doi, ... )


> str(check_sa)
List of 3
 $ metadata:List of 2
  ..$ metadata: chr "Metadata record https://doi.org/10.25845/5b4eb0f9bb848"
  ..$ citation: chr "Australian Institute of Marine Science (AIMS). 2017, AIMS Sea Temperature Observing System (AIMS Temperature Lo"| __truncated__
 $ NA      : NULL
 $ data    :'data.frame':   433 obs. of  22 variables:

 ...
> names(check_sa)
[1] "metadata" NA         "data"    

Has something changed in the underlying API?

This impact the page_data tests too


───────────────────────────────────────────────────────────────────
✓ |   6       | aims_data_doi                                                                                               
✓ |   7       | expose_attributes                                                                                           
✓ |  12       | filter_values [0.9 s]                                                                                       
✓ |  39       | helpers [0.5 s]                                                                                             
x |  29 5     | page_data [4.7 s]                                                                                           
──────────────────────────────────────────────────────────────────────
Failure (test-page_data.R:22:3): Correct structure
`check_sa` has length 4, not length 5.

Failure (test-page_data.R:30:3): Correct structure
`check_sb` has length 2, not length 3.

Failure (test-page_data.R:31:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-page_data.R:36:3): Correct structure
`check_sa_n` has length 2, not length 3.

Failure (test-page_data.R:37:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa_n)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 
─────────────────────────────────────────────────────────────────────

══ Results ═══════════════════════════════════════════════════════════
Duration: 12.3 s

[ FAIL 9 | WARN 0 | SKIP 0 | PASS 123 ]
ldecicco-USGS commented 3 years ago

Thanks so much @techisdead ! @dbarneche next step is a response to the reviewers.

dbarneche commented 3 years ago

Hi @ldecicco-USGS and @techisdead, Thanks very much for this. It is on the top of my TODO list for this week. Best wishes, Diego

dbarneche commented 3 years ago

I would like to thank the editor @ldecicco-USGS and both referees @boshek and @techisdead for dedicating their time to reviewing our package. I have tried my best to incorporate all suggestions raised. Changes have been summarised in NEWS.md; below I provide responses with the corresponding commit links. I look forward to your response. Best wishes, Diego

In response to suggestions made by @boshek:

@ldecicco-USGS here is my review of the dataaimsr. Overall a really great package that accesses a valuable data source. I have tried to summarize my key topics below. I think the biggest two point for me are:

  1. Spend some additional time to make sure the API keys and the presence of an internet connections are checked for both users and for CI/CRAN services. This will help you debug future issues and provides some nice guardrails for users.
  2. I'd advocate strongly that aims_data reliably returns a data.frame and exposes those other elements (citation etc) as attributes.

Great job with this package!

Response: Thank you very much @boshek. I have tried my best to incorporate almost all of your excellent suggestions. There were only two instances where I diverged in opinion, below I provide a point-by-point response to each comment.

Package Review

  • [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • [ ] Installation instructions: for the development version of package and any non-standard dependencies in README
  • [x] Vignette(s) demonstrating major functionality that runs successfully locally
  • [ ] Function Documentation: for all exported functions
  • [ ] Examples (that run successfully locally) for all exported functions
  • [ ] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [x] A short summary describing the high-level functionality of the software
  • [x] Authors: A list of authors with their affiliations
  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • [x] Installation: Installation succeeds as documented.
  • [x] Functionality: Any functional claims of the software been confirmed.
  • [x] Performance: Any performance claims of the software been confirmed.
  • [x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 8.

  • [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Response: You are now officially listed as a package reviewer together with @ldecicco-USGS and @techisdead. Change implemented in open-aims/dataaimsr@c4a26fb.

Review Comments

Documentation

  • Statement of Need: I think that the README could actually be made shorter. I think it would be helpful to outline why a user would want to use R to interact with this data. To me the specific value add here is that you are connecting the AIMS api with all the amazing capabilities of R. I think the text in the Overview section is valuable. I just wonder if it could be condensed a little to get users to specifically to the installation instructions as really what they are after here is the software itself.

Response: I appreciate this suggestion. I have expanded the overview section by a few sentences to better highlight your comment. However, in case it's still OK with you, I'd prefer to maintain the blurbs about the datasets on the README. That is because I suspect that many users do not even know about the existence of these datasets, and so having a summary about them should help break that barrier. That might have been why @techisdead appreciated the longer README? I do take your point though that users may be looking for installation instructions above anything else, so I have moved those two paragraphs to the end of README, after the installation instructions and usage examples. I hope this serves as a good compromise, but please do let me know in case you feel strongly about this. Changes implemented in open-aims/dataaimsr@11dbdf2.

  • Installation instructions: I would revamp the installation instructions so that getting the API key is explictly a step in the installation instructions. I would also include information about how the API key should be kept secret. You might consider linking to this article.

Response: Excellent suggestion. Incorporated in open-aims/dataaimsr@27aa4c7.

  • Vignette(s): See comment in API keys section. I don't think ggmap is a particularly useful here. Given ggmaps' own API requirements I would strongly advise against including it here as a dependency and including it in the vignettes. If you really want a map, I would suggest try an sf approach. But I think there are plenty of tutorials around to tackle mapping in R and it detracts from the core focus of dataaimsr.

Response: Thanks for this suggestion. I built on the example code you kindly provided below for the maps to create a plot method (please see more about this below). The consequence of that change though is that ggmap is no longer a requirement for the vignettes nor a dependency of the package. Change implemented in open-aims/dataaimsr@557f41a.

  • Examples: I think the example in aims_data could be significantly shortened. The examples section is meant to be illustrative rather than comprehensive. But I wouldn't waste all that great code. This is an opinion so not a requirement for acceptance. I would simply move it to a vignette. I am a bit uncertain how you are going to manage with CRAN, the requirement for the API keys and the ability to run examples. Here I think you might need to build in some ability for the function to 'fail gracefully' when the API key isn't available. That is, don't error but instead return an informative message. That same applies to checking if someone has a) internet and b) if the API is down for whatever reason. Again this is particularly relevant if CRAN submission is desired. I think you can wrap them in a try function. That might work.

Response: Thanks for this suggestion. I hope this is OK, but I would prefer to keep the extended list of examples. I think many people might be a bit lost when first using the package---for the same reasons raised in the comment RE: REAMDE above---so those examples should help them better understand the capabilities of the package. As for the failures, thank you! I have implemented two steps to fail the function gracefully. One is related to not having internet connection (as per your suggestion below using curl::has_internet()), and the other happens in case the input filter values are incorrect. Both implemented in open-aims/dataaimsr@3ead676.

I'm unsure though if I should also implement something along these lines for the internal function handle_error?

Community guidelines: Missing a CONTRIBUTING guide and (though not required) a CODE_OF_CONDUCT file.

Response: Done in open-aims/dataaimsr@3f5c9b1.

Paper

  • I think it is worth adopting stronger language about the APi keys in the paper to ensure that people don't expose those publicly.

Response: Agreed 100%. Changes implemented in open-aims/dataaimsr@27aa4c7 for the README.md section, and in open-aims/dataaimsr@557f41a for the vignettes.

  • I would modify the map presented in the paper. This isn't a scientific map. Either I would remove it or I would provide something that use open data. Something like this roughed up example would work:
library(dataaimsr)
library(sf)
library(rnaturalearth)
library(tmap)

# see ?aims_data_doi for argument names
ssts_doi <- aims_data_doi("temp_loggers")
sdata <- aims_data(ssts_doi, api_key = NULL,
                   summary = "summary-by-series")

oceania_boundary <- ne_countries(continent = "oceania", returnclass = "sf") %>% 
  st_transform(3112)

sdata_sf <- sdata %>%
  dplyr::mutate(cols = cal_obs * 1e-3) %>%
  tidyr::drop_na(lon, lat) %>% 
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>% 
  st_transform(3112)

tm_shape(oceania_boundary) +
  tm_polygons() +
  tm_shape(sdata_sf) +
  tm_dots(col = 'cols', size = 0.1)

Response: Thank you very much for your suggestions. I hope you don't mind, but I took the liberty to build on the above code not only to change the paper but also to create a plot method for the new class (more on this below). We have also 1) updated the list of authors, 2) added a blurb about the sister web visualisation tool, and 3) added references and text for the quality control calibration routine. Changes implemented in open-aims/dataaimsr@2b5cd66.

General

  • I think that your R dependency could be 3.3.0 rather than 3.6.0.

Response: This dependency has been updated in open-aims/dataaimsr@c4a26fb.

  • I am wondering about the usefulness of the aims_data_doi function. Would it not be more straight forward for a user to simply to expose the target argument directly to aims_data and other functions like expose_attributes?

Response: I agree that aims_data_doi was not that useful. Your suggested changes have been fully incorporated in open-aims/dataaimsr@a837afd.

  • I would suggest that adding aims_* prefixes to all exposed functions would be a good idea.

Response: Done in open-aims/dataaimsr@ec8f85c.

  • I think you reconsider exposing most of the functions that handle http requests. That would things like handle_error and page_data. Those all can be internal functions and don't need to be user facing.

Response: Perfect. Now the only exported functions are:

> library(dataaimsr)
> dataaimsr::
dataaimsr::aims_citation           dataaimsr::aims_expose_attributes  dataaimsr::aims_metadata           dataaimsr::is_aimsdf               
dataaimsr::aims_data               dataaimsr::aims_filter_values      dataaimsr::aims_parameters         

Change implemented in open-aims/dataaimsr@d32bc86. I still don't know how to prevent the non-exported functions from appearing in the pkgdown reference page---any ideas?

  • If the object returned by aims_data was always just a data.frame I think that would be beneficial. Instead of returning a list I think you could stored those other elements as attributes attached to the object and then construct extract functions. So something like aims_citation which takes the object returned by aims_data and return the citation. Typically returning objects of different classes depending on the value of an argument is considered not best practice.
  • I think this approach could be extended to parameters
  • I don't consider this required for acceptance but I would highly recommend considering adding classes to the aims_data object and implementing print and plot methods.

Response: I hope you don't mind me amalgamating all these excellent suggestions for a single answer, but basically I have implemented all of them. aims_data now returns a data.frame of class aimsdf with 5 attributes, 3 of which can be extracted using the convenience functions aims_citation, aims_metadata and aims_parameters. The other two attributes, type and target were added to facilitate plotting methods. plot, summary, print have been extended to work with class aimsdf. The plot can be either a time series or a map. These changes were implemented in open-aims/dataaimsr@3ead676, open-aims/dataaimsr@799e4c2 and open-aims/dataaimsr@1330e0a.

  • Add something like curl::has_internet() to fail "gracefully" when lacking an internet connection.

Response: As mentioned above, done in open-aims/dataaimsr@3ead676.

  • I think the README could benefit from a short usage example.

Response: Done in open-aims/dataaimsr@f2af2a3.

  • In the DESCRIPTION I think the url could also include a link to the API. They just need to be separated by a comma.

Response: Done in open-aims/dataaimsr@c4a26fb.

API keys

Response: Thanks for these examples. I decided to implement the pre-compilation trick suggested. I would appreciate it if you could confirm whether the way I adapted the original precompile.R file is properly carrying over the original license. Changes in open-aims/dataaimsr@15974b4.

  • For testing, I think the best approach if you want to test directly on the API is to add a token as GitHub secret. However ropensci does have a suite mocking tools that facilitate exactly this type of thing. For example: https://github.com/ropensci/webmockr

Response: Thanks for raising this issue. I have also removed the key from the test suite. I decided to adopt the httptest mock approach as suggested below by @techisdead. Changes in open-aims/dataaimsr@f252bdd.

In response to suggestions made by @techisdead:

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • [x] Installation instructions: for the development version of package and any non-standard dependencies in README
  • [x] Vignette(s) demonstrating major functionality that runs successfully locally
  • [x] Function Documentation: for all exported functions
  • [x] Examples (that run successfully locally) for all exported functions
  • [x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [x] A short summary describing the high-level functionality of the software
  • [x] Authors: A list of authors with their affiliations
  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • [x] Installation: Installation succeeds as documented.
  • [x] Functionality: Any functional claims of the software been confirmed.
  • [x] Performance: Any performance claims of the software been confirmed.
  • [ ] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 4

  • [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Response: Thank you very much your review @techisdead. You are now officially listed as a package reviewer together with @ldecicco-USGS and @boshek. Change implemented in open-aims/dataaimsr@c4a26fb.

Review Comments

Great package and data sets. I'm excited to see how the API evolves and think it's great this package is being developed alongside.

Overall I agree with the observations made by @ldecicco-USGS and @boshek (exept I liked the long README :-) )

I found the package well documented and easy to use on the whole. I ran into an issue with tests that hadn't been noted previously - perhaps something has changed on the server side ? Happy to help figure it out.

Response: I'm glad you liked the REAMDE page! I apologise for the issue with the tests. It was indeed a hiccup on the API side of things, but it has been fixed already. Tests (on my machine) now all pass, although I am having trouble getting them to work via GitHub Actions as well---I think it is an artefact of needing sf and rgdal external libraries and source installation for plotting. Should be fixed soon I hope. As per my answer to @boshek above, I have moved the paragraphs about the datasets to the end of README, after the installation instructions and usage examples; this way user can jump straight onto installation instructions if that's what they're after. Changes implemented in open-aims/dataaimsr@11dbdf2.

More details follow:

  • Tests: I agree with @boshek about not exposing an api key in your tests. I've used {httptest} to mock an api locally and it works well, but using github secrets or {webmockr} would be good choices too

Also devtools::test() failed for me.

> devtools::test()
Loading dataaimsr
Testing dataaimsr
✓ |  OK F W S | Context
x |  30 4     | aims_data [6.1 s]                                                                                           
─────────────────────────────────────────────────────────
Failure (test-aims_data.R:16:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-aims_data.R:22:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-aims_data.R:41:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sc)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-aims_data.R:45:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sd)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

This is because the structure returnes from the two endpoints is different

aims_data(weather_doi, ...) returns data of the form

> str(check_wa)
List of 3
 $ metadata: chr "Metadata record https://doi.org/10.25845/5c09bf93f315d"
 $ citation: chr "Australian Institute of Marine Science (AIMS). 2009, Australian Institute of Marine Science Automatic Weather S"| __truncated__
 $ data    :'data.frame': 1297 obs. of  14 variables:

 ...
> names(check_wa)
[1] "metadata" "citation" "data"    

Whereas citation is nested under $metadata for aims_data(ssts_doi, ... )


> str(check_sa)
List of 3
 $ metadata:List of 2
  ..$ metadata: chr "Metadata record https://doi.org/10.25845/5b4eb0f9bb848"
  ..$ citation: chr "Australian Institute of Marine Science (AIMS). 2017, AIMS Sea Temperature Observing System (AIMS Temperature Lo"| __truncated__
 $ NA      : NULL
 $ data    :'data.frame': 433 obs. of  22 variables:

 ...
> names(check_sa)
[1] "metadata" NA         "data"    

Has something changed in the underlying API?

This impact the page_data tests too


───────────────────────────────────────────────────────────────────
✓ |   6       | aims_data_doi                                                                                               
✓ |   7       | expose_attributes                                                                                           
✓ |  12       | filter_values [0.9 s]                                                                                       
✓ |  39       | helpers [0.5 s]                                                                                             
x |  29 5     | page_data [4.7 s]                                                                                           
──────────────────────────────────────────────────────────────────────
Failure (test-page_data.R:22:3): Correct structure
`check_sa` has length 4, not length 5.

Failure (test-page_data.R:30:3): Correct structure
`check_sb` has length 2, not length 3.

Failure (test-page_data.R:31:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-page_data.R:36:3): Correct structure
`check_sa_n` has length 2, not length 3.

Failure (test-page_data.R:37:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa_n)) is not TRUE

`actual`:   FALSE
`expected`: TRUE 
─────────────────────────────────────────────────────────────────────

══ Results ═══════════════════════════════════════════════════════════
Duration: 12.3 s

[ FAIL 9 | WARN 0 | SKIP 0 | PASS 123 ]

Response: Hopefully tests should be working now after the API fix?

  • data structures strongly agree with @boshek about restructuring aims_data to returns a data.frame with attributes. Then also having aims_citation and aims_metadata as well to extract those attributes individually.

Response: Agreed. Changes implemented in open-aims/dataaimsr@3ead676

techisdead commented 3 years ago

I've had another look at the updates - Thanks for the organised response and all the work to address our comments. I'm happy that it's all been addressed. The vignettes all work fine, updating the tests avoids exposing and API key and all the functions work as per the vignettes and manuals (unless I've missed something).

I only found one remaining niggle - Tests failed (when invoked from devtools::check() but not devtools:test() which usually points to a namespace issue). This was the error

> test_check("dataaimsr")
══ Failed tests ════════════════════════════════════════════════════════════════
── Error (test-aimsdf-methods.R:38:5): Correct structure ───────────────────────
Error: package rgeos required for finding out which hole belongs to which exterior ring
Backtrace:
     █
  1. ├─base::plot(wa, ptype = "map") test-aimsdf-methods.R:38:4
  2. ├─dataaimsr:::plot.aimsdf(wa, ptype = "map")
  3. │ ├─ne_countries(continent = "oceania", returnclass = "sf") %>% st_transform(crs = 3112)
  4. │ └─rnaturalearth::ne_countries(continent = "oceania", returnclass = "sf")
  5. │   └─rnaturalearth:::ne_as_sf(spdf[filter, ], returnclass)
  6. │     ├─sf::st_as_sf(x)
  7. │     └─sf:::st_as_sf.Spatial(x)
  8. │       ├─sf::st_as_sfc(sp::geometry(x), ...)
  9. │       └─sf:::st_as_sfc.SpatialPolygons(sp::geometry(x), ...)
 10. └─sf::st_transform(., crs = 3112)

The fix for me was adding rgeos to 'Suggests' in the DESCRIPTION file.

Thanks again for this package!

dbarneche commented 3 years ago

Thank you very much @techisdead, that fixed it on the CI as well. Best wishes, Diego

boshek commented 3 years ago

Great responses! This looks great to me. With respect to the README that is certainly fine.

Re pkgdown issue:

I still don't know how to prevent the non-exported functions from appearing in the pkgdown reference page---any ideas?

This is because they are still documented. You can either remove the documentation or use @noRd. But then you'll also need to remove references to that function.

This package looks to be in great shape now. Well done @dbarneche!

ldecicco-USGS commented 3 years ago

@ropensci-review-bot approve

ropensci-review-bot commented 3 years ago

Approved! Thanks @dbarneche for submitting and @boshek Reviewers: @techisdead for your reviews! :grin:

To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

dbarneche commented 3 years ago

Hi @ldecicco-USGS , I'm almost done with the transfer. I'm just confused about the re-directing part to create the ropensci docs. Previously the package was hosted under our GitHub organisation account open-AIMS. I then had a look here: https://devguide.ropensci.org/redirect.html But wasn't really clear on how setting up and open-AIMS.github.io repo with a dataaimsr folder would help redirect things given that everything now has been transferred to ropensci? Would it be possible to maintain the current pkgdown GitHub Action and have ropensci docs?

Also is the Appveyor step mandatory?

Sorry about the confusion, and thanks for your guidance!

ldecicco-USGS commented 3 years ago

I think you're all good to go!

It doesn't look to me like you were using Appveyor - so no need to remove it.

My understanding of the redirect step was that it's only important if you already had a pkgdown site somewhere else. If you did and you want to create a re-direct from that original page, I can ask around for clarification.

dbarneche commented 3 years ago

Oh, I see. Thanks for clarifying! Currently pkgdown website is hosted under https://ropensci.github.io/dataaimsr/

I'm assuming that's OK?

Also, how should I proceed with submission to JOSS? Is there a path I should take from rOpenSci? Or do I proceed normally as a regular submission through their system?

boshek commented 3 years ago

One thing @dbarneche to mention. The rOpenSci development server generates nightly builds of your package. So this should work now too:

install.packages("dataaimsr ", repos = "https://dev.ropensci.org")

So you can include that in your README. Here is a good example: https://github.com/ropensci/targets#installation

dbarneche commented 3 years ago

All done @boshek , thanks for the suggestion!

ldecicco-USGS commented 3 years ago

You should proceed with the JOSS submission in the regular way. Somewhere in the submission add a link to this thread though, that will be helpful.