Closed dbarneche closed 3 years ago
Hi @dbarneche ! Thanks for your submission. I'll be the editor for this process. The "fit" for the package and rOpenSci looks great, a very suitable package.
The first step is the "editor checks". Once this is completed, I'll be asking around for reviewers.
The standard `devtools::check()` resulted in:

```
> checking package dependencies ... NOTE
  Suggests orphaned package: 'ggmap'

> checking for unstated dependencies in vignettes ... NOTE
  '::' or ':::' imports not declared from:
  'dplyr' 'tidyr'

0 errors ✔ | 0 warnings ✔ | 2 notes ✖
```
A standard tool we use is `goodpractice::gp()`, which resulted in:

```
Preparing: rcmdcheck
── GP dataaimsr ────────────────────────────────────────────────────

It is good practice to

  ✖ write unit tests for all functions, and all
    package code in general. 80% of code lines are covered
    by test cases.

    R/aims_data.R:169:NA
    R/aims_data.R:170:NA
    R/aims_data.R:171:NA
    R/aims_data.R:176:NA
    R/aims_data.R:177:NA
    ... and 22 more lines

  ✖ avoid long code lines, it is bad for
    readability. Also, many people prefer editor windows
    that are about 80 characters wide. Try make your lines
    shorter than 80 characters

    R\aims_data.R:29:1

  ✖ fix this R CMD check NOTE: '::' or ':::'
    imports not declared from: 'dplyr' 'tidyr'
```
Coverage looks pretty good, via `covr::package_coverage()`:

```
dataaimsr Coverage: 80.00%
R/zzz.R: 0.00%
R/aims_data.R: 80.00%
R/page_data.R: 88.24%
R/helpers.R: 90.32%
R/filter_values.R: 90.91%
R/aims_data_doi.R: 100.00%
R/expose_attributes.R: 100.00%
```
You suggest `tidyverse`, which is typically not recommended: https://www.tidyverse.org/blog/2018/06/tidyverse-not-for-packages/

Because some package installations will install all suggested packages, suggesting `tidyverse` could be a major undertaking for your users. Glancing through your vignettes, it looks like you could suggest just `dplyr` and `tidyr`, and probably `purrr` if you consider dropping the `plyr::llply` suggestion and replacing it with `purrr::map_df`.
On that note, `plyr` is a package you'll probably want to avoid if possible. If you are already steering your users towards tidyverse, I think it's not a stretch to drop the `plyr` suggestion and move to a `purrr` alternative.
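To illustrate the suggested swap (this sketch is illustrative and not code from dataaimsr; the tidyverse option is `purrr::map_df`, shown here alongside a base-R equivalent):

```r
# Apply a function to each element and row-bind the resulting data.frames.
# With purrr this would be:
#   purrr::map_df(1:3, function(i) data.frame(i = i, sq = i^2))
# The base-R equivalent of that pattern:
res <- do.call(rbind, lapply(1:3, function(i) data.frame(i = i, sq = i^2)))
res
```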
Many of the help files (.Rd, generated from roxygen) don't have working examples, which are always helpful for users.
Due to the nature of the package, you may want to have a look here for additional testing strategies: https://books.ropensci.org/http-testing/index.html
So with that being said, I'll start the process for finding reviewers, but I'd encourage you to consider the above comments in the meantime.
You can add the rOpenSci 'under review' badge to your README:
You may use `rodev::use_review_badge(428)`. Badge URL is https://badges.ropensci.org/428_status.svg. Full link should be:
[![](https://badges.ropensci.org/428_status.svg)](https://github.com/ropensci/software-review/issues/428)
Hi @ldecicco-USGS ! Thank you very much for such a prompt and thorough feedback. I'll start working on your suggestions today. [Edit]: all incorporated, thanks!
I have a parallel question in the meantime as well, if you don't mind. We are hoping to submit a paper to JOSS about our package once (and if) it is properly approved by rOpenSci. Is it OK for us to keep working on the paper while the package is being reviewed here?
Also, the data team at AIMS is developing a web-based data explorer for the same datasets. It serves as a complementary tool for users who prefer a more interactive form of navigation/exploration. One of the features of the tool is to export an R script based on dataaimsr to download the data type/window chosen by the user. There are other features such as downloading the data as CSV or printing the time series as PNG.
This is my first time submitting a package to rOpenSci and a paper to JOSS. In your opinion, should we add a section about this complementary tool to the paper, or should it be another standalone article?
Is it OK for us to keep working on the paper while the package is being reviewed here?
Yes very much so. The paper will evolve further in the JOSS review.
In your opinion, should we add a section about this complementary tool to the paper, or should it be another standalone article?
You can certainly mention it. A web based explorer for data does not qualify as research software under JOSS so you can't submit that as a separate paper anyway.
Hey @dbarneche, just letting you know the status. @boshek has graciously offered to be a reviewer (🙏thanks!) and I've been reaching out to others a bit more local to Australia. My favorite excuse so far is "I'd love to but I'm on a voyage around Antarctica at the moment." So, I'm still actively looking for a second reviewer, and just wanted to let you know.
whoops! Certainly didn't mean to close this!
Hello @ldecicco-USGS, thanks very much for letting me know.
@ldecicco-USGS here is my review of `dataaimsr`. Overall a really great package that accesses a valuable data source. I have tried to summarize my key topics below. I think the biggest two points for me are:

- Spend some additional time to make sure the API keys and the presence of an internet connection are checked for both users and for CI/CRAN services. This will help you debug future issues and provides some nice guardrails for users.
- I'd advocate strongly that `aims_data` reliably returns a data.frame and exposes those other elements (citation etc) as attributes.

Great job with this package!
Package Review

- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

- [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
- [ ] Installation instructions: for the development version of package and any non-standard dependencies in README
- [x] Vignette(s) demonstrating major functionality that runs successfully locally
- [ ] Function Documentation: for all exported functions
- [ ] Examples (that run successfully locally) for all exported functions
- [ ] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with `URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).

For packages co-submitting to JOSS

- [x] The package has an obvious research application according to JOSS's definition

The package contains a `paper.md` matching JOSS's requirements with:

- [x] A short summary describing the high-level functionality of the software
- [x] Authors: A list of authors with their affiliations
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [x] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software been confirmed.
- [x] Performance: Any performance claims of the software been confirmed.
- [x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 8.
- Examples: I think the example in `aims_data` could be significantly shortened. The examples section is meant to be illustrative rather than comprehensive. But I wouldn't waste all that great code. This is an opinion so not a requirement for acceptance. I would simply move it to a vignette. I am a bit uncertain how you are going to manage with CRAN, the requirement for the API keys and the ability to run examples. Here I think you might need to build in some ability for the function to 'fail gracefully' when the API key isn't available. That is, don't error but instead return an informative message. The same applies to checking a) if someone has internet and b) if the API is down for whatever reason. Again this is particularly relevant if CRAN submission is desired. I think you can wrap them in a `try` function. That might work.
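A minimal sketch of the "fail gracefully" idea, assuming a `try()`-based wrapper (the wrapper name and message below are illustrative, not part of dataaimsr):

```r
# Wrap an API call so that failures (no internet, bad key, API down)
# produce an informative message and a NULL return rather than a hard error.
safe_fetch <- function(fetch_fun, ...) {
  out <- try(fetch_fun(...), silent = TRUE)
  if (inherits(out, "try-error")) {
    message("AIMS API request failed; check your internet connection and API key.")
    return(invisible(NULL))
  }
  out
}

safe_fetch(function() stop("API down"))  # messages, returns NULL invisibly
safe_fetch(function(x) x + 1, 1)         # returns 2
```

In a real package the first branch could also distinguish `curl::has_internet()` being `FALSE` from a server-side error.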
- Community guidelines: Missing a CONTRIBUTING guide and (though not required) a CODE_OF_CONDUCT file.
- I would modify the map presented in the paper. This isn't a scientific map. Either I would remove it or I would provide something that uses open data. Something like this roughed up example would work:

```r
library(dataaimsr)
library(sf)
library(rnaturalearth)
library(tmap)

# see ?aims_data_doi for argument names
ssts_doi <- aims_data_doi("temp_loggers")
sdata <- aims_data(ssts_doi, api_key = NULL,
                   summary = "summary-by-series")
oceania_boundary <- ne_countries(continent = "oceania", returnclass = "sf") %>%
  st_transform(3112)
sdata_sf <- sdata %>%
  dplyr::mutate(cols = cal_obs * 1e-3) %>%
  tidyr::drop_na(lon, lat) %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  st_transform(3112)
tm_shape(oceania_boundary) +
  tm_polygons() +
  tm_shape(sdata_sf) +
  tm_dots(col = 'cols', size = 0.1)
```
- I am wondering about the usefulness of the `aims_data_doi` function. Would it not be more straightforward for a user to simply expose the `target` argument directly to `aims_data` and other functions like `expose_attributes`?
- I would suggest that adding `aims_*` prefixes to all exposed functions would be a good idea.
- I think you should reconsider exposing most of the functions that handle http requests, such as `handle_error` and `page_data`. Those can all be internal functions and don't need to be user facing.
- If the object returned by `aims_data` was always just a data.frame I think that would be beneficial. Instead of returning a list, I think you could store those other elements as attributes attached to the object and then construct extract functions. So something like `aims_citation`, which takes the object returned by `aims_data` and returns the citation. Typically, returning objects of different classes depending on the value of an argument is considered not best practice.
- I don't consider this required for acceptance but I would highly recommend considering adding classes to the `aims_data` object and implementing `print` and `plot` methods.
- Add something like `curl::has_internet()` to fail "gracefully" when lacking an internet connection.
- I think it would be worthwhile to include a check for the env variable when the package is loaded. Using `find_api_key` would probably work fine.

Thanks so much @boshek 🎉!
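The attributes-plus-extractor idea suggested above can be sketched in a few lines (data and attribute values below are made up for illustration):

```r
# Store citation/metadata as attributes on a plain data.frame and expose
# small extractor functions, instead of returning a list of mixed types.
sdata <- data.frame(site = c("A", "B"), temp = c(26.1, 27.3))
attr(sdata, "citation") <- "AIMS (2017) Temperature Loggers dataset"
attr(sdata, "metadata") <- "Metadata record (illustrative)"

aims_citation <- function(x) attr(x, "citation")
aims_metadata <- function(x) attr(x, "metadata")

aims_citation(sdata)  # the citation string
is.data.frame(sdata)  # TRUE: still behaves like a regular data.frame
```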
@dbarneche, feel free to begin your response to this review. I've sent another request out today for reviewer number 2. If I don't hear back soon, I'll do a review myself.
Hey - just jumping in here to say I'd be happy to review. I'm not a regular ROpenSci contributor but an everyday programmer/builder/reviewer of R packages at @SymbolixAU and in Australia (and very interested in ecological open data).
I can't look at it for a couple of days but will put some time aside on the weekend, if that's cool.
Hi @boshek, @ldecicco-USGS and @techisdead, your time and contribution is very much appreciated. I apologise because I have many projects to deal with this week, but I will start working on my response early next week. Best wishes, D
@techisdead , thanks for volunteering!
Our reviewers guide details what we look for in a package review and includes links to example reviews. Our standards are detailed in our packaging guide, and we provide reviewer template for you to use. Please make sure you do not have a conflict of interest preventing you from reviewing this package. If you have questions or feedback, feel free to let me know.
We ask reviewers to complete reviews in three weeks, so if you were to finish a review this weekend, that would be fantastic.
Great - thanks!
@techisdead , let me know if you have any questions on the review. Feel free to email me directly as well (usually that is how I would initiate the conversation, but since you volunteered - I don't have your email).
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
- Briefly describe any working relationship you have (had) with the package authors.
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

- [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
- [x] Installation instructions: for the development version of package and any non-standard dependencies in README
- [x] Vignette(s) demonstrating major functionality that runs successfully locally
- [x] Function Documentation: for all exported functions
- [x] Examples (that run successfully locally) for all exported functions
- [x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with `URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).

For packages co-submitting to JOSS
- [x] The package has an obvious research application according to JOSS's definition
The package contains a `paper.md` matching JOSS's requirements with:
- [x] A short summary describing the high-level functionality of the software
- [x] Authors: A list of authors with their affiliations
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Functionality

- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software been confirmed.
- [x] Performance: Any performance claims of the software been confirmed.
- [ ] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 4
Great package and data sets. I'm excited to see how the API evolves and think it's great this package is being developed alongside.
Overall I agree with the observations made by @ldecicco-USGS and @boshek (except I liked the long README :-) )
I found the package well documented and easy to use on the whole. I ran into an issue with tests that hadn't been noted previously; perhaps something has changed on the server side? Happy to help figure it out.
More details follow:
Also `devtools::test()` failed for me.
```
> devtools::test()
Loading dataaimsr
Testing dataaimsr
✓ | OK F W S | Context
x | 30 4     | aims_data [6.1 s]
─────────────────────────────────────────────────────────
Failure (test-aims_data.R:16:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-aims_data.R:22:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-aims_data.R:41:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sc)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-aims_data.R:45:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sd)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
```
This is because the structure returned from the two endpoints is different. `aims_data(weather_doi, ...)` returns data of the form:
```
> str(check_wa)
List of 3
 $ metadata: chr "Metadata record https://doi.org/10.25845/5c09bf93f315d"
 $ citation: chr "Australian Institute of Marine Science (AIMS). 2009, Australian Institute of Marine Science Automatic Weather S"| __truncated__
 $ data    :'data.frame': 1297 obs. of 14 variables:
 ...
> names(check_wa)
[1] "metadata" "citation" "data"
```
Whereas `citation` is nested under `$metadata` for `aims_data(ssts_doi, ...)`:
```
> str(check_sa)
List of 3
 $ metadata:List of 2
  ..$ metadata: chr "Metadata record https://doi.org/10.25845/5b4eb0f9bb848"
  ..$ citation: chr "Australian Institute of Marine Science (AIMS). 2017, AIMS Sea Temperature Observing System (AIMS Temperature Lo"| __truncated__
 $ NA      : NULL
 $ data    :'data.frame': 433 obs. of 22 variables:
 ...
> names(check_sa)
[1] "metadata" NA         "data"
```
Has something changed in the underlying API?
This impacts the `page_data` tests too:
```
───────────────────────────────────────────────────────────────────
✓ |  6         | aims_data_doi
✓ |  7         | expose_attributes
✓ | 12         | filter_values [0.9 s]
✓ | 39         | helpers [0.5 s]
x | 29 5       | page_data [4.7 s]
──────────────────────────────────────────────────────────────────────
Failure (test-page_data.R:22:3): Correct structure
`check_sa` has length 4, not length 5.
Failure (test-page_data.R:30:3): Correct structure
`check_sb` has length 2, not length 3.
Failure (test-page_data.R:31:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-page_data.R:36:3): Correct structure
`check_sa_n` has length 2, not length 3.
Failure (test-page_data.R:37:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa_n)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
─────────────────────────────────────────────────────────────────────
══ Results ═══════════════════════════════════════════════════════════
Duration: 12.3 s

[ FAIL 9 | WARN 0 | SKIP 0 | PASS 123 ]
```
- I agree with the suggestion of changing `aims_data` to return a `data.frame` with attributes, then also having `aims_citation` and `aims_metadata` as well to extract those attributes individually.

Thanks so much @techisdead ! @dbarneche next step is a response to the reviewers.
Hi @ldecicco-USGS and @techisdead, Thanks very much for this. It is on the top of my TODO list for this week. Best wishes, Diego
I would like to thank the editor @ldecicco-USGS and both referees @boshek and @techisdead for dedicating their time to reviewing our package. I have tried my best to incorporate all suggestions raised. Changes have been summarised in NEWS.md; below I provide responses with the corresponding commit links. I look forward to your response. Best wishes, Diego
@ldecicco-USGS here is my review of `dataaimsr`. Overall a really great package that accesses a valuable data source. I have tried to summarize my key topics below. I think the biggest two points for me are:
- Spend some additional time to make sure the API keys and the presence of an internet connections are checked for both users and for CI/CRAN services. This will help you debug future issues and provides some nice guardrails for users.
- I'd advocate strongly that `aims_data` reliably returns a data.frame and exposes those other elements (citation etc) as attributes.

Great job with this package!
Response: Thank you very much @boshek. I have tried my best to incorporate almost all of your excellent suggestions. There were only two instances where I diverged in opinion, below I provide a point-by-point response to each comment.
Package Review
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
- [ ] Installation instructions: for the development version of package and any non-standard dependencies in README
- [x] Vignette(s) demonstrating major functionality that runs successfully locally
- [ ] Function Documentation: for all exported functions
- [ ] Examples (that run successfully locally) for all exported functions
- [ ] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with `URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).

For packages co-submitting to JOSS
- [x] The package has an obvious research application according to JOSS's definition
The package contains a `paper.md` matching JOSS's requirements with:
- [x] A short summary describing the high-level functionality of the software
- [x] Authors: A list of authors with their affiliations
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Functionality
- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software been confirmed.
- [x] Performance: Any performance claims of the software been confirmed.
- [x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines
Estimated hours spent reviewing: 8.
- [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
Response: You are now officially listed as a package reviewer together with @ldecicco-USGS and @techisdead. Change implemented in open-aims/dataaimsr@c4a26fb.
Review Comments
Documentation
- Statement of Need: I think that the README could actually be made shorter. I think it would be helpful to outline why a user would want to use R to interact with this data. To me the specific value add here is that you are connecting the AIMS api with all the amazing capabilities of R. I think the text in the Overview section is valuable. I just wonder if it could be condensed a little to get users to specifically to the installation instructions as really what they are after here is the software itself.
Response: I appreciate this suggestion. I have expanded the overview section by a few sentences to better highlight your comment. However, in case it's still OK with you, I'd prefer to maintain the blurbs about the datasets on the README. That is because I suspect that many users do not even know about the existence of these datasets, and so having a summary about them should help break that barrier. That might have been why @techisdead appreciated the longer README? I do take your point though that users may be looking for installation instructions above anything else, so I have moved those two paragraphs to the end of README, after the installation instructions and usage examples. I hope this serves as a good compromise, but please do let me know in case you feel strongly about this. Changes implemented in open-aims/dataaimsr@11dbdf2.
- Installation instructions: I would revamp the installation instructions so that getting the API key is explicitly a step in the installation instructions. I would also include information about how the API key should be kept secret. You might consider linking to this article.
Response: Excellent suggestion. Incorporated in open-aims/dataaimsr@27aa4c7.
- Vignette(s): See comment in API keys section. I don't think ggmap is particularly useful here. Given ggmap's own API requirements I would strongly advise against including it here as a dependency and including it in the vignettes. If you really want a map, I would suggest trying an sf approach. But I think there are plenty of tutorials around to tackle mapping in R and it detracts from the core focus of dataaimsr.
Response: Thanks for this suggestion. I built on the example code you kindly provided below for the maps to create a plot method (please see more about this below). The consequence of that change though is that `ggmap` is no longer a requirement for the vignettes nor a dependency of the package. Change implemented in open-aims/dataaimsr@557f41a.
- Examples: I think the example in `aims_data` could be significantly shortened. The examples section is meant to be illustrative rather than comprehensive. But I wouldn't waste all that great code. This is an opinion so not a requirement for acceptance. I would simply move it to a vignette. I am a bit uncertain how you are going to manage with CRAN, the requirement for the API keys and the ability to run examples. Here I think you might need to build in some ability for the function to 'fail gracefully' when the API key isn't available. That is, don't error but instead return an informative message. The same applies to checking if someone a) has internet and b) if the API is down for whatever reason. Again this is particularly relevant if CRAN submission is desired. I think you can wrap them in a `try` function. That might work.
Response: Thanks for this suggestion. I hope this is OK, but I would prefer to keep the extended list of examples. I think many people might be a bit lost when first using the package---for the same reasons raised in the comment RE: README above---so those examples should help them better understand the capabilities of the package. As for the failures, thank you! I have implemented two steps to make the function fail gracefully. One is related to not having an internet connection (as per your suggestion below using `curl::has_internet()`), and the other happens in case the input filter values are incorrect. Both implemented in open-aims/dataaimsr@3ead676.
I'm unsure though if I should also implement something along these lines for the internal function `handle_error`?
Community guidelines: Missing a CONTRIBUTING guide and (though not required) a CODE_OF_CONDUCT file.
Response: Done in open-aims/dataaimsr@3f5c9b1.
Paper
- I think it is worth adopting stronger language about the API keys in the paper to ensure that people don't expose those publicly.
Response: Agreed 100%. Changes implemented in open-aims/dataaimsr@27aa4c7 for the README.md section, and in open-aims/dataaimsr@557f41a for the vignettes.
- I would modify the map presented in the paper. This isn't a scientific map. Either I would remove it or I would provide something that uses open data. Something like this roughed up example would work:
```r
library(dataaimsr)
library(sf)
library(rnaturalearth)
library(tmap)

# see ?aims_data_doi for argument names
ssts_doi <- aims_data_doi("temp_loggers")
sdata <- aims_data(ssts_doi, api_key = NULL,
                   summary = "summary-by-series")
oceania_boundary <- ne_countries(continent = "oceania", returnclass = "sf") %>%
  st_transform(3112)
sdata_sf <- sdata %>%
  dplyr::mutate(cols = cal_obs * 1e-3) %>%
  tidyr::drop_na(lon, lat) %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  st_transform(3112)
tm_shape(oceania_boundary) +
  tm_polygons() +
  tm_shape(sdata_sf) +
  tm_dots(col = 'cols', size = 0.1)
```
Response: Thank you very much for your suggestions. I hope you don't mind, but I took the liberty to build on the above code not only to change the paper but also to create a plot method for the new class (more on this below). We have also 1) updated the list of authors, 2) added a blurb about the sister web visualisation tool, and 3) added references and text for the quality control calibration routine. Changes implemented in open-aims/dataaimsr@2b5cd66.
General
- I think that your R dependency could be 3.3.0 rather than 3.6.0.
Response: This dependency has been updated in open-aims/dataaimsr@c4a26fb.
- I am wondering about the usefulness of the `aims_data_doi` function. Would it not be more straightforward for a user to simply expose the `target` argument directly to `aims_data` and other functions like `expose_attributes`?
Response: I agree that `aims_data_doi` was not that useful. Your suggested changes have been fully incorporated in open-aims/dataaimsr@a837afd.
- I would suggest that adding `aims_*` prefixes to all exposed functions would be a good idea.
Response: Done in open-aims/dataaimsr@ec8f85c.
- I think you should reconsider exposing most of the functions that handle http requests. That would be things like `handle_error` and `page_data`. Those can all be internal functions and don't need to be user facing.
Response: Perfect. Now the only exported functions are:
```
> library(dataaimsr)
> dataaimsr::
dataaimsr::aims_citation   dataaimsr::aims_expose_attributes   dataaimsr::aims_metadata    dataaimsr::is_aimsdf
dataaimsr::aims_data       dataaimsr::aims_filter_values       dataaimsr::aims_parameters
```
Change implemented in open-aims/dataaimsr@d32bc86. I still don't know how to prevent the non-exported functions from appearing in the pkgdown reference page---any ideas?
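On the pkgdown question: one common approach is to document internal helpers with roxygen2's `@noRd` tag, which suppresses .Rd generation entirely so the topic cannot appear in the reference index (`@keywords internal` is an alternative that keeps the .Rd but hides it from pkgdown's default index). The helper below is illustrative, not dataaimsr's actual code:

```r
# @noRd: roxygen2 writes no .Rd file for this helper, so it never shows up
# on the pkgdown reference page.

#' Internal error handler (illustrative)
#' @noRd
handle_error <- function(status) {
  if (status >= 400) stop("Request failed with status ", status)
  invisible(status)
}
```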
- If the object returned by `aims_data` was always just a data.frame I think that would be beneficial. Instead of returning a list I think you could store those other elements as attributes attached to the object and then construct extract functions. So something like `aims_citation` which takes the object returned by `aims_data` and returns the citation. Typically returning objects of different classes depending on the value of an argument is considered not best practice.
- I think this approach could be extended to parameters
- I don't consider this required for acceptance but I would highly recommend considering adding classes to the `aims_data` object and implementing `plot` methods.
Response: I hope you don't mind me amalgamating all these excellent suggestions into a single answer, but basically I have implemented all of them. `aims_data` now returns a `data.frame` of class `aimsdf` with 5 attributes, 3 of which can be extracted using the convenience functions `aims_citation`, `aims_metadata` and `aims_parameters`. The other two attributes, `type` and `target`, were added to facilitate plotting methods. `plot`, `summary` and `print` have been extended to work with class `aimsdf`. The `plot` can be either a time series or a map. These changes were implemented in open-aims/dataaimsr@3ead676, open-aims/dataaimsr@799e4c2 and open-aims/dataaimsr@1330e0a.
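The general shape of such an S3 design can be sketched as follows (constructor and attribute names here are simplified stand-ins, not the actual dataaimsr implementation):

```r
# A data.frame subclass carrying a citation attribute, plus a print method.
new_aimsdf <- function(df, citation) {
  attr(df, "citation") <- citation
  class(df) <- c("aimsdf", class(df))
  df
}

print.aimsdf <- function(x, ...) {
  cat("AIMS dataset:", attr(x, "citation"), "\n")
  NextMethod()  # fall back to the usual data.frame printing
}

x <- new_aimsdf(data.frame(temp = c(26.1, 27.3)), "AIMS (2017)")
```

Because `aimsdf` inherits from `data.frame`, all existing data.frame operations keep working while `print`/`plot` dispatch can be customised.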
- Add something like `curl::has_internet()` to fail "gracefully" when lacking an internet connection.
Response: As mentioned above, done in open-aims/dataaimsr@3ead676.
- I think the README could benefit from a short usage example.
Response: Done in open-aims/dataaimsr@f2af2a3.
- In the DESCRIPTION I think the url could also include a link to the API. They just need to be separated by a comma.
Response: Done in open-aims/dataaimsr@c4a26fb.
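For reference, multiple links in a DESCRIPTION `URL` field are simply comma-separated; the sketch below uses the package's GitHub repo plus a placeholder for the API link (the second URL is illustrative, not the real one):

```
URL: https://github.com/open-aims/dataaimsr,
    https://example.org/aims-data-platform-api
```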
API keys
- I think that it would be worthwhile to include a check for the env variable when the package is loaded. Using `find_api_key` would probably work fine.
- Exclude the actual API key from the Rmds. There are a few ways around actually needing to include it. You could precompile your vignettes like this: https://github.com/bcgov/bcdata/blob/master/vignettes/precompile.R
- Or you can set a phantom environmental variable which results in nothing actually being run on CRAN: https://github.com/ropensci/tidyhydat/blob/2c177aec31d5e8fe296b5336b786b01bb83841c9/vignettes/tidyhydat_an_introduction.Rmd#L18
Response: Thanks for these examples. I decided to implement the pre-compilation trick suggested. I would appreciate it if you could confirm whether the way I adapted the original precompile.R file is properly carrying over the original license. Changes in open-aims/dataaimsr@15974b4.
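The phantom environment variable trick mentioned above can be sketched as follows (the variable name is illustrative; on CRAN it is unset, so vignette chunks simply don't evaluate):

```r
# Compute once in the vignette setup chunk: chunks evaluate only when the
# key is available, so nothing hits the API on CRAN or in CI without secrets.
eval_chunks <- nzchar(Sys.getenv("AIMS_DATAPLATFORM_API_KEY"))
# then, in the setup chunk: knitr::opts_chunk$set(eval = eval_chunks)
eval_chunks
```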
- For testing, I think the best approach if you want to test directly on the API is to add a token as GitHub secret. However ropensci does have a suite mocking tools that facilitate exactly this type of thing. For example: https://github.com/ropensci/webmockr
Response: Thanks for raising this issue. I have also removed the key from the test suite. I decided to adopt the `httptest` mock approach as suggested below by @techisdead. Changes in open-aims/dataaimsr@f252bdd.
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
- Briefly describe any working relationship you have (had) with the package authors.
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
- [x] Installation instructions: for the development version of package and any non-standard dependencies in README
- [x] Vignette(s) demonstrating major functionality that runs successfully locally
- [x] Function Documentation: for all exported functions
- [x] Examples (that run successfully locally) for all exported functions
- [x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with `URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).

For packages co-submitting to JOSS
- [x] The package has an obvious research application according to JOSS's definition
The package contains a `paper.md` matching JOSS's requirements with:
- [x] A short summary describing the high-level functionality of the software
- [x] Authors: A list of authors with their affiliations
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Functionality
- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software been confirmed.
- [x] Performance: Any performance claims of the software been confirmed.
- [ ] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines
Estimated hours spent reviewing: 4
- [x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
Response: Thank you very much for your review @techisdead. You are now officially listed as a package reviewer together with @ldecicco-USGS and @boshek. Change implemented in open-aims/dataaimsr@c4a26fb.
Review Comments
Great package and data sets. I'm excited to see how the API evolves and think it's great this package is being developed alongside.
Overall I agree with the observations made by @ldecicco-USGS and @boshek (except I liked the long README :-) )
I found the package well documented and easy to use on the whole. I ran into an issue with tests that hadn't been noted previously; perhaps something has changed on the server side? Happy to help figure it out.
Response: I'm glad you liked the README page! I apologise for the issue with the tests. It was indeed a hiccup on the API side of things, but it has been fixed already. Tests (on my machine) now all pass, although I am having trouble getting them to work via GitHub Actions as well---I think it is an artefact of needing `sf` and `rgdal` external libraries and source installation for plotting. Should be fixed soon I hope. As per my answer to @boshek above, I have moved the paragraphs about the datasets to the end of README, after the installation instructions and usage examples; this way users can jump straight to installation instructions if that's what they're after. Changes implemented in open-aims/dataaimsr@11dbdf2.
More details follow:
- Tests: I agree with @boshek about not exposing an api key in your tests. I've used {httptest} to mock an api locally and it works well, but using github secrets or {webmockr} would be good choices too
Also `devtools::test()` failed for me:

```
> devtools::test()
Loading dataaimsr
Testing dataaimsr
✓ | OK F W S | Context
x | 30 4     | aims_data [6.1 s]
─────────────────────────────────────────────────────────
Failure (test-aims_data.R:16:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-aims_data.R:22:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-aims_data.R:41:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sc)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-aims_data.R:45:3): Wrong filters
all(c("metadata", "citation", "data") %in% names(check_sd)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
```
This is because the structure returned from the two endpoints is different. `aims_data(weather_doi, ...)` returns data of the form:

```
> str(check_wa)
List of 3
 $ metadata: chr "Metadata record https://doi.org/10.25845/5c09bf93f315d"
 $ citation: chr "Australian Institute of Marine Science (AIMS). 2009, Australian Institute of Marine Science Automatic Weather S"| __truncated__
 $ data    :'data.frame': 1297 obs. of 14 variables:
...
> names(check_wa)
[1] "metadata" "citation" "data"
```
Whereas `citation` is nested under `$metadata` for `aims_data(ssts_doi, ...)`:

```
> str(check_sa)
List of 3
 $ metadata:List of 2
  ..$ metadata: chr "Metadata record https://doi.org/10.25845/5b4eb0f9bb848"
  ..$ citation: chr "Australian Institute of Marine Science (AIMS). 2017, AIMS Sea Temperature Observing System (AIMS Temperature Lo"| __truncated__
 $ NA      : NULL
 $ data    :'data.frame': 433 obs. of 22 variables:
...
> names(check_sa)
[1] "metadata" NA "data"
```
Has something changed in the underlying API?
This impacts the `page_data` tests too:

```
───────────────────────────────────────────────────────────────────
✓ |  6       | aims_data_doi
✓ |  7       | expose_attributes
✓ | 12       | filter_values [0.9 s]
✓ | 39       | helpers [0.5 s]
x | 29 5     | page_data [4.7 s]
──────────────────────────────────────────────────────────────────────
Failure (test-page_data.R:22:3): Correct structure
`check_sa` has length 4, not length 5.
Failure (test-page_data.R:30:3): Correct structure
`check_sb` has length 2, not length 3.
Failure (test-page_data.R:31:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sb)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
Failure (test-page_data.R:36:3): Correct structure
`check_sa_n` has length 2, not length 3.
Failure (test-page_data.R:37:3): Correct structure
all(c("metadata", "citation", "data") %in% names(check_sa_n)) is not TRUE
`actual`:   FALSE
`expected`: TRUE
─────────────────────────────────────────────────────────────────────
══ Results ═══════════════════════════════════════════════════════════
Duration: 12.3 s

[ FAIL 9 | WARN 0 | SKIP 0 | PASS 123 ]
```
Response: Hopefully tests should be working now after the API fix?
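The {httptest} mocking suggested above could look roughly like the sketch below. This is illustrative only, not the package's actual test file: it assumes `weather_doi` is defined as in the existing tests, and the `api_key` value is a dummy string (with recorded fixtures in place the call never reaches the real API, so no key is exposed).

```r
# Illustrative sketch only: mocking the AIMS API with {httptest}.
library(testthat)
library(httptest)
library(dataaimsr)

with_mock_api({
  test_that("aims_data returns the expected structure", {
    # Served from previously recorded fixtures, not the live API
    res <- aims_data(weather_doi, api_key = "fake-key")
    expect_true(all(c("metadata", "citation", "data") %in% names(res)))
  })
})
```

Fixtures can be recorded once against the live API with `httptest::capture_requests()` and committed alongside the tests.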
- data structures: strongly agree with @boshek about restructuring `aims_data` to return a `data.frame` with attributes, and then also having `aims_citation` and `aims_metadata` to extract those attributes individually.
Response: Agreed. Changes implemented in open-aims/dataaimsr@3ead676
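For reference, the suggested pattern could be sketched as follows (an illustrative toy, not the actual dataaimsr implementation; the column names and attribute strings are made up):

```r
# Sketch: a data.frame carrying citation/metadata as attributes,
# with small accessors to extract them individually.
aims_metadata <- function(df) attr(df, "metadata")
aims_citation <- function(df) attr(df, "citation")

df <- data.frame(site = c("A", "B"), temperature = c(26.1, 27.3))
attr(df, "metadata") <- "Metadata record <doi>"
attr(df, "citation") <- "AIMS citation string"
class(df) <- c("aimsdf", class(df))

aims_metadata(df)  # retrieves the stored metadata string
head(df)           # still behaves like a regular data.frame
```

The nice property of this design is that downstream code can treat the object as a plain `data.frame`, while the provenance information travels with it.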
I've had another look at the updates - thanks for the organised response and all the work to address our comments. I'm happy that it's all been addressed. The vignettes all work fine, updating the tests avoids exposing an API key, and all the functions work as per the vignettes and manuals (unless I've missed something).
I only found one remaining niggle - tests failed when invoked from `devtools::check()` but not `devtools::test()`, which usually points to a namespace issue. This was the error:

```
> test_check("dataaimsr")
══ Failed tests ════════════════════════════════════════════════════════════════
── Error (test-aimsdf-methods.R:38:5): Correct structure ───────────────────────
Error: package rgeos required for finding out which hole belongs to which exterior ring
Backtrace:
     █
  1. ├─base::plot(wa, ptype = "map") test-aimsdf-methods.R:38:4
  2. ├─dataaimsr:::plot.aimsdf(wa, ptype = "map")
  3. │ ├─ne_countries(continent = "oceania", returnclass = "sf") %>% st_transform(crs = 3112)
  4. │ └─rnaturalearth::ne_countries(continent = "oceania", returnclass = "sf")
  5. │   └─rnaturalearth:::ne_as_sf(spdf[filter, ], returnclass)
  6. │     ├─sf::st_as_sf(x)
  7. │     └─sf:::st_as_sf.Spatial(x)
  8. │       ├─sf::st_as_sfc(sp::geometry(x), ...)
  9. │       └─sf:::st_as_sfc.SpatialPolygons(sp::geometry(x), ...)
 10. └─sf::st_transform(., crs = 3112)
```

The fix for me was adding `rgeos` to 'Suggests' in the DESCRIPTION file.
Thanks again for this package!
Thank you very much @techisdead, that fixed it on the CI as well. Best wishes, Diego
Great responses! This looks great to me. With respect to the README that is certainly fine.
Re pkgdown issue:
I still don't know how to prevent the non-exported functions from appearing in the pkgdown reference page---any ideas?
This is because they are still documented. You can either remove the documentation or use `@noRd`. But then you'll also need to remove references to that function.
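For context, `@noRd` keeps the roxygen comments in the source for developers but stops roxygen2 from generating an `.Rd` file, so the function no longer appears in the pkgdown reference. A minimal sketch (the function name here is made up for illustration):

```r
#' Parse an API response into a data.frame
#'
#' Internal helper; documented for developers but not exported to users.
#' @noRd
parse_response <- function(x) {
  # ... internal logic ...
}
```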
This package looks to be in great shape now. Well done @dbarneche!
@ropensci-review-bot approve
Approved! Thanks @dbarneche for submitting and @boshek and @techisdead for your reviews! :grin:
To-dos:

- If you already have a `pkgdown` website and are ok relying only on rOpenSci central docs building and branding, replace your `pkgdown` website with a redirecting page to https://docs.ropensci.org/package_name
- Add that docs link to the `URL` field of DESCRIPTION alongside the link to the GitHub repository, e.g.: `URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar`
- Fix any CI badge links, e.g. `[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname)`. If Appveyor does not pick up new commits after transfer, you might need to delete and re-create the Appveyor project. (Repo transfers are smoother with GitHub Actions.)
- Create a `codemeta.json` metadata file by running `codemetar::write_codemeta()` in the root of your package.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them `"rev"`-type contributors in the `Authors@R` field (with their consent). More info on this here.
Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.
We've put together an online book with our best practice and tips; this chapter starts the 3rd section, which is about guidance for after onboarding. Please tell us what could be improved; the corresponding repo is here.
Hi @ldecicco-USGS, I'm almost done with the transfer. I'm just confused about the redirecting part to create the rOpenSci docs. Previously the package was hosted under our GitHub organisation account `open-AIMS`. I then had a look here:
https://devguide.ropensci.org/redirect.html
But it wasn't really clear to me how setting up an `open-AIMS.github.io` repo with a `dataaimsr` folder would help redirect things, given that everything has now been transferred to `ropensci`? Would it be possible to maintain the current pkgdown GitHub Action and have rOpenSci docs?
Also is the Appveyor step mandatory?
Sorry about the confusion, and thanks for your guidance!
I think you're all good to go!
It doesn't look to me like you were using Appveyor - so no need to remove it.
My understanding of the redirect step was that it's only important if you already had a pkgdown site somewhere else. If you did and you want to create a re-direct from that original page, I can ask around for clarification.
Oh, I see. Thanks for clarifying! Currently the pkgdown website is hosted under https://ropensci.github.io/dataaimsr/
I'm assuming that's OK?
Also, how should I proceed with submission to JOSS? Is there a path I should take from rOpenSci? Or do I proceed normally as a regular submission through their system?
One thing to mention, @dbarneche: the rOpenSci development server generates nightly builds of your package. So this should work now too:
install.packages("dataaimsr", repos = "https://dev.ropensci.org")
So you can include that in your README. Here is a good example: https://github.com/ropensci/targets#installation
All done @boshek , thanks for the suggestion!
You should proceed with the JOSS submission in the regular way. Somewhere in the submission add a link to this thread though, that will be helpful.
Submitting Author: Diego R. Barneche (@dbarneche)
Other Authors: AIMS Datacentre team (Jeffrey L. Sheehan, Greg Coleman, Mark Rehbein)
Repository: https://github.com/open-AIMS/dataaimsr
Version submitted: 1.0.2
Editor: @ldecicco-USGS
Reviewers: @boshek, @techisdead
Archive: TBD
Version accepted: TBD
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
`dataaimsr` is an R package written to provide open access to decades of field measurements of atmospheric and oceanographic parameters around the coast of Australia, conducted by the Australian Institute of Marine Science (AIMS). The package communicates with the recently-developed AIMS Data Platform API via an API key.

Who is the target audience and what are scientific applications of this package?
The data retrieval functions within `dataaimsr` are designed to be useful to any researcher interested in long-term, instrument-based marine environmental data from tropical Australia.

Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?

No, because these datasets have always been maintained internally at AIMS. This is the institute's first attempt to make them broadly available via an R package.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? N/A
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted. N/A
Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
[X] Do you intend for this package to go on CRAN?
[ ] Do you intend for this package to go on Bioconductor?
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct