openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
709 stars 38 forks source link

[REVIEW]: rgugik: Search and Retrieve Spatial Data from the Polish Head Office of Geodesy and Cartography in R #2948

Closed whedon closed 3 years ago

whedon commented 3 years ago

Submitting author: @kadyb (Krzysztof Dyba) Repository: https://github.com/kadyb/rgugik Version: v0.2.1 Editor: @kbarnhart Reviewer: @adamhsparks, @mikerspencer Archive: 10.5281/zenodo.4606706

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/72de1e21e7ee7e45e7d7cd8d838158ee"><img src="https://joss.theoj.org/papers/72de1e21e7ee7e45e7d7cd8d838158ee/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/72de1e21e7ee7e45e7d7cd8d838158ee/status.svg)](https://joss.theoj.org/papers/72de1e21e7ee7e45e7d7cd8d838158ee)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@adamhsparks & @mikerspencer, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @kbarnhart know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Review checklist for @adamhsparks

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @mikerspencer

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 3 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @adamhsparks, @mikerspencer it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 3 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.32614/RJ-2018-009 is OK
- 10.2830/63132 is OK

MISSING DOIs

- None

INVALID DOIs

- None
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

whedon commented 3 years ago

:wave: @mikerspencer, please update us on how your review is going.

whedon commented 3 years ago

:wave: @adamhsparks, please update us on how your review is going.

kbarnhart commented 3 years ago

@mikerspencer and @adamhsparks, we have an automatic reminder two weeks into the review. Feel free to provide an update if you have one. And if you have any issues as you work on your reviews, please let me know. Sometimes reviewers have permissions issues with being able to check the checkboxes, if that comes up for you, just let me know and I should be able to fix it.

adamhsparks commented 3 years ago

I have to admit, I'm a bit slow off the mark here and can't remember what I've actually done. If I check this invitation, https://github.com/openjournals/joss-reviews/invitations GitHub says it's expired. Did I manage to accept before that happened?

kbarnhart commented 3 years ago

@whedon re-invite @adamhsparks as reviewer

whedon commented 3 years ago

OK, the reviewer has been re-invited.

@adamhsparks please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations

kbarnhart commented 3 years ago

This should be a fresh invite link. Let me know if you have issues with check-boxes after using it.

adamhsparks commented 3 years ago

Ok, that's great. Now the tick boxes work for me. 🙏

kbarnhart commented 3 years ago

@mikerspencer and @adamhsparks, a friendly reminder that we are nearing the end of the six week period in which JOSS requests you complete your reviews. If you have any questions, please let me know.

adamhsparks commented 3 years ago

Thanks for the reminder. This is on my to-do list for today.

adamhsparks commented 3 years ago

Overall the package is well written and I'm impressed with the real-world use-case examples in the vignettes. This will be a great benefit to the community by providing an easy to use resource for a reproducible workflow that centralises data fetching for this data resource. Other tools that offer access to this resource are discussed, along with the advantages of this tool over them.

The contribution guidelines are some of the most clear I've seen. Well done!

I've opened a few issues in the repository for the code or documentation issues I've found or other thoughts.

Here are some more specific comments on the paper and package.

  1. I think that the paper would benefit from some edits for clarity and flow, so I've not ticked that off yet. I found the writing to be mostly clear, all of the content was covered as required, but in some places the language was awkward and hard to follow.
  2. Two functions have zero test coverage, geonames_download() and borders_download(), the reason for this isn't clear to me.
  3. I noted that the test coverage in the README badge, ~87% and paper, don't match what I get locally, ~80%. I'm unsure of the reasons for such a discrepancy?
  4. There are four vignettes in the pkgdown website, but only three in the package as installed from the master branch?
  5. In several places "digital elevation model" is spelled as "Digital Elevation Model", this is not correct, it is not a proper noun.
  6. The package passes all the normal R CMD checks and whatnot and has even used CI to run lintr on the code, which really shows effort put into writing good code. 🙏
kbarnhart commented 3 years ago

@adamhsparks thanks for your review. @kadyb if discussion is necessary regarding any of the comments I'd recommend making an issue in the main repository. Otherwise you can respond here.

@mikerspencer a friendly reminder to provide an update on your review.

Thanks again to both of you for contributing to the JOSS review process.

mikerspencer commented 3 years ago

Thanks @kbarnhart. It's amazing 6 weeks have passed so quickly. My son's school started back today, so I've suddenly got a normal amount of time. I'll work through the review and get it back today or tomorrow.

kbarnhart commented 3 years ago

@mikerspencer thanks for letting me know. At JOSS, we aim to be very understanding that the pandemic has changed people's time availability (e.g. , see this blog post from early in the pandemic). Just keep me posted if you need more time beyond this week.

mikerspencer commented 3 years ago

@kbarnhart I've finished my review. There's a lot to commend, especially how clearly the package is structured and documented.

I've raised a couple of minor issues (61, 62 & 63), but unfortunately I was unable to test functionality due to issue 64. Issue 64 is a connection error when downloading datasets, which consistently occurs at ~15 MB through a download. The issue doesn't affect smaller downloads, e.g. the 2.4 MB files in the automated tests. I've tested at two different times of day and am reasonably sure my internet connection is stable. I'm unsure what to recommend, but have suggested adding larger files to the automated download tests.

kbarnhart commented 3 years ago

@mikerspencer thanks for doing this review.

@kadyb, when you think you have addressed all of the points the reviewers have raised, please ping them and me here so they can re-visit their review checklists.

If you have any questions for me, please tag me on this issue (or in the in-repo issues).

kadyb commented 3 years ago

Hello @adamhsparks, thanks for all the suggestions! Here are my explanations for your review:

I think that the paper would benefit from some edits for clarity and flow, so I've not ticked that off yet. I found the writing to be mostly clear, all of the content was covered as required, but in some places the language was awkward and hard to follow.

We made some text improvements - kadyb/rgugik@9add06b, but we are also interested in your feedback. Do you have any other recommendations on how to improve the text?

Two functions have zero test coverage, geonames_download() and borders_download(), the reason for this isn't clear to me.

These functions are just for data download and do not contain any complex logic. They are not automatically tested because the downloaded files are large (e.g., 375 MB) and server space is limited.

I noted that the test coverage in the README badge, ~87% and paper, don't match what I get locally, ~80%. I'm unsure of the reasons for such a discrepancy?

This is related to the above issue. Three functions are excluded from automatic testing due to large download files sizes, and two functions because IP addresses from GitHub are blocked (but you can still test them locally). Here is the "test coverage" configuration file. Please run this code to get identical results locally:

fun_excl = c("geonames_download", "borders_download")
cov = covr::package_coverage(function_exclusions = fun_excl)
cov
#> rgugik Coverage: 87.48%

There are four vignettes in the pkgdown website, but only three in the package as installed from the master branch?

The "Spatial Databases" vignette is an article containing translated names only (unlike the other vignettes that provide practical examples of searching, retrieving and processing data), so we felt that it should only be available on the package website.

In several places "digital elevation model" is spelled as "Digital Elevation Model", this is not correct, it is not a proper noun.

Fixed in https://github.com/kadyb/rgugik/pull/67.

kadyb commented 3 years ago

@whedon generate pdf

whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

kadyb commented 3 years ago

Hello, @mikerspencer thanks for the review and praise! I would like to respond here to the downloading problem you described:

(...) unfortunately I was unable to test functionality due to issue 64. Issue 64 is a connection error when downloading datasets, which consistently occurs at ~15 MB through a download. The issue doesn't affect smaller downloads, e.g. the 2.4 MB files in the automated tests. I've tested at two different times of day and am reasonably sure my internet connection is stable.

Both reviewers confirmed this problem. On our side it is hard to reproduce, but the error is certainly there. We think it rather affects foreign users (we suppose it could be due to lower server priority). Nevertheless, thanks to the very active and helpful attitude of the reviewers, we tested two solutions to prevent this problem:

  1. All download functions in the package use the utils::download.file() backend, which supports different download methods. The default method failed, however reviewers confirmed that changing the method helps to resolve the issue. We have therefore corrected the error returned to clearly inform the user that they can change the method (https://github.com/kadyb/rgugik/pull/68).
  2. In the long run (after review), we will add another independent download tool (curl::curl_download()) as a backend. However, we detected a problem on Windows (it has been reported to the package developer) and it requires more testing to ensure smooth operation (user can test this using the development version of the package).

More information can be found here. We believe that the actions taken are sufficient.

mikerspencer commented 3 years ago

@kadyb good work on the solution. Which version should I test to complete my review? I presume the updated error returned in not yet on CRAN?

kadyb commented 3 years ago

@mikerspencer, yes, you're right. Currently, all changes have been made to the development version on GitHub.

Please use the code below to install the development version (this will allow to test the first proposed solution):

# install.packages("remotes")
remotes::install_github("kadyb/rgugik")

To test the second solution (downloading files via curl package) , please use this:

remotes::install_github("kadyb/rgugik@usecurl")
kbarnhart commented 3 years ago

@kadyb thanks for making these changes in response to reviewer comments.

@adamhsparks and @mikerspencer once you've had a change to evaluate these changes, please let me know if additional issues remain. Thanks again for contributing reviews.

mikerspencer commented 3 years ago

@kbarnhart I've finished my review. It's a really solid piece of work and was a pleasure to review. Well done to @kadyb.

danielskatz commented 3 years ago

@mikerspencer - I didn't realize you were reviewing this submission as well - now I feel bad for bugging you on the other one, but not bad enough that I'm not semi-subtly reminding you about that one too :)

mikerspencer commented 3 years ago

@danielskatz Yes, a little bit of awkward timing that the other one came back at a similar time. Don't worry, I've not forgotten - it's in my home tabs, so I see it each time I open a browser session. I'd like to get it off my desk this week, for my own sanity along with yours and the authors!

kadyb commented 3 years ago

@mikerspencer, thank you very much for your time, all comments and kind words. Good luck with the second review!

adamhsparks commented 3 years ago

Code all looks good. I've added comments in the PDF that I've attached here for my comments on the paper as requested.

10.21105.joss.02948.pdf

kadyb commented 3 years ago

@adamhsparks, thanks. I corrected all these shortcomings in https://github.com/kadyb/rgugik/commit/265692224a7b6398cdec68f7c89cc0c39347ad7c. Test coverage should be ~87% (https://github.com/kadyb/rgugik/commit/ee186247c3fd885bd61034a7fe53a3171b12b3b0).

kbarnhart commented 3 years ago

@adamhsparks and @mikerspencer thanks so much for revisiting your reviews after @kadyb made revisions.

@adamhsparks it looks like there are a couple of remaining check boxes on your list. If these issues have been addressed, please check them off. Otherwise, let @kadyb know what is needed to address them.

adamhsparks commented 3 years ago

@kbarnhart, since @kadyb has just addressed my comments I can now go back and check everything and will update.

adamhsparks commented 3 years ago

The paper's grammar looks good now but the test coverage, both with the badge and with the paper is still an issue. I just pulled the latest and ran covr::report() and got this result of ~80% coverage. Not 87% or 89%?

Screen Shot 2021-03-15 at 10 06 59 am
kadyb commented 3 years ago

geonames_download() and borders_download() functions are just for data download and do not contain any complex logic. They are not automatically tested because the downloaded files are large (e.g., 375 MB) and server space is limited. We use this configuration file to calculate the test coverage value on codecov.io. Please run this code to get identical results locally:

fun_excl = c("geonames_download", "borders_download")
cov = covr::package_coverage(function_exclusions = fun_excl)
cov
#> rgugik Coverage: 87.55%
adamhsparks commented 3 years ago

Ah, thanks for the explanation.

I’m curious, why not use the #nocov tags in these files to exclude coverage if they aren’t/don’t need to be tested?

On 15 Mar 2021, at 6:15 pm, Krzysztof Dyba @.***> wrote:

geonames_download() and borders_download() functions are just for data download and do not contain any complex logic. They are not automatically tested because the downloaded files are large (e.g., 375 MB) and server space is limited. We use this https://github.com/kadyb/rgugik/blob/master/.github/workflows/test-coverage.yml configuration file to calculate the test coverage value on codecov.io https://codecov.io/gh/kadyb/rgugik. Please run this code to get identical results locally:

fun_excl = c("geonames_download", "borders_download") cov = covr::package_coverage(function_exclusions = fun_excl) cov

> rgugik Coverage: 87.55%

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openjournals/joss-reviews/issues/2948#issuecomment-799297804, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAYMIAWTLDURIWUB3ZA22Z3TDXM5PANCNFSM4V25BBEQ.

adamhsparks commented 3 years ago

@kbarnhart, @kadyb has done a fantastic job with this. I've ticked off the remaining boxes (my previous question notwithstanding).

kadyb commented 3 years ago

I’m curious, why not use the #nocov tags in these files to exclude coverage if they aren’t/don’t need to be tested?

I could be wrong, but I think the preferred way is to use the function_exclusions argument. However, there are many ways to achieve the effect.

@adamhsparks, thanks for your time and helpful comments!

kadyb commented 3 years ago

@whedon generate pdf

adamhsparks commented 3 years ago

Thanks, in that case I may be doing it wrong! I’ve learned something new doing this review.

On 15 Mar 2021, at 7:00 pm, Krzysztof Dyba @.***> wrote:

I’m curious, why not use the #nocov tags in these files to exclude coverage if they aren’t/don’t need to be tested?

I could be wrong, but I think the preferred way is to use the function_exclusions argument. However, there are many ways to achieve the effect.

@adamhsparks https://github.com/adamhsparks, thanks for your time and helpful comments!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openjournals/joss-reviews/issues/2948#issuecomment-799326289, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAYMIAXE345LYAWPDDRMNKLTDXSGFANCNFSM4V25BBEQ.

whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

kbarnhart commented 3 years ago

@whedon check references

whedon commented 3 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.32614/RJ-2018-009 is OK
- 10.2830/63132 is OK

MISSING DOIs

- None

INVALID DOIs

- None
kbarnhart commented 3 years ago

@whedon generate pdf

whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

kbarnhart commented 3 years ago

@kadyb at this point could you:

I can then move forward with accepting the submission.

kadyb commented 3 years ago
  1. https://github.com/kadyb/rgugik/releases/tag/v0.2.1 (version tag: v0.2.1)
  2. https://zenodo.org/record/4606706
  3. Done.
  4. https://doi.org/10.5281/zenodo.4606705 (10.5281/zenodo.4606706)

If something is wrong, please let me know.

kbarnhart commented 3 years ago

@whedon set v0.2.1 as version

whedon commented 3 years ago

OK. v0.2.1 is the version.