Closed mbjoseph closed 6 years ago
Thanks for you submission, @mbjoseph! This fits well within our scope.
goodpractice::gp()
checks are pretty clean! Please add relevant links to your description and check that long code lines are neccessary:
── GP smapr ────────────────────────────────────
It is good practice to
✖ write unit tests for all functions, and all package code in
general. 6% of code lines are covered by test cases.
R/download_smap.R:32:NA
R/download_smap.R:33:NA
R/download_smap.R:34:NA
R/download_smap.R:35:NA
R/download_smap.R:36:NA
... and 292 more lines
✖ add a "URL" field to DESCRIPTION. It helps users find information
about your package online. If your package does not have a
homepage, add an URL to GitHub, or the CRAN package package page.
✖ add a "BugReports" field to DESCRIPTION, and point it to a bug
tracker. Many online code hosting services provide bug trackers for
free, https://github.com, https://gitlab.com, etc.
✖ avoid long code lines, it is bad for readability. Also, many people
prefer editor windows that are about 80 characters wide. Try make
your lines shorter than 80 characters
tests/testthat/test-extract_smap.R:117:1
tests/testthat/test-find_smap.R:10:1
tests/testthat/test-find_smap.R:20:1
───────────────────────────────────────────────
One real spelling error from spelling::spell_check_package()
WORD FOUND IN
availble smapr-intro.Rmd:36
A note: There's a poorly documented CRAN feature for including bioconductor packages using a biocViews:
field in DESCRIPTION, which you should add to ease installation for users. See here.
I'll go ahead and seek reviewers.
Reviewers: @ldecicco-USGS @marcosci Due date: 2008-07-19
Reviewers @ldecicco-USGS and @marcosci assigned. Due date 2008-07-19. Note our review guidelines have recently been consolidated in our development guide.
@mbjoseph You can now add a review badge to your package README:
[![](https://badges.ropensci.org/231_status.svg)](https://github.com/ropensci/onboarding/issues/231)
Great - thanks Noam!
Hi @marcosci and @ldecicco-USGS! Just a friendly reminder that your reviews are due in 1 week.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing:
2.5-3 hrs
I really like smapr - great work! It was a very pleasant review, the package works like a charm and did exactly what it promised. I see clear value in the package and think it will be a great contribution to the geospatial ecoystem of rOpenSci.
The package documentation is comprehensive and well written, the code is neat and the authors provide rigorous unit testing. Overall, the package is in a good shape, so the following comments are addressing minor things that could be improved.
I tried the installation on a red hat and ubuntu linux machine - no problem, either with CRAN or github. With @noamross hint on biocViews
the installation is now also quite smooth, since you don't need to install the rhdf5 first. Remember to remove this section ofter submtting the next version to CRAN.
Nevertheless, for users new to UNIX systems it might be worth stating that one needs to install local libraries for httr to work.
A short hint on libssl-dev
for debian system and openssl@1.1
for brew on maxOS would be sufficient.
I think it would be worth the effort to have one or two sentences more after:
An R package for acquisition and processing of NASA (Soil Moisture Active-Passive) SMAP data
... stating what data you can retrieve, so that users directly know something about SMAP without following the link.
Maybe move the docker part under installation (as it is an alternative way for that)?
Is there a reason why don't use it? It wouldn't be a necessity from my side, but I think it's worth the effort as navigating smapr would be way easier than through the pdf documentation.
While you provide good examples on how to use the functions in smapr, I still think it would be worth to give some examples/links/... on how to further work with the data retrieved from your package.
I can just speak for the field I am working in, but I think a lot of people there would like to crop the resulting data in the end. So maybe start something in the line of:
us <- raster::getData("GADM", country="USA", level=1)
california <- subset(us, NAME_1 == "California")
california <- sp::spTransform(california, proj4string(sm_raster))
sm_crop <- raster::crop(sm_raster, raster::extent(california))
sm_mask <- raster::mask(sm_crop , california)
raster::plot(sm_mask)
I would consider this whole point a bonus, but when I think back starting with spatial data in R was a lot of googling how to reproject, crop, etc.
Something for lazy people like me:
set_credentials <- function(ed_un, ed_pw){
write(paste0("Sys.setenv(ed_un = \"",ed_un,"\", ed_pw = \"", ed_pw, "\")"),
file=file.path(Sys.getenv("HOME"), ".Rprofile"),append=TRUE)
}
An auxiliary function like that would help to simplify the need for setting the oauth credentials. Plus, new users don't have to locate .Rprofile and/or one can set it easily on remote machines.
Just to bring that on your radar: https://github.com/r-spatial/stars/. If it hits CRAN I think this would be a valuable replacement for rhdf5 and raster in your package, as it could streamline the whole handling process of SMAP data in general.
This also came up in our submission here and I only found an issue discussing sf over sp for packages onboarding rOpenSci. But in your case I think this is more relevant than it was for us.
To save @noamross some work:
codemetar::write_codemeta()
for smapr to create a CodeMeta file.That's a rather short review :open_mouth: ... I would blame the quality of the package :wink: Going through the reviewing guidelines, there is nothing I haven't stated here that could be improved. The code style is very clear, there are no duplications and getting to know the package as a user was pleasant. Other than donating money to NASA for faster servers and hence faster downloading speed, I do not see any possibilities to improve the speed of the functions. The functions and variables all follow rOpenSci guidelines.
Thanks for your review, @marcosci! @mbjoseph, feel free to wait for a second review before addressing comments.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
The description is clear of what the package does. It wouldn't hurt to add a sentence or two right at the beginning expanding on the type of data and type of processing.
My eyes swept over the Bioconductor instructions, so at first, it took me a bit to figure out how to install the rhdf5
package. Could you add the code directly?
source("https://bioconductor.org/biocLite.R")
biocLite("rhdf5")
Before Finding the data, it would probably be a good idea to have a short section on going to the NASA page to get a user name and password. The error message however was very useful, so I knew exactly what to do.
When I ran:
available_data <- find_smap(id = 'SPL4SMAU', dates = '2018-06-01', version = 3)
Error in rvest::html_table(nodes)[[1]] : subscript out of bounds
Debuging, I got to this line:
top_level_response <- GET(path, auth())
and it appeared to be a 401. I eventually realized I had typed the wrong password. I think you could check on that response and make the error message there more clear. Once I got that straighted up, it was off to the races...I thought.
The download_smap
function took a long time. I'd add a quick sentence in the vignette to give the user an idea on how long it might take (or how big the files are). Any indicator messages would be nice here too (am I stuck? did it start? did it get through some of those available_data?).
The README explains what id to use, but the vignette does not. I'm not a big fan of copy/paste, but it's easier to access the vignette once I'm already in R (compared to the readme). Could you add that "Dataset id" table (with the id's) so I can think about looking for more data?
The vignette provided exactly the information needed to run the functions in the package. That being said...I was really curious what I could do next. Could you show how to filter the data into some region?
In the find_smap
arguments, "id" gives one example. It would be useful to say something like "See Details for a description of all ids" (or something like that....)
The examples worked!
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing: 2
Nice package!
Do you have a connection to the data? I'm going to guess you'll get lots of questions on the data itself and why the services are down (if they go down!). Those requests usually are phrased "smapr is broken!!!!".
No matter what your relationship is, it would be a good idea to make sure you the data providers know this package is out there. They may be interested in having you add a "user agent" for example, so they (and you) can get an idea of how often data is downloaded from the smapr
package.
Otherwise, the code looks nice, the style is consistent, the test coverage is excellent.
Thanks @marcosci and @ldecicco-USGS for your reviews! This is all great feedback.
Thanks your review @ldecicco-USGS! @mbjoseph, if you use issues in your own repo to track how you address the reviews, please summarize the changes here when you are done.
Will do @noamross - thanks!
Thanks again @marcosci, @ldecicco-USGS, and @noamross for your time and reviews. Here are the point-by-point changes:
smapr::set_smap_credentials()
function to make authentication easier https://github.com/earthlab/smapr/issues/41Thanks for pointing out the stars package! I'll keep my eye on it as a potential replacement for rhdf5 and raster.
My eyes swept over the Bioconductor instructions, so at first, it took me a bit to figure out how to install the rhdf5 package. Could you add the code directly?
Shoot! Fortunately this shouldn't be needed anymore with the biocViews()
edit that @noamross suggested.
Added a section to vignette to point people to NASA's Earthdata portal to get a username and password, with instructions to use set_smap_credentials
https://github.com/earthlab/smapr/commit/410f3906175fb479eb8bd92409789cb580050463
Added a function to check for 401 errors and return a helpful error message https://github.com/earthlab/smapr/issues/43
Added verbose
argument to download_smap
that prints when files are being downloaded https://github.com/earthlab/smapr/issues/44
Added note in vignette that downloading data may take a while https://github.com/earthlab/smapr/issues/42
Added dataset ID table to vignette (from README) https://github.com/earthlab/smapr/issues/42
Added some example downstream applications to vignette including cropping/masking to a spatial region, aggregating, and working with the actual values from the rasters to compare surface vs. rootzone soil moisture https://github.com/earthlab/smapr/issues/42
Added pointer to Details section of find_smap
help page for users to see all dataset IDs https://github.com/earthlab/smapr/commit/3b68bf560a6ff545000deeb25135a5dbcc293bbd
Do you have a connection to the data?
Not really - other that we work in the same city! The National Snow and Ice Data Center (which hosts the data) is aware of the package, however.
Sorry - just accidentally hit "Close and comment"! The above summarizes the changes that were made. I really appreciate everyone's help with this - the package is much improved now thanks to these detailed (and friendly :smiley:) reviews!
Thanks, @mbjoseph! @marcosci and @ldecicco-USGS, please let us know if Max's changes address your comments.
They sure do. Great job @mbjoseph !
Superb @mbjoseph ! Just had a look at everything, looks perfect! Well done.
Quick note from a final set of package checks, @mbjoseph: Please add .github
to your .Rbuildignore
Once you've done that, approved! Thanks for submitting and @marcosci and @ldecicco-USGS for your reviews!
To-dos:
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
"[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname)
.Should you want to awknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"
-type contributors in the Authors@R
field (with their consent). More info on this here.
Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or long-form post with more narrative about its development. (https://ropensci.org/blog/). If you are interested, @stefaniebutland will be in touch about content and timing.
We've started putting together a gitbook with our best practices and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.
Great - thanks @noamross! I have updated the .Rbuildignore file (https://github.com/ropensci/smapr/commit/9a6e31c1774120dda193229ed94ef4dcbcac0257), transferred the repo to the ropensci org, added the footer (https://github.com/ropensci/smapr/commit/a8f9198ba70aac7a0e69ad12e096d702bdea2b5f), updated badges (https://github.com/ropensci/smapr/commit/acfe30056817bbb562d5e728dccc3aaab63e9b0b), and updated the Codemeta file (https://github.com/ropensci/smapr/commit/dbf3db6735300a516464022ab12ca23000bcfcad).
I'd love to add @ldecicco-USGS and @marcosci as reviewers if they are comfortable with it.
I'd also be happy to write a blog post about smapr, and I'll take a look at the onboarding section of the gitbook!
Excellent. @stefaniebutland will be in touch about the blog schedule, in the meantime instructions for putting together post are here. https://github.com/ropensci/roweb2/blob/master/readme.md. I'm closing the issue but feel free to carry on that conversation here.
I've made you admin of the repo over the the rOpenSci org. Your other collaborators have write permissions. Feel free to increase their permission levels if needed be.
Hello @mbjoseph. Glad to hear you're interested in contributing a post. Our hope is that it gets more eyes on your work!
This link will give you many examples of blog posts by authors of onboarded packages so you can get an idea of the style and length you prefer: https://ropensci.org/tags/review/. Technotes are here: https://ropensci.org/technotes/.
As Noam said, here are some technical and editorial guidelines for contributing a blog post: https://github.com/ropensci/roweb2#contributing-a-blog-post. We ask that you submit your draft post via pull request a week before the planned publication date so we can give you some feedback.
At this point I have open publication slots in September. Do you want to pick a date for draft submission?
Happy to answer any questions.
Heya @stefaniebutland - September sounds great! How about Sep 15 deadline to submit a draft?
draft Sep 15 sounds great. I'll mark my calendar to ping you here then. Don't hesitate to ask questions here, or on Slack now that you're there :-)
@mbjoseph Are you still good with a Sep 15 draft for publication 2018-09-25?
@stefaniebutland yes - thanks for checking!
Summary
The smapr package discovers, downloads, and extracts global soil moisture data from the NASA SMAP mission, producing
raster
objects from HDF5 files.URL for the package (the development repository, not a stylized html page): https://github.com/earthlab/smapr
Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):
[e.g., "data extraction, because the package parses a scientific data file format"]
data retrieval and munging - smapr finds and downloads data, and extracts gridded spatial from HDF5 files with the correct projection information.
Soil scientists, ecologists, and people working in the remote sensing domain could use the package to quantify changes in soil moisture, detect droughts, generate covariate layers for species distribution models, and validate other soil moisture data products.
There is another smapr package, but it does not support the breadth of SMAP data products that this package does, and development seems to have ceased ~3 years ago: https://github.com/strongh/smapr. This smapr package has more thorough documentation and tests, allows extraction of multiple soil moisture data products, and has been on CRAN since 2016.
Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md
matching JOSS's requirements with a high-level description in the package root or ininst/
.Detail
[ x ] Does
R CMD check
(ordevtools::check()
) succeed? Paste and describe any errors or warnings:[ x ] Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:
Anyone with experience in programmatically finding and getting data from a https server, and/or working with HDF5 raster data would be great - perhaps: