ropensci / software-review

rOpenSci Software Peer Review.

terrainr: Retrieve Data from the USGS National Map and Transform it for 3D Landscape Visualizations #416

Closed mikemahoney218 closed 3 years ago

mikemahoney218 commented 3 years ago

Submitting Author: Michael Mahoney (@mikemahoney218)
Repository: https://github.com/mikemahoney218/terrainr
Version submitted: 0.2.1
Editor: @ldecicco-USGS
Reviewer 1: @mikejohnson51
Reviewer 2: @sfoks
Archive: TBD
Version accepted: Feb. 19, 2021


Package: terrainr
Type: Package
Title: Retrieve Data from the 'USGS' National Map and Transform it for '3D' Landscape Visualizations
Version: 0.2.0
Authors@R: 
    person(given = "Michael",
           family = "Mahoney",
           role = c("aut", "cre"),
           email = "mike.mahoney.218@gmail.com",
           comment = c(ORCID = "0000-0003-2402-304X"))
Description: Functions for the retrieval and manipulation of 'USGS' National Map 
    data ('<https://viewer.nationalmap.gov/services/>'), enabling users to 
    download elevation and image data for areas of interest. Functions are also 
    provided to transform these data sources into formats that can be used to 
    create 3D elevation models in the Unity rendering engine.
URL: https://mikemahoney218.github.io/terrainr/, https://github.com/mikemahoney218/terrainr
BugReports: https://github.com/mikemahoney218/terrainr/issues
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Imports: 
    base64enc,
    httr,
    raster,
    magick (>= 2.5.0),
    gdalUtils,
    methods,
    png,
    utils,
    gdalUtilities,
    sf,
    rlang
RoxygenNote: 7.1.1
Suggests: 
    testthat,
    covr,
    progressr,
    knitr,
    rmarkdown,
    progress,
    ggplot2,
    jpeg,
    tiff
Config/testthat/parallel: true
Config/testthat/edition: 3
VignetteBuilder: knitr

Scope

terrainr enables users to retrieve geospatial data from the USGS National Map family of APIs and process it into both generic rasters for use in analysis and a specialized format which may be imported into the Unity rendering engine.

The data retrieval functions within terrainr are designed to be useful to any researcher interested in geospatial data for areas within the United States, particularly digital elevation models and orthoimagery. The data processing aspect of terrainr is designed to enable more users to use game engines as GIS, producing high-resolution 3D models from spatial data.

I am only aware of two packages which mention the USGS National Map API:

The elevatr package also returns elevation data from the USGS Elevation Point Query Service and AWS Terrain Tiles, which are similar -- but not the same -- sources as used in this package.

This package provides access to 9 API endpoints which I do not believe are incorporated in any other R package. Transforming tiles to be imported into Unity as heightmaps depends on a newly-contributed feature to the magick package, which leads me to believe this package is the first to provide this functionality.

N/A

N/A

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options
- [x] The package is novel and will be of interest to the broad readership of the journal.
- [x] The manuscript describing the package is no longer than 3000 words.
- [x] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

melvidoni commented 3 years ago

Hello @mikemahoney218, thanks for submitting your package. @ldecicco-USGS will be your Handling Editor. You will hear from her soon.

ldecicco-USGS commented 3 years ago

Great news: we have our first reviewer, @mikejohnson51. I may or may not try to find another Mike to be reviewer 2 😬

ldecicco-USGS commented 3 years ago

I had a few people respond that if I didn't find someone by the new year, they'd volunteer, so I expect a 2nd reviewer any day. In the meantime, I ran a check and goodpractice::gp(). The check was fine.

The gp() left this message:

Preparing: description
Preparing: lintr
Preparing: namespace
Preparing: rcmdcheck
── GP terrainr ──────────────────────────────────────────────────────

It is good practice to

  βœ– write unit tests for all functions, and all
    package code in general. 43% of code lines are covered by
    test cases.

    R/combine_overlays.R:119:NA
    R/georeference_overlay.R:76:NA
    R/get_bbox.R:48:NA
    R/get_bbox.R:49:NA
    R/get_tiles.R:133:NA
    ... and 458 more lines

  βœ– write short and simple functions. These
    functions have high cyclomatic complexity:get_tiles (55).
─────────────────────────────────────────────────────────────────────

Which is not bad! If you want to work on the code coverage, that's always good.

mikemahoney218 commented 3 years ago

Fantastic! Thank you so much for the update :)

One thing to mention -- a lot of the package tests are set to skip on CRAN, as they either take too long, require internet access, or require GDAL. They should still be able to run locally, however (and on CI) with Sys.setenv(NOT_CRAN='true') -- when I do so, I get the following output:

> Sys.setenv(NOT_CRAN='true'); goodpractice::gp()

Preparing: covr
Preparing: cyclocomp
<snip r cmd check>
Preparing: description
Preparing: lintr
Preparing: namespace
Preparing: rcmdcheck
── GP terrainr ──────────────────────────────────────────────────────
It is good practice to

  βœ– write unit tests for all functions, and all package code in general. 96% of code lines are covered by test cases.

    R/combine_overlays.R:119:NA
    R/get_bbox.R:48:NA
    R/get_bbox.R:49:NA
    R/hit_api.R:160:NA
    R/hit_api.R:161:NA
    ... and 26 more lines

  βœ– write short and simple functions. These functions have high cyclomatic complexity:get_tiles (55).
─────────────────────────────────────────────────────────────────────

Thanks again for the update! :smile:

ldecicco-USGS commented 3 years ago

Ah right! Thanks for the reminder, I just got a new computer and my old computer had NOT_CRAN=true set by default.

mikemahoney218 commented 3 years ago

Hi all! Just wanted to note here that I've changed the main branch from master to main in line with GitHub's guidance. There shouldn't be any code changes (well, a single call got restyled) or anything to interrupt review. Sorry if there are any issues associated with the change!

mikemahoney218 commented 3 years ago

I believe I've spoken too soon! It appears that the transportation API has moved servers, so version 0.2.0 no longer works for that endpoint. I've pushed a fix to the development branch (here's the relevant diff) -- it was only a two-line change to handle transportation in the same way as DEMs and orthoimagery.

I apologize for needing to make this update while the package is under review! The version originally submitted still exists as release v 0.2.0. Does it make sense to update the version under review to the version with this fix (would become version 0.2.1), or should I wait and roll this update in with any revisions requested after review?

ldecicco-USGS commented 3 years ago

I'd vote to update the version under review to 0.2.1 with the fix. If that causes any confusion, I'm sure we can deal with it.

mikemahoney218 commented 3 years ago

Great! I've cut a release of 0.2.1 and split it into a release branch, both of which should hopefully remain in sync with main until reviewer comments come in. All the changes from 0.2.0 should be documented on the release page and in NEWS.

Thanks!

ldecicco-USGS commented 3 years ago

We've got a second reviewer @sfoks πŸŽ‰

mikejohnson51 commented 3 years ago

Hi @mikemahoney218 @ldecicco-USGS ,

Please find my review of terrainr attached:

Overall this is a very nice package. I want to emphasize this because, by nature, the following comments are more critique than compliment. The package functions well, is stable, and installed without problems.

I did not get to test any of the Unity functionality since I don't have it. Other than that, here are some key points:

Key Points:

Branding

Code Review:

Below are my notes as I went through the files and package:

Install

devtools::install_github("mikemahoney218/terrainr")
library(terrainr)

Great! πŸ‘

add_bbox_buffer.R

There are a lot of functions in this package that recreate sf functionality. These could be simplified, making the code much easier to maintain. While the implementation of these methods is obviously well considered, it is fragile compared to the GDAL/GEOS/PROJ backend supporting sf. For this package to attract wide use, I think the package-specific classes/structures must be removed. For example, add_bbox_buffer:

Could be simplified to something like:

library(sf)
bb = st_as_sfc(st_bbox(c(ymin = 44.04905, ymax = 44.17609,
                         xmin = -74.01188, xmax = -73.83493), crs = 4326))
tmp = st_transform(bb, 5070) # EPSG ensures meters
tmp = st_buffer(tmp, 10, 
                joinStyle = 'MITRE', 
                endCapStyle = "SQUARE", 
                mitreLimit = 2)

out = st_transform(tmp, 4326) # Put back to Lat Long

https://github.com/jhollist/elevatr/blob/4c45d548231400284f72b92b42dff6d9fd9a0928/R/get_elev_raster.R#L9

classes.R

I think your toolset has a lot of value outside the terrainr environment, but it becomes very limited with the imposed classes for things like coordinate sets and bounding boxes, which already have well-defined and common class structures that play nicely in the R spatial environment. I would like to see all the specific classes removed from the package, or else a strong justification for what they add. I am aware this will take a considerable rewrite of the package but think it is well worth it.

get_bbox.R

Again, sf::st_bbox does everything needed. No need to write unique methods to extract bboxes from common spatial classes. This also removes your dependency on rlang.

For example:

st_bbox(c(ymin = 44.04905, ymax = 44.17609,
          xmin = -74.01188, xmax = -73.83493), crs = 4326)

st_bbox(raster::raster(ncol=100, nrow=100))

get_bbox_centroid

Same thing. st_centroid() does what you need. No need for specific classes and logic. For example:

get_bbox_centroid(list(
     c(lat = 44.04905, lng = -74.01188),
     c(lat = 44.17609, lng = -73.83493)))

# Equals 

bb = st_as_sfc(st_bbox(c(ymin = 44.04905, ymax = 44.17609,
                         xmin = -74.01188, xmax = -73.83493), crs = 4326))

st_centroid(bb)

get_tiles.R

hit_national_map_api.R

# Set up
url = httr::parse_url("https://elevation.nationalmap.gov/arcgis/rest/services/3DEPElevation/ImageServer/exportImage")

query_arg <- list(
      bboxSR = 4326,
      imageSR = 4326,
      size = paste(8000, 8000, sep = ","),
      format = "tiff",
      pixelType = "F32",
      noDataInterpretation = "esriNoDataMatchAny",
      interpolation = "+RSP_BilinearInterpolation",
      f = "json"
    )

## 
first_corner <- bbox@bl
second_corner <- bbox@tr

bbox_arg <- list(bbox = paste(
    min(first_corner@lng, second_corner@lng),
    min(first_corner@lat, second_corner@lat),
    max(second_corner@lng, first_corner@lng),
    max(second_corner@lat, first_corner@lat),
    sep = ","
  ))

####======
# ^ all of the above could be replaced with paste(st_bbox(...), collapse = ",")
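
For instance (a quick illustrative check, assuming bb is the sfc bounding-box geometry built in the add_bbox_buffer example above):

paste(sf::st_bbox(bb), collapse = ",")
# "-74.01188,44.04905,-73.83493,44.17609"
# i.e. xmin,ymin,xmax,ymax -- the same order the manual min()/max() logic above produces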

You have a lot of logic wrapping your httr calls which is essentially:

res <- httr::GET(url, query = c(bbox_arg, query_arg))
content = httr::content(res, type = "application/json")

An identical approach, but one that builds in some of that error catching and retrying, would be to change it to:

res2 <- httr::RETRY("GET", url, 
                   query = c(bbox_arg, query_arg), 
                   times = 10, pause_cap = 10)

You could use this approach again to get the images and write them to disk, instead of httr::content(..., "raw"). For example:

tmpfile <- tempfile()
content <- httr::content(res2, type = "application/json")

httr::RETRY("GET", content$href, 
                   httr::progress(), 
                   httr::write_disk(tmpfile, overwrite = TRUE),
                   times = 10, pause_cap = 10)

raster::raster(tmpfile) %>% raster::plot()

merge_rasters.R

The GDAL stuff you have here is (reasonably!!) way more than needed. One big reason to avoid gdalUtils in a package is that it requires an instance of GDAL different from the version that comes with sf. For many users this can be an issue. Here is a brief solution to these issues with much less code and much more utility:

The example you have only gives one raster which I assume wouldn't need to be merged? This should probably be changed, but I pulled an example from one of your other functions:

simulated_data <- data.frame(
  id = seq(1, 100, 1),
  lat = runif(100, 44.04905, 44.17609),
  lng = runif(100, -74.01188, -73.83493)
)

bbox <- get_coord_bbox(lat = simulated_data$lat, lng = simulated_data$lng)
bbox <- add_bbox_buffer(bbox, 100)
img_files = get_tiles(bbox, tempfile())

So the goal is to get this to a single raster right? If so try:

target_prj = st_crs(5070)$proj4string
method = "near"
destfile1 = tempfile()

sf::gdal_utils(util = "warp", 
                 source = unlist(img_files), 
                 destination = destfile1,
                 options = c("-t_srs", as.character(target_prj),
                             "-r", method))

raster::raster(destfile1)

# class      : RasterLayer 
# dimensions : 16782, 14984, 251461488  (nrow, ncol, ncell)
# resolution : 1.151324, 1.151324  (x, y)
# extent     : 1736879, 1754131, 2540862, 2560184  (xmin, xmax, ymin, ymax)
# crs        : +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs 

The cool thing about the above is that it is short and sweet, relies only on sf, and offers you a direct path to letting users specify the output CRS and resampling method (by exposing target_prj, method, and destfile1 as function parameters).

Another reason to keep your bbox objects in standard classes is that you can use them to quickly crop the merged raster in the same call by first writing the bbox object to a shp. So instead of the above you could do:

# From the same coordinates make a 'sf' bbox of your points:
bbox = st_as_sf(simulated_data, coords = c("lng", "lat"), crs = 4326) %>% 
  st_bbox() %>% 
  st_as_sfc() %>% 
  st_as_sf()

# Write temporary shp
tmp_shp = tempfile(fileext = ".shp")
write_sf(bbox, dsn = tmp_shp)

destfile2 = tempfile()

sf::gdal_utils(util = "warp", 
                 source = unlist(img_files), 
                 destination = destfile2,
                 options = c("-t_srs", as.character(target_prj),
                             "-r", method,
                             '-cutline', tmp_shp,
                             '-crop_to_cutline'))

raster::raster(destfile2)

# class      : RasterLayer 
# dimensions : 14095, 14505, 204447975  (nrow, ncol, ncell)
# resolution : 1.151359, 1.151313  (x, y)
# extent     : 1737200, 1753901, 2542188, 2558415  (xmin, xmax, ymin, ymax)
# crs        : +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs 
# source     : /private/var/folders/_d/jkzhpcss17v0sy0zjr7xhrxr0000gn/T/RtmpyEg8av/file220c7958114b 
# names      : file220c7958114b

raster_to_raw_tiles.R

Use sf::gdal_utils("translate", ...) here instead of gdalUtils/gdalUtilities for the same reasons as above.
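
A minimal sketch of that swap (the file names and -srcwin window below are placeholders, not values from the package):

sf::gdal_utils(
  util = "translate",
  source = "merged_raster.tif",
  destination = "tile_0_0.tif",
  options = c("-srcwin", "0", "0", "4097", "4097") # xoff, yoff, xsize, ysize
)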

It may be worth your time to look at this: https://github.com/ropensci/tiler

georeference

The given example does not run because the arguments are in the wrong order. Also, it doesn't seem to need alignment: the input and output are identical.

No need for the tiff package (I think), since raster reads TIFFs already (raster will also read PNG and JPEG, just into RGBA bands).

utils.R

Check out the units package to handle all these unit conversions, but really, the unit given should match the CRS of the input data. Because of this, I might consider being more strict about the units you allow for a buffering call.
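
For example, a generic units sketch (not terrainr code) for one of these conversions:

library(units)
d <- set_units(100, "ft") # a buffer distance supplied in feet
set_units(d, "m")         # 30.48 [m], ready for st_buffer on a metric CRS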

Notes:

Again, there's a lot of good stuff here, I hope this is useful, and feel free to email for any clarification/follow up.

Mike

ldecicco-USGS commented 3 years ago

Thanks so much @mikejohnson51 ! I heard back from @sfoks that the second review should be done sometime early next week. So @mikemahoney218 , feel free to start responding to this first review or wait for both to be submitted.

mikemahoney218 commented 3 years ago

Thank you, Mike, for the thorough review! Below I've attempted to go through your comments line by line; I've collapsed comments that were repeated into single items (mostly httr/sf related) and removed the compliments (though they were much appreciated :) ) for length. All the changes below are now merged to main as version 0.3.0.

Key Points:

* I think there is some nice utility here, but there is also some crossover with other packages like `FedData` and `elevatr`.

* Following this, other than elevation, this is not a "spatial data" retrieval package, as it returns PNGs that are more similar to map tiles than to spatial data. Thus the functionality is also similar to `rosm`.

* I cannot think of many cases for wanting the PNG files in R. That **certainly** does not mean they are not out there, but I think more clarity about what the package is returning (map tiles) and their use would help it gain traction.

I fully agree that the API code overlaps slightly with a few other packages (particularly the FedData package, which uses two of the same services). As far as the API wrapper code is concerned, I see the primary differences as being:

With that said, I think the main differentiator of terrainr is the workflow it introduces for creating 3D terrain tiles with overlays in the Unity engine. There's a lot of buzz in visualization research about using game engines for interactive simulations of landscape dynamics, which is why I originally started working on terrainr; the get_tiles %>% merge_rasters %>% raster_to_raw_tiles flow reduces what used to be an intensely manual process to an all-but-automated method (see for instance this five minute tutorial that walks through manually downloading a heightmap, wrangling the data in GIMP, and then importing it -- without any guidance on adding overlays).
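
For readers unfamiliar with the package, a rough sketch of that flow (argument names are simplified from my reading of the docs and the bounding box is just an example; treat this as an illustration rather than a drop-in script):

library(terrainr)
bbox  <- get_coord_bbox(lat = c(44.05, 44.18), lng = c(-74.01, -73.83))
tiles <- get_tiles(bbox, tempfile(), services = "elevation")
dem   <- tempfile(fileext = ".tif")
merge_rasters(tiles$elevation, dem)                                # mosaic the downloaded tiles
raster_to_raw_tiles(dem, output_prefix = tempfile(),
                    side_length = 4097)                            # write Unity-ready tiles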

I've clarified that most endpoints are returned as tiles. I've also attempted to rework the documentation to emphasize that the data manipulation and export functionality are at least co-equal goals to the retrieval functions; unfortunately, as these functions need data to manipulate, the data retrieval functions have to come before the manipulation functions in any given document. With that said, there are still a number of use cases for these image tiles within R -- see for instance this example with rayshader. I imagine the new geom_spatial_rgb (discussed below) will also be helpful here.

* Use `httr::RETRY` to handle much of your error catching and `httr::write_disk()` to avoid having to read images in as raw bytes/bits.  This will save a large amount of code as well.

I've attempted to use httr::RETRY in the place of my manual backoff before, with no success (I removed my last attempt in October). I attempted again following this review (documented here), but some challenges with this API (sometimes returning failures with a 200 status, returning HTML in lieu of JSON, sometimes returning successes with a 400, and I'm sure other weird aberrant behaviors) can't be handled as well in httr::RETRY as in the manual code. The end result is various CI failures, and retrieving each tile takes notably longer than the current manual backoff approach. As a result, unfortunately I don't see a way to incorporate httr::RETRY at this time.
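
For context, a rough sketch of the kind of manual backoff described here (illustrative only, not terrainr's actual implementation):

get_with_backoff <- function(url, query, max_tries = 10) {
  for (i in seq_len(max_tries)) {
    res  <- httr::GET(url, query = query)
    body <- tryCatch(httr::content(res, type = "application/json"),
                     error = function(e) NULL)
    # Treat the call as successful only if the body parsed as JSON and carries
    # no embedded error object, regardless of the HTTP status code
    if (!is.null(body) && is.null(body$error)) return(body)
    Sys.sleep(2^i * stats::runif(1)) # exponential backoff with jitter
  }
  stop("No usable response after ", max_tries, " attempts")
}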

Following last month's challenge with the transportation endpoint starting to return raw PNGs instead of base64-encoded images, I've changed the way get_tiles works: instead of relying on a hard-coded list of base64 vs. raw image endpoints, get_tiles now checks the type of the returned data (first relevant commit, current version). As a result, I believe I need to load the tiles into memory at some point in order to detect whether they need to be decoded. Because of this, I don't think httr::write_disk will save system resources at the moment.
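
As a sketch of that type check (illustrative only; the real logic lives in get_tiles/hit_national_map_api, and res here is assumed to be an httr response):

img <- httr::content(res)        # httr guesses the content type
img_raw <- if (is.raw(img)) {
  img                            # endpoint returned raw image bytes
} else {
  base64enc::base64decode(img)   # endpoint returned a base64 string
}
tmp_img <- tempfile(fileext = ".png")
writeBin(img_raw, tmp_img)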

* With all of these I think you can drop `base64enc`, `gdalUtils`, `methods`, `gdalUtilities`, and `rlang` from your dependencies, and many of your functions can be slimmed down and/or removed. The result (in my opinion) will be a more lightweight package with higher utility and easier integration into spatial workflows.

At the end of the day, I was able to drop methods, gdalUtilities, and rlang; gdalUtils is still in use in raster_to_raw_tiles for reasons discussed below and base64enc is required for decoding endpoints that don't return data as a raw vector.

Branding

* It's a weird point but one that I think is important. `terrainr` as a name sells your package short while also being misleading. Terrain is one of 9 endpoints offered but is also anomalous in that it is the only "spatial" data available in your package (in the traditional vector/raster sense).

* Arguably, terrain is also the best-served of these resources in the R ecosystem, with `elevatr` and `FedData`. On the surface, `elevatr` is a better choice for getting elevation data simply because it offers multiple resolutions and the Mapzen server is quicker than the ArcGIS server.

* My expectation of the package coming in given the description of "retrieve geospatial data" was that for resources like the NHD, NHDHighRes, WBD, and transportation, I would be getting the vector data. Particularly since these are offered by the National Map. Therefore, it is very important to be upfront that the package is offering endpoints to (effectively) mosaicked web tiles (with one spatial data source).

* All of this is not to discount the utility of your package because I think it is one of the few to offer access to these resources which makes it very unique! But the functionality needs to be explicit because, as is, someone coming in with the expectation that they will be able to query spatial data will be disappointed, while those looking for map tile data will likely pass your package by.

This comment actually took me by surprise originally! It had never even occurred to me that someone would see the name terrainr and associate it with the elevation endpoints (though of course it makes sense now). The name harkens back to the original goal of the package: creating terrain tiles in Unity. As discussed in the Unity vignette, "terrain" is used as a very precise name for what's being created (physically explicit heightmap tiles created from planar-interlaced raw image files), a term used throughout the Unity UI. As mentioned in the earlier comments, I've attempted to make this goal of the package more explicit in the documentation and DESCRIPTION file, as well as more explicitly expressing that data is obtained as tiles.

As a quick side note, I want to point out that get_tiles has an argument resolution which lets you set pixel side length in meters.
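
For example (illustrative values; bbox here stands for any supported input, such as an sf object):

# request roughly 5 m pixels from the elevation endpoint
tiles <- get_tiles(bbox, tempfile(), services = "elevation", resolution = 5)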

* One **huge** opportunity for your package, which I think you have already worked out, is how to integrate these resources with ggplot as a base map. It is a constant frustration of mine (and many people I know) to get nice base maps for ggplot.

* Yes, `tmap` has some utilities for this and `ggmap` is out there (I have always struggled with it), but there is room here for growth. If you could add a `geom_terrainr()` that played nicely with `ggplot`, I think that would be immediately useful to many people.

Added geom_spatial_rgb (relevant commit) for displaying RGB tiles.
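
A sketch of the intended usage (aesthetic names follow my reading of the new docs; img is assumed to be an RGB tile returned by get_tiles/merge_rasters):

library(ggplot2)
ggplot() +
  geom_spatial_rgb(data = img,
                   mapping = aes(x = x, y = y, r = red, g = green, b = blue)) +
  coord_sf(crs = 4326)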

Code Review:

add_bbox_buffer.R

I've adopted your suggestion about how to shift these to sf. The older behavior is preserved in order to manipulate latitude/longitude data (but still uses units and sf as suggested). Relevant commit.

classes.R

get_bbox.R

get_bbox_centroid

These have been removed from all user-facing code. The classes still exist for communication between package functions, but are no longer the only accepted inputs for functions and are not returned by any function. sf or Raster objects can be used in their place across the board. Relevant commit.

get_tiles.R

* This function, even for the example, takes a while to run; it would be nice to have some messaging along the way to give confidence that it is progressing if `progressr` is not installed. Consider using `httr::progress()` in `hit_national_map_api`. More on that below.

I personally am very persuaded by progressr's motto:

The developer is responsible for providing progress updates but it's only the end user who decides if, when, and how progress should be presented. No exceptions will be allowed.

Letting users control how and why they're updated is the key goal of using progressr, so I'm reluctant to add always-on progress bars (which would trigger three separate times per tile downloaded, I believe) via httr at the moment.
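
For anyone following along, opting in is the standard progressr pattern (a generic sketch, assuming get_tiles signals progress through progressr as the Suggests entry implies):

library(progressr)
handlers("txtprogressbar")   # the user picks how progress is displayed
with_progress(
  tiles <- get_tiles(bbox, tempfile(), services = "elevation")
)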

* It's nice to be able to provide a `tmpfile` prefix, but it would be equally nice to be able to set where the file is downloaded to (other than the tmp directory). This will come up again below with the GDAL discussion.

Try setting output_prefix to any valid path on your system for this. get_tiles will append a string in the format servicename_xtile_ytile.tif to output_prefix for each tile downloaded.
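
For example (hypothetical path; bbox as before):

# tiles are written next to your project data rather than into tempdir(),
# named with the servicename_xtile_ytile.tif pattern described above
tiles <- get_tiles(bbox, output_prefix = "data/adirondack", services = "elevation")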

* Find a way to avoid the `3` at the beginning of your list name for 3DEP; maybe elevation3DEP?

I changed the behavior so that now get_tiles returns a list whose names match the names provided to service; this means that if you provide services = "elevation" the output list will now be named elevation. If a user specifies 3DEPElevation, I think it still makes sense to name that element of the list 3DEPElevation in order to have consistent behavior across endpoints; I do wish the official service name was something else! Relevant commit.

merge_rasters.R

I've adopted your suggested sf/gdal_warp code entirely, and cannot believe how much faster and cleaner it runs now. Thanks in particular for this suggestion! Relevant commit.

raster_to_raw_tiles.R

Use sf::gdal_utils("translate", ...) here instead of gdalUtils/gdalUtilities for the same reasons as above.

There's a comment in this file from about five months ago that I left as a warning to my future self:

https://github.com/mikemahoney218/terrainr/blob/851bcba8819e28084548edb84835bc71ed74bb03/R/raster_to_raw_tiles.R#L78

Unfortunately, this is still true. gdalUtilities is a simple wrapper around sf::gdal_utils (with command formatting more similar to command-line gdal), and so using sf::gdal_utils through the wrapper still causes my R session to hard crash for reasons I'm entirely unaware of. Unfortunately, that has left me dragging in gdalUtils for the time being.

It may be worth your time to look at this: https://github.com/ropensci/tiler

Unfortunately, I've yet to get this package to work on my laptop due to an issue with Python package dependencies.

georeference

The given example does not run because the arguments are in the wrong order. Also, it doesn't seem to need alignment: the input and output are identical.

Serves me right for not naming my arguments! Addressed both issues here.

No need for the tiff package (I think), since raster reads TIFFs already (raster will also read PNG and JPEG, just into RGBA bands).

I had to do a little bit of investigation into why this function works this way, myself! It turns out that while raster::brick is perfectly happy to read in any type of file, it only correctly detects the scale of the bands if the file is entirely read into memory (and assumes 255/8-bit images if not). Given that georeference_overlay has to handle images with unknown scales (for instance, one user of this package edits the tiles in Photoshop) and even the default case (tiles returned by the National Map and vector_to_overlay) is scaled [0, 1], this function has to load the images into memory before creating RasterBricks to avoid all outputs being either black or white squares. I haven't fully investigated if this is true with TIFF images as well (verified for JPEG and PNG), but I believe the current approach is the simplest way to handle varied file types.

utils.R

Check out the units package to handle all these unit conversions, but really, the unit given should match the CRS of the input data. Because of this, I might consider being more strict about the units you allow for a buffering call.

I've swapped over to units for this across the board (except in the case of radians -> degree conversions, where I still use my helper functions -- which are no longer exported -- to make the code a bit shorter/more legible). As someone working in a field (forestry) that still relies upon inches and chains as standard units, I'm reluctant to force people to SI standards.

Notes:

* Readme - you have an 'elev' hanging at the end of the 2nd 'plot' line

Doh! Fixed, thank you.

sfoks commented 3 years ago

Package Review


Provided Review Template

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 10 hours


Additional reviewer questions from guide


Review Comments

ldecicco-USGS commented 3 years ago

Thank you so much @sfoks and @mikejohnson51 !!!!

mikemahoney218 commented 3 years ago

Thank you so much @sfoks ! I've responded to your comments below.

* Within the unity_instructions.Rmd, I think  `merged_outputs <- merge_rasters(raw_tiles$`3DEPElevation`, tempfile(fileext = ".tif"), .$USGSNAIPPlus, tempfile(fileext = ".tif"))` should be changed to: `merged_outputs <- merge_rasters(raw_tiles$`3DEPElevation`, tempfile(fileext = ".tif"), raw_tiles$USGSNAIPPlus, tempfile(fileext = ".tif"))`

Fixed, thank you! Relevant commit.

* In general, I received a lot of warnings with the unity_instructions vignette, several related to the `merge_rasters` function. ex: `Warning 6: gdalbuildvrt does not support heterogeneous band numbers: expected 3, got 4. Skipping /var/folders/zd/0gcdnc8x21177wtw966dqktm0000gn/T//Rtmp1XicTF/file2176d7036b2.tif`; however this didn't seem to influence the generation of output.

This should be resolved thanks to changes made for Mike's review (relevant commit). I've validated that this "works on my machine" without warnings thanks to the new and improved merge_rasters and haven't seen any warnings in CI output yet.

This used to happen when the National Map returned some image tiles with alpha bands and some without, due to a limitation in the workflow we were using; however, whether or not a given tile will have an alpha band seems to change over time (so that running the same query a day later might resolve the issue). I think this resulted in a tile not getting merged into your full raster, resulting in the CPL_gdaltranslate error mentioned below; if I'm right, then I think the commit above solves both issues!

* Other warnings were received upon running the `raster_to_raw_tiles` function, e.g. `Warning 1: -srcwin 0 8194 4097 4097 falls partially outside raster extent. Going on however.` I believe you explain that this is due to National Map API issues in trying to download a certain geographic extent, but wanted to make sure you were aware of it. I also believe the following warnings (also associated with the `raster_to_raw_tiles` function) are related to the same National Map API download issues, or perhaps my instance of GDAL?

  Warning messages:
  1: In CPL_gdaltranslate(source, destination, options, oo,  ... : GDAL Message 1: for band 1, nodata value has been clamped to 0, the original value being out of range.
  2: In CPL_gdaltranslate(source, destination, options, oo,  ... : GDAL Message 1: for band 2, nodata value has been clamped to 0, the original value being out of range.
  3: In CPL_gdaltranslate(source, destination, options, oo,  ... : GDAL Message 1: for band 3, nodata value has been clamped to 0, the original value being out of range.
  4: In CPL_gdaltranslate(source, destination, options, oo,  ... : GDAL Message 1: for band 4, nodata value has been clamped to 0, the original value being out of range.

* Given the warning examples listed above, the raw, xml, and png files still exported to my directory correctly.

Thank you for flagging!

Warning 1: -srcwin 0 8194 4097 4097 falls partially outside raster extent. Going on however:

You're exactly right, this warning is related to the National Map endpoint not exactly respecting the requested bounding boxes. I've made this explicit in the vignette (relevant commit), but to explain here as well:

Since Unity expects to import heightmaps as squares with side lengths that are powers of 2, this function will create tiles that are the right size for Unity; in order to make things easier for the user, those tiles will all be the same size (so users don't need to change sizes for each tile they import). That means that if the input raster isn't exactly divisible by side_length, this function will create tiles with no data past a certain point ("falls partially outside raster extent").
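
To make that concrete with assumed numbers (the 12,000-row input height is hypothetical; 4097 is the tile size from the warning above):

side_length <- 4097
n_rows      <- 12000
n_tiles     <- ceiling(n_rows / side_length) # 3 rows of tiles
(n_tiles - 1) * side_length                  # 8194 -- the offset seen in the warning
n_tiles * side_length - n_rows               # 291 rows of the last tile are no-data padding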

I've left this message as standard output (rather than an official R warning) because I believe it's useful for users to know to expect some no-data regions at the edges of their terrain tiles, but it's standard function behavior (so shouldn't interrupt users who have set options(warn=2) or similar). I'm happy to change this behavior (either provide a message from within terrainr, rather than messages from GDAL, or raise an actual warning) if you think something else makes more sense!

In CPL_gdaltranslate(source, destination, options, oo, ... : GDAL Message 1: for band 1, nodata value has been clamped to 0, the original value being out of range.:

I believe this is about the issue with mixed 3 & 4 band rasters as mentioned above, and should hopefully be fixed given the new merge_rasters code.

* I downloaded Unity to test terrainr output, but could not get my tiles into the software! This is an issue with me never having used Unity, so it's definitely user error. I may try this again at a later date since it seems like such a cool way to visualize this data.

If you're willing, I'd love to hear what went wrong! I would love to make the unity_instructions vignette as comprehensive as possible for future users.

* I think it would be great to present additional examples of data pulls and use in Unity, but of course this is not necessary for package publication. It appears there is a lot that this package can do and providing more examples of functionality to novice users like me would be awesome.

Thank you! I fully hope and intend to keep adding examples to the documentation moving forward.

* Overall I think this package is well put together, thanks for allowing me to review!

Thank you so much for reviewing!

ldecicco-USGS commented 3 years ago

@mikemahoney218 thanks for the detailed response!

@sfoks and @mikejohnson51 Could you please indicate whether you are happy with the authors' response? Template https://devguide.ropensci.org/approval2template.html

sfoks commented 3 years ago

Reviewer Response

Final approval (post-review)

My earlier comment: I downloaded Unity to test terrainr output, but could not get my tiles into the software! This is an issue with me never having used Unity, so it's definitely user error. I may try this again at a later date since it seems like such a cool way to visualize this data.

Author response: If you're willing, I'd love to hear what went wrong! I would love to make the unity_instructions vignette as comprehensive as possible for future users.

Estimated hours spent reviewing: 11 hours total

mikejohnson51 commented 3 years ago

Reviewer Response

Kudos, great work @mikemahoney218! I am happy both as a reviewer and as a future user. The new ggplot tools seem great, and all of the answers were well articulated with strong reasoning for keeping/rejecting changes.

Figuring out the remaining gdal issue would be nice and, if you can provide a reprex, I am happy to take a look. But that is not needed for package acceptance.

Final approval (post-review)

Estimated hours spent reviewing: 6

ldecicco-USGS commented 3 years ago

Approved! Thanks @mikemahoney218 for submitting and @sfoks and @mikejohnson51 for your reviews!

To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We've put together an online book with our best practices and tips; this chapter starts the 3rd section, which covers guidance for after onboarding. Please tell us what could be improved; the corresponding repo is here.

mikemahoney218 commented 3 years ago

Thank you so much all!

I'd like to include both of you as reviewers ("rev" role) in the DESCRIPTION; I see that @sfoks has specifically agreed to this, but would that be alright with you, @mikejohnson51 ?

mikemahoney218 commented 3 years ago

Also, @ldecicco-USGS, I received this error when trying to transfer the repo (screenshot attached: "Screenshot from 2021-02-19 16-33-42").

Do you have any idea what might be going wrong here? Thank you!

ldecicco-USGS commented 3 years ago

Ha sure enough... I think you should have an invitation now. Give it another try...and if THAT doesn't work let me know (this is always the step I manage to mess up).

mikemahoney218 commented 3 years ago

Second time's the charm! Transferred the repo over: https://github.com/ropensci/terrainr Thank you!

mikemahoney218 commented 3 years ago

Alright, I believe I've done everything on that checklist now @ldecicco-USGS ! I haven't seen the docs build yet on docs.ropensci -- is there anything I need to do here other than wait for the repo to get picked up by CI?

I would love to submit a blog for the site! I'll take a look at the guide over the weekend and would love to talk with @stefaniebutland about timing.

Thanks again, everyone!

mikejohnson51 commented 3 years ago

@mikemahoney218 thanks for the offer, that would be great. Congrats on your package!

stefaniebutland commented 3 years ago

Congratulations on passing peer review @mikemahoney218.

When you're ready, please submit a draft post by pull request, following instructions in https://blogguide.ropensci.org/. Then my colleague @steffilazerte will review it and suggest a publication date. Followup questions can move to Slack. Cheers!