get_split_catchment not delineating entire catchment area above pour point

spencer-tassone commented 1 month ago

I am trying to delineate a catchment basin from a provided pour point using the function get_split_catchment(). The function correctly delineates the COMID associated with the pour point however, it seems to stop running when it runs into the next upstream COMID. The ideal output would be a delineated basin for the entire upstream area of the provided pour point.

Example

library(tidyverse)
library(nhdplusTools)
library(sf)

# 1. Define the starting point (pour point) ----
start_point <- st_sfc(st_point(c(-78.3361152, 38.9767759)), crs = 4269)

# Snap point to the nearest NHDPlus flowline
trace <- get_raindrop_trace(start_point)

# Use the intersection point as the pour point
snap_point <- st_sfc(st_point(trace$intersection_point[[1]]),
                         crs = 4326)

# 2. Retrieve NHDPlus COMID for the snapped point ----
start_comid <- discover_nhdplus_id(snap_point)

# 3. Navigate and extract upstream tributary flowlines ----
flowline <- navigate_nldi(list(featureSource = "comid", 
                               featureID = start_comid), 
                          mode = "upstreamTributaries", 
                          distance_km = 1000)

# 4. Download and subset NHDPlus data for the flowlines ----
subset_file <- tempfile(fileext = ".gpkg")
subset <- subset_nhdplus(comids = as.integer(flowline$UT$nhdplus_comid),
                         output_file = subset_file,
                         nhdplus_data = "download", 
                         flowline_only = FALSE,
                         return_data = TRUE, overwrite = TRUE)
#> All intersections performed in latitude/longitude.
#> Reading NHDFlowline_Network
#> Writing NHDFlowline_Network
#> Reading CatchmentSP
#> Writing CatchmentSP
flowline <- subset$NHDFlowline_Network

# Define the NLDI feature list for the starting COMID ----
nldi_nwis <- list(featureSource = "comid", featureID = start_comid)

# 5. Delineate the basin for the snap point ----
basin <- get_split_catchment(snap_point, upstream = TRUE)
basin <- basin[2,3]

# 6. Visualize watershed ----
ggplot() +
  geom_sf(data = flowline, color = "blue", size = 1) +
  geom_sf(data = basin, fill = NA, color = "black", size = 1) +
  geom_sf(data = start_point, color = "red", size = 3, shape = 21, fill = "red") +
  theme_bw() +
  theme(axis.text = element_text(size = 14, color = 'black'))

^{Created on 2024-07-31 with reprex v2.1.1}

sessionInfo()
#> R version 4.4.1 (2024-06-14 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22631)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.utf8 
#> [2] LC_CTYPE=English_United States.utf8   
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.utf8    
#> 
#> time zone: America/New_York
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.36     fastmap_1.2.0     xfun_0.45         glue_1.7.0       
#>  [5] knitr_1.48        htmltools_0.5.8.1 rmarkdown_2.27    lifecycle_1.0.4  
#>  [9] cli_3.6.3         reprex_2.1.1      withr_3.0.0       compiler_4.4.1   
#> [13] rstudioapi_0.16.0 tools_4.4.1       evaluate_0.24.0   yaml_2.3.9       
#> [17] rlang_1.1.4       fs_1.6.4

^{Created on 2024-07-31 with reprex v2.1.1}

dblodgett-usgs commented 1 month ago

Try calling https://doi-usgs.github.io/nhdplusTools/reference/get_raindrop_trace.html first and using the returned intersection point as your snap point.

I updated documentation in a recent PR but need to role it out to the pkgdown site and CRAN still.

spencer-tassone commented 1 month ago

I believe that is what I am doing in part 1 of my code above

# Snap point to the nearest NHDPlus flowline
trace <- get_raindrop_trace(start_point)

# Use the intersection point as the pour point 
snap_point <- st_sfc(st_point(trace$intersection_point[[1]]),
                         crs = 4326)

dblodgett-usgs commented 1 month ago

Oh, sorry, I missed that. Looking now and this seems odd. A quick look at what's going on has me wondering if this is something strange with the web service we call for these split catchments.

Does this work anywhere else you have tried? Testing the function demo it is working as expected. I'll try and spend a bit more time looking at this later on.

spencer-tassone commented 1 month ago

It seems to work ok at an upstream pour point except now the delineated basin includes some discontinuous polygons/lines in the area outside of the basin (see the upper-right area of figure below).

library(tidyverse)
library(nhdplusTools)
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE

# 1. Define the starting point (pour point) ----
start_point <- st_sfc(st_point(c(-78.6389042, 38.74567048)), crs = 4326)

# Snap point to the nearest NHDPlus flowline
trace <- get_raindrop_trace(start_point)

# Use the intersection point as the pour point
snap_point <- st_sfc(st_point(trace$intersection_point[[1]]),
                         crs = 4326)

# 2. Retrieve NHDPlus COMID for the snapped point ----
start_comid <- discover_nhdplus_id(snap_point)

# 3. Navigate and extract upstream tributary flowlines ----
flowline <- navigate_nldi(list(featureSource = "comid", 
                               featureID = start_comid), 
                          mode = "upstreamTributaries", 
                          distance_km = 1000)

# 4. Download and subset NHDPlus data for the flowlines ----
subset_file <- tempfile(fileext = ".gpkg")
subset <- subset_nhdplus(comids = as.integer(flowline$UT$nhdplus_comid),
                         output_file = subset_file,
                         nhdplus_data = "download", 
                         flowline_only = FALSE,
                         return_data = TRUE, overwrite = TRUE)
#> All intersections performed in latitude/longitude.
#> Reading NHDFlowline_Network
#> Writing NHDFlowline_Network
#> Reading CatchmentSP
#> Writing CatchmentSP
flowline <- subset$NHDFlowline_Network

# Define the NLDI feature list for the starting COMID ----
nldi_nwis <- list(featureSource = "comid", featureID = start_comid)

# 5. Delineate the basin for the snap point ----
basin <- get_split_catchment(snap_point, upstream = TRUE)
basin <- basin[2,3]

# 6. Visualize watershed ----
ggplot() +
  geom_sf(data = flowline, color = "blue", size = 1) +
  geom_sf(data = basin, fill = NA, color = "red", size = 1) +
  geom_sf(data = start_point, color = "green", size = 3, shape = 21, fill = "green") +
  theme_bw() +
  theme(axis.text = element_text(size = 14, color = 'black'))

^{Created on 2024-07-31 with reprex v2.1.1}

dblodgett-usgs commented 1 month ago

Hi -- ok, I chatted with the developer of the service and he confirmed that this is a known issue that happens in rare cases. The issue is that the point you got on the flowline doesn't precisely coincide with a raster cell along the river pathway.

Unfortunately, the solution is to try a slightly different lat/lon position so it hits a cell that is on the river.

dblodgett-usgs commented 1 month ago

Yeah, so if I move your point to start_point <- st_sfc(st_point(c(-78.338, 38.979)), crs = 4269)

I get

spencer-tassone commented 1 month ago

Yes - I get a similar result when using a slightly different lat/long. However, I am still seeing the random geometries near the mouth of the basin (upper-right). Should I close out this question and open a new one to address this new issue? Rplot

dblodgett-usgs commented 1 month ago

No... that issue is also known. You need to explode the geometry and only keep the largest one if you want to fix the issue. The "clean_geometry" function in https://github.com/DOI-USGS/hyRefactor/blob/main/R/catchment_geometry.R will do it or you can craft your own pretty easily based on code in that function.

I'm talking to Anders who developed https://code.usgs.gov/wma/nhgf/toolsteam/nldi-flowtools (the code for the service in question) about fixing it on the service side.

dblodgett-usgs commented 1 month ago

OK, tracked this down from an email.

basin_id <- paste0('USGS-', c('01435000','01434498', '01434025','01434021', '01434017'))

nhd <- lapply(basin_id, function(x){

  get_nldi_basin(list(featureSource = "nwissite",

                                    featureID = x), simplify = FALSE, split = TRUE) %>%

    ms_explode() %>%

    rbind() %>%

    dplyr::mutate(Area = st_area(geometry) ) %>%

    dplyr::filter(Area == max(Area))

})

basin  <- do.call(rbind, nhd)

gage <- lapply(basin_id, function(x){

  get_nldi_feature(list(featureSource = "nwissite",

                                    featureID = x))

})

gage <-  do.call(rbind, gage)

plot(st_geometry(basin))

plot(st_geometry(gage), col = "red", add = TRUE, pch = 16)

dblodgett-usgs commented 1 month ago

Leave this issue open. I'll add a little clean up code to the service call to try and take care of the splinters.

dblodgett-usgs commented 1 month ago

I incorporated a little clean up code and now see:

library(nhdplusTools)
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE


start_point <- st_sfc(st_point(c(-78.338, 38.979)), crs = 4269)

trace <- get_raindrop_trace(start_point)

snap_point <- st_sfc(st_point(trace$intersection_point[[1]]),
                     crs = 4326)

basin <- get_split_catchment(snap_point, upstream = TRUE)

plot(basin$geometry[2])

^{Created on 2024-08-01 with reprex v2.1.1}

DOI-USGS / nhdplusTools

get_split_catchment not delineating entire catchment area above pour point #406

Example