Closed Vinterwoo closed 3 years ago
Hi @Vinterwoo. Thanks for the question. To be honest, I've not spent enough time with waterbodies and artificial paths to claim to be an expert here.
I think you will need to do a minor intermediate step between getting your starting ID and navigating upstream. If your starting ID is an artificial path COMID, you need to make sure it is the outlet artificial path.
I can think of a few ways you might go about this but am guessing they are each some sort of hack that could be accomplished more directly.
On page 159 of the NHDPlusV2 manual, I see reference to an attribute: WBAreaComI
described as:
ComID of the NHD polygonal water feature through which a NHD “Artificial Path” flowline flows
So maybe if you take the COMID you get, find all the artifical path flowlines with the same WBAreaComI
, and navigate upstream from the one with the smallest Hydroseq
?
Here’s the problem lake and point…
In this case, the WBAreaComI is “21160241” which you can find with a call to sf::st_join
with your point and the NHDWaterbody layer.
Here’s some code to show how to find that outlet artificial path.
library(nhdplusTools)
library(sf)
library(dplyr)
nhdplus_path("NHDPlusV21_National_Seamless.gdb/")
staged <- stage_national_data()
flines <- readRDS(staged$attributes)
wb <- read_sf(nhdplus_path(), "NHDWaterbody")
start_point <- st_sf(id = 1, geom = st_sfc(st_point(c(-97.82922, 37.72797)), crs = 4269))
start_point <- st_join(start_point, wb)
wbcomid <- start_point$COMID
wb_flines <- filter(flines, .data$WBAREACOMI %in% wbcomid) %>%
group_by(.data$WBAREACOMI) %>%
filter(.data$Hydroseq == min(.data$Hydroseq)) %>%
ungroup()
The returnedwb_flines
has one row containing the flowline you are after. The group_by() here will get you one per waterbody — if you need it to be one per point, you would need to do a little more fiddling but I’ll leave that up to you.
Good luck!
Hello Dave,
Thanks for the reply- that is exactly the kind of help I was looking for.
Your code worked well for my example, and I can plan to apply your suggestion to a few more lakes and see if I run into any more problems. Once I've tested out the code on a few more lakes, I can plan to close out the git issue.
One quick follow up, I've been using "NHDPlusV21_National_Seameless_Flattened_Lower48.gbd" file to read in the NHD plus data. I follow the suggestion to use the "staged_national_data approach" as outlined in your vignette. But one of my main choke points in running this kind of analysis on multiple lakes is the long delays associated with working with this large dataset.
The vignette includes a section on subsetting the dataset using "subset_nhdplus" command, which allows me to subset the database by COMID. But I run into problems if I try to subset the database with a large number of COMIDs (for examples, sub setting the DB for the state of Kansas has 101579 COMIDs). If I run subset_nhdplus with all the COMIDs for Kansas, R seems to hang up (or is taking greater than 20 min to complete).
Is there a better/ more efficient way to subset the NHDplus database by State then my approach?
Many thanks,
Vince
On Sun, Jan 5, 2020 at 11:01 PM David Blodgett notifications@github.com wrote: Hi @Vinterwoo. Thanks for the question. To be honest, I've not spent enough time with waterbodies and artificial paths to claim to be an expert here.
I think you will need to do a minor intermediate step between getting your starting ID and navigating upstream. If your starting ID is an artificial path COMID, you need to make sure it is the outlet artificial path.
I can think of a few ways you might go about this but am guessing they are each some sort of hack that could be accomplished more directly.
On page 159 of the NHDPlusV2 manual, I see reference to an attribute: WBAreaComI described as:
ComID of the NHD polygonal water feature through which a NHD “Artificial Path” flowline flows
So maybe if you take the COMID you get, find all the artifical path flowlines with the same WBAreaComI, and navigate upstream from the one with the smallest Hydroseq?
Here’s the problem lake and point…
In this case, the WBAreaComI is “21160241” which you can find with a call to sf::st_join with your point and the NHDWaterbody layer.
Here’s some code to show how to find that outlet artificial path.
library(nhdplusTools)
library(sf)
library(dplyr)
nhdplus_path("NHDPlusV21_National_Seamless.gdb/")
staged <- stage_national_data()
flines <- readRDS(staged$attributes)
wb <- read_sf(nhdplus_path(), "NHDWaterbody")
start_point <- st_sf(id = 1, geom = st_sfc(st_point(c(-97.82922, 37.72797)), crs = 4269))
start_point <- st_join(start_point, wb)
wbcomid <- start_point$COMID
wb_flines <- filter(flines, .data$WBAREACOMI %in% wbcomid) %>%
group_by(.data$WBAREACOMI) %>%
filter(.data$Hydroseq == min(.data$Hydroseq)) %>%
ungroup() The returned wb_flines has one row containing the flowline you are after. The group_by() here will get you one per waterbody — if you need it to be one per point, you would need to do a little more fiddling but I’ll leave that up to you.
Good luck!
Dave — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.
What command are you using when calling "subset_nhdplus()"?
Apologies- I didn't realize my email was going straight to git in my last message. Since it's a different question, I'll move the above comment to a different issue.
To continue with my original question. I've been running the above code you suggested with different lakes and getting some odd behaviors. Here are a few instances where I was not able to get a full watershed:
Lat = 39.70083 Lon = -96.15000 -> Not able to find Lake Lat = 39.03918 Lon = -95.69084 -> Not able to find Lake Lat = 38.37081 Lon = -98.79521 -> Lake found, but no matching water body area ID Lat = 39.25423 Lon = -96.58229 -> Lake found, but no matching water body area ID Lat = 37.65870 Lon = -96.09253 -> Only portion of water shed included Lat = 38.50599 Lon = -95.70823 -> Only portion of water shed included Lat = 39.80482 Lon = -99.93915 -> Only portion of water shed included
Note that many of the gps points I'm starting with are near shore. I do use a incremental buffering up to 200 meters when I'm attempting to match gps points to my lake just in case the gps points are just outside of the lake polygon.
I could speculate here but would need to pop up a map of each point and the layers involved to diagnose. My recommendation would be to try that to see why the st_join isn't working. There are a few cases I could imagine would cause issues that you'll need to handle.
I'd be game to write a function that could take a point and return the nearest on network waterbody outlet if you can provide me a list of points. That would fit with this issue pretty well. https://github.com/USGS-R/nhdplusTools/issues/19
Is this what you need? The csv contains Lat and Lon points for a group of lakes in Kansas. All of the points are near the shore, but are located within small to medium sized lakes. I also made a note of how successful I was in calculating a full watershed for each lake.
Yes, thanks. Not going to get to this immediately, but I'll leave this issue open and come back to it as time allows.
With @mhweber's contribution of get_wb_outlet()
we have a path to doing this.
Here's a reprex that demonstrates where we are now.
library(nhdplusTools)
#> USGS Support Package: https://owi.usgs.gov/R/packages.html#support
library(sf)
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1
start_point <- st_sf(id = 1,
geom = st_sfc(st_point(c(-97.82922,
37.72797)),
crs = 4269))
sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
domain <- st_buffer(st_as_sfc(st_bbox(start_point)), .1)
#> Warning in st_buffer.sfc(st_as_sfc(st_bbox(start_point)), 0.1): st_buffer does
#> not correctly buffer longitude/latitude data
#> dist is assumed to be in decimal degrees (arc_degrees).
# grab some sample data and plot the domain
nhd <- plot_nhdplus(bbox = st_bbox(domain))
# Also plot the start point.
plot(st_geometry(st_transform(start_point, 3857)),
add = TRUE, col = "red", cex = 2)
# Just do a intersects join here to get the COMID we need
start_point <- st_join(start_point, nhd$network_wtbd)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
# This is the new function to give you the waterbody outlet.
wb_out <- get_wb_outlet(lake_COMID = start_point$COMID,
network = nhd$flowline)
# Now we can plot up the whole watershed contributing to this
# lake outlet.
plot_nhdplus(outlets = list(wb_out$comid), streamorder = 2)
# Zooming back to tbe bbox domain, we can look at the
# outlet vs our original point
plot_nhdplus(bbox = st_bbox(domain))
plot(st_geometry(st_transform(wb_out, 3857)), col = "red", lwd = 4, add = TRUE)
plot(st_geometry(st_transform(start_point, 3857)),
add = TRUE, col = "red", cex = 2)
Created on 2021-10-28 by the reprex package (v2.0.0)
I'm hoping to calculate watersheds for individual water bodies. I've had some success following this tutorial:
https://usgs-r.github.io/nhdplusTools/articles/nhdplusTools.html
Using the "subset$CatchmentSP" info gives me the upstream catchment for that spot. But I'm running into problems with other gps point For Instance:
The catchment from the above point gives me only a very small aspect of the lake watershed, but I'm looking for the catchment of the entire lake. I'm assuming the issue has something to do with the fact that the point is not near the pour point of the lake, or near a virtual flow line off the main flow line of the lake.
Using nhdplusTools, is there any way to calculate the catchment for an entire water body? Either by using a polygon for the lake, or linking to the ID of the lake?
If not, and I have to work with single points to calculate a catchment as above, is there a way find the furthest downstream virtual flow line within the lake boundary that would return a catchment for the entire lake in question?