USGS-R / drb-inland-salinity-ml

Code repo for Delaware River Basin machine learning models that predict inland salinity.
Creative Commons Zero v1.0 Universal
3 stars 4 forks source link

Multiple issues with `p1_catchments_sf` #54

Closed msleckman closed 2 years ago

msleckman commented 2 years ago

Currently, there seems to be an issue in targets p1_catchments_sf which has more unique values compared to reaches.

tar_target(p1_catchments_sf,
    {st_read(dsn=p1_catchments_shp, layer="nhru",quiet=TRUE) %>% filter(hru_segment %in% p1_reaches_sf$subsegseg)})

where p1_catchments_shp = "1_fetch/out/GeospatialFabricFeatures_02.gdb" downloaded from https://www.sciencebase.gov/catalog/item/535eda80e4b08e65d60fc834 under file geodatabases

image
msleckman commented 2 years ago

@jds485 - are we okay using Geospatial Fabric NHD data 1.0 (linked above) ? It looks like their is a version that succeeds it, version 1.1 --> https://www.sciencebase.gov/catalog/item/5e29d1a0e4b0a79317cf7f63 Confirming as this could alter how we process the catchment polygons.

jsadler2 commented 2 years ago

@msleckman - I was getting mixed up here. It does make sense to have many more hru's compared to segments.Most of the segments have two hru's (one on either side). Most of the headwater catchments have only one.

msleckman commented 2 years ago

@jds485 - are we okay using Geospatial Fabric NHD data 1.0 (linked above) ? It looks like their is a version that succeeds it, version 1.1 --> https://www.sciencebase.gov/catalog/item/5e29d1a0e4b0a79317cf7f63 Confirming as this could alter how we process the catchment polygons.

@jds485 Took a loot at the differences between the gbd in GFv1.0 and GFv1.1.

1 - the polygons of the nhru in gfv1.0 are identical to gf1.1 so we can use either 2 - Less clear attribute match with GFv.1.1 sf object to the study stream reaches we are using. GFv1.0 attributes hru_segment_orig and hru_segment match the study stream sf object attributes subsegseg.

In all - suggest keeping with GF1.0

Cursory comparison via qgis: below are screenshots comparing the attributes of the same adjacent polygon in GFv1.1 (nhru_1_1) and GFv1.0 (nhru) at the studystream reach segID number 57_1:

image image

Reach attributes:

image
msleckman commented 2 years ago

Which spatial file type should we use? Is one preferred to another? Would that help with getting a correct join between reaches and catchments? (See colnames in screenshot below of nhru_02.gdb, nhru_02.shp, p1_reaches_sf (study_stream_reaches)

Looking at the GFv1.0 gdb vs. GFv1.1 shpfiles, the invalid geometries are the same in both. I thought possibly that this was mainly in gdb because I ran into geometries issues with gridmet with gdb and not with shp format. Likely a different bug.

Since gdb is used in the 456 river reaches we are using, we can stuck with gdb.

Will have to make sure geometries get fixed in the pipeline.

jds485 commented 2 years ago

Will have to make sure geometries get fixed in the pipeline.

I viewed the HRUs in QGIS and ran the Check validity function. Here is the result: image

Zooming in on the error segments, it looks like the PRMS HRUS were generated from a raster, so there are cases where a segment of the polygon is isolated from the rest of the polygon. image

I think these would have to be manually edited to remove the red vertex (or maybe there is a function for this?). The Fix geometries function made the geometries valid, but the isolation problem persists. I'd expect the isolation problem is the cause of error messages

msleckman commented 2 years ago

Super clear. Issue #60 created to fix geometries in drb catchments gdb (p1_catchments_shp).

lekoenig commented 2 years ago

@jds485 Do you know which HRU is represented in your invalid geometry example above (e.g. hru_id or hru_segment)?

jds485 commented 2 years ago

Here are the hru_id and hru_segment columns:

<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40">

1112 | 0 -- | -- 1505 | 0 2898 | 0 4724 | 30 3576 | 35 3463 | 42 3231 | 59 3244 | 59 4661 | 62 3250 | 63 3249 | 64 3266 | 68 3345 | 69 2553 | 77 3921 | 131 3892 | 138 3880 | 143 3839 | 148 4549 | 152 4550 | 164 3785 | 166 3735 | 168 3938 | 202 3906 | 204 3893 | 206 3885 | 207 3664 | 220 4717 | 240 3508 | 241 4700 | 264 3562 | 275 3356 | 284 4668 | 329 3173 | 333 3221 | 341 4656 | 343 4669 | 355 3096 | 356 3097 | 372 4635 | 374 3063 | 395 3083 | 574 3053 | 582 3081 | 584 2964 | 590 2935 | 601 3037 | 603 3002 | 605 1810 | 767 2454 | 828 4572 | 843 3185 | 859 3179 | 867 3215 | 867 3225 | 868 3267 | 868 3213 | 879 3172 | 881 3163 | 884 3125 | 889 3129 | 895 3186 | 902 3087 | 904 4774 | 1058 4541 | 1253 2846 | 1263 2493 | 1510 2494 | 1513 2495 | 1519 2530 | 2030 4630 | 2082 2981 | 2084 4677 | 2108 4633 | 2125 2977 | 2129 2920 | 2137 2912 | 2139 4706 | 2206 3248 | 2242 3256 | 2250 2892 | 2751

lekoenig commented 2 years ago

I think we want to keep #60 open until we create a new target that contains the NHD catchment polygons for a subset of the segments that have been split (but that the current HRU's don't account for). I'm wondering if this issue can be closed though? @msleckman

msleckman commented 2 years ago

@lekoenig Do we have an issue to address the missing PRMS catchments (segments with no catchments)? this regards the solution discussed today about taking the hru catchments adjacent to the PRMS segments that do not have PRMS catchments.If we do not have an issue, can it be added in this issue thread?

The original bugs raised in this issue are resolved now - they were specific to the geometry of the target p1_catchments_sf, which we fix and validate with buffer. But since the issue title is broad enough, we can maybe keep it open until we address the missing catchments. Either option work.

lekoenig commented 2 years ago

Do we have an issue to address the missing PRMS catchments (segments with no catchments)?

@msleckman The remaining steps are addressed in another open issue (#60), so I suggest we close this one if that works for you.