hyriver / pynhd

A part of HyRiver software stack that provides access to NHD+ V2 data through NLDI and WaterData web services
https://docs.hyriver.io
Other
33 stars 8 forks source link

Querying gages with WaterData('gagesii') within geometry of huc02 #67

Closed msleckman closed 1 year ago

msleckman commented 1 year ago

What happened?

I've pulled huc02 and huc4 from the USGS WBD using pynhd's WaterData() function.

I am trying to pull gages ('gagesii') within those boundary polygons (using WaterData('gagesii').by_geom()). I got an error when attempting to pull gages for huc2. Gages returned successfully for the huc04 polygons.

import geopandas as gpd
import pynhd

polygon1_huc2 = gpd.read_file('tmp_huc2_polygon_1.gpkg')

print(f'Is polygon1_huc2 geometry valid:  {polygon1_huc2.is_valid}')

WaterData('gagesii').bygeom(polygon1_huc2.reset_index().geometry[0])

image

----


Examples gpkgs: tmp_huc_polygons.zip

What did you expect to happen?

I tested this with a huc 4 polygon and I did not run into this error. This is the expected output.

import geopandas as gpd

polygon1_huc4 = gpd.read_file('tmp_huc4_polygon_1.gpkg')

print(f'Is polygon1_huc4 geometry valid:  {polygon1_huc4.is_valid}')

WaterData('gagesii').bygeom(polygon1_huc4.reset_index().geometry[0])

image

Minimal Complete Verifiable Example

import geopandas as gpd
import pynhd

polygon1_huc4 = gpd.read_file('tmp_huc4_polygon_1.gpkg')
print(f'Is polygon1_huc4 geometry valid:  {polygon1_huc4.is_valid}')
gages_polygon1_huc4 = WaterData('gagesii').bygeom(polygon1_huc4.reset_index().geometry[0])
print(gages_polygon1_huc4.head(3))

print('----'*60)

polygon2_huc4 = gpd.read_file('tmp_huc4_polygon_2.gpkg')
print(f'Is polygon1_huc4 geometry valid:  {polygon2_huc4.is_valid}')
gages_polygon2_huc4 = WaterData('gagesii').bygeom(polygon2_huc4.reset_index().geometry[0])
print(gages_polygon2_huc4.head(3))

print('----'*60)

polygon1_huc2 = gpd.read_file('tmp_huc2_polygon_1.gpkg')
print(f'Is polygon1_huc2 geometry valid:  {polygon1_huc2.is_valid}')
gages_polygon1_huc2 = WaterData('gagesii').bygeom(polygon1_huc2.reset_index().geometry[0])
print(gages_polygon1_huc2.head(3))

print('----'*60)

polygon2_huc2 = gpd.read_file('tmp_huc2_polygon_2.gpkg')
print(f'Is polygon2_huc2 geometry valid:  {polygon2_huc2.is_valid}')
gages_polygon2_huc2 = WaterData('gagesii').bygeom(polygon2_huc2.reset_index().geometry[0])
print(gages_polygon1_huc2.head(3))

MVCE confirmation

Relevant log output

No response

Anything else we need to know?

I've attached 4 example polygons (2x HUC2, 2x HUC4) in the attached zip file (first text box) to test this bug. After downloading those example gpkg files, they can be read in and tested using the code chunk in the MVCE box. 2 polygons per huc level are provided to test and better demonstrate the error.

Environment

SYS INFO -------- commit: 1eaafd7764ff02f2e819a6b9b42a6814b209cee8 python: 3.11.0 | packaged by conda-forge | (main, Jan 15 2023, 05:44:48) [Clang 14.0.6 ] python-bits: 64 OS: Darwin OS-release: 21.6.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.12.2 libnetcdf: 4.9.1 PACKAGE VERSION ------------------------------- aiodns N/A aiohttp 3.8.4 aiohttp-client-cache 0.8.1 aiosqlite 0.19.0 async-retriever 0.15.0 bottleneck 1.3.7 ... xarray 2023.4.2 xdist N/A yaml N/A -------------------------------
cheginit commented 1 year ago

Thanks for reporting the issue. I will look into it and let you know.

cheginit commented 1 year ago

The issue is that the geometry of the HUC2 that you're passing to the function is too complex so when it converts the geometry to string the web service cannot handle it. A simple solution is to simplify the geometry before passing it to the function. For example:

geom = polygon1_huc2.reset_index().geometry[0].simplify(1e-3)

Then pass it to the function.

cheginit commented 1 year ago

Please reopen if you still have this issue.