hyriver / pygeohydro

A part of HyRiver software stack for accessing hydrology data through web services
https://docs.hyriver.io
Other
68 stars 23 forks source link

`pygeohydro.soil_gnatsgo()` function erroring out #116

Closed sheilasaia closed 5 months ago

sheilasaia commented 8 months ago

What happened?

I was following the soil storage capacity tutorial and was unable to run the pygeohydro.soil_gnatsgo() function without getting errors. My code is below. I'm not sure if this is an issue with my conda environment or something with the gnatsgo database related so I've also provided my environment yaml and the error message. It's uploaded as a text file because I couldn't upload it as a .yaml file.

I'm on Windows 11 with Python 3.12.2. I've defined my PROJ_LIB and PROJ_DATA paths at the start of the script and those should be pointing to the environment created by the yaml. I've tried to update rasterio with mamba with mamba update rasterio, but it's telling me everything is up-to-date.

What did you expect to happen?

I expected to have a similar output for soil thickness as is shown in the soil storage capacity tutorial, but my code is erroring out and I'm not able to access the thickness data from gNATSGO. I've tried other STAC variables as well, but have had no luck with those either.

Minimal Complete Verifiable Example

# get basin
test_basin = pynhd.NLDI().get_basins("11092450")

# get basin wkt string
test_basin_rasterio_wkt = rasterio.crs.CRS.from_wkt(test_basin.crs.to_wkt())

# get basin geom
test_basin_geom = test_basin.geometry["USGS-11092450"]

# get soil properties data (this works fine)
test_soils_data = pygeohydro.soil_properties() # this runs with rasterio warnings but gives result

# mask soil properties data with basin geom
test_soils_data_mask = pygeoutils.xarray_geomask(test_soil_data, test_basin_geom, test_basin_rasterio_wkt)
# i kept getting rasterio errors if i used test_basin.crs.to_wkt() here rather than test_basin_rasterio_wkt

# get soil thickness data
test_thickness_data = pygeohydro.soil_gnatsgo("tk0_999a", test_field_geom, test_basin_rasterio_wkt)
# this has similar rasterio warnings as above but errors out with more rasterio errors

MVCE confirmation

Relevant log output

Error messages:

>>> test_mukey_data = pygeohydro.soil_gnatsgo("tk0_999a", test_field_geom, test_basin_rasterio_wkt)

WARNING:rasterio._env:CPLE_AppDefined in PROJ: proj_create_from_database: SQLite error on SELECT name, type, coordinate_system_auth_name, coordinate_system_code, datum_auth_name, datum_code, area_of_use_auth_name, area_of_use_code, text_definition, deprecated FROM geodetic_crs WHERE auth_name = ? AND code = ?: no such column: area_of_use_auth_name
WARNING:rasterio._env:CPLE_AppDefined in PROJ: proj_create_from_database: SQLite error on SELECT name, ellipsoid_auth_name, ellipsoid_code, prime_meridian_auth_name, prime_meridian_code, area_of_use_auth_name, area_of_use_code, deprecated FROM geodetic_datum WHERE auth_name = ? AND code = ?: no such column: area_of_use_auth_name
Traceback (most recent call last):
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\xarray\backends\file_manager.py", line 211, in _acquire_with_cache_info
    file = self._cache[self._key]
           ~~~~~~~~~~~^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\xarray\backends\lru_cache.py", line 56, in __getitem__
    value = self._cache[key]
            ~~~~~~~~~~~^^^^^
KeyError: [<function open at 0x000001F4F2AAE3E0>, (WindowsPath('cache/005089ad56d76b182f3308ea5dc486455e1b3e28e2af21f4c554edab4c89a04a.tiff'),), 'r', (('sharing', False),), '69f84f8f-772c-4148-8a69-80468c6750b4']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "rasterio\\crs.pyx", line 730, in rasterio.crs.CRS.from_wkt
  File "rasterio\\_err.pyx", line 209, in rasterio._err.exc_wrap_ogrerr
rasterio._err.CPLE_BaseError: OGR Error code 5

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\pygeohydro\pygeohydro.py", line 800, in soil_gnatsgo
    ds = xr.merge((get_layer(lyr) for lyr in lyrs), combine_attrs="drop_conflicts")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\xarray\core\merge.py", line 1015, in merge
    for obj in objects:
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\pygeohydro\pygeohydro.py", line 800, in <genexpr>
    ds = xr.merge((get_layer(lyr) for lyr in lyrs), combine_attrs="drop_conflicts")
                   ^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\pygeohydro\pygeohydro.py", line 791, in get_layer
    ds = xr.merge(_open_tiff(f, lyr) for f in fpaths)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\xarray\core\merge.py", line 1015, in merge
    for obj in objects:
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\pygeohydro\pygeohydro.py", line 791, in <genexpr>
    ds = xr.merge(_open_tiff(f, lyr) for f in fpaths)
                  ^^^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\pygeohydro\pygeohydro.py", line 740, in _open_tiff
    ds = rxr.open_rasterio(file)
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\rioxarray\_io.py", line 1124, in open_rasterio
    riods = manager.acquire()
            ^^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\xarray\backends\file_manager.py", line 193, in acquire
    file, _ = self._acquire_with_cache_info(needs_lock)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\xarray\backends\file_manager.py", line 217, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\rasterio\env.py", line 451, in wrapper
    return f(*args, **kwds)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\sheila.saia\AppData\Local\anaconda3\envs\esmc_env\Lib\site-packages\rasterio\__init__.py", line 304, in open
    dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "rasterio\\_base.pyx", line 331, in rasterio._base.DatasetBase.__init__
  File "rasterio\\_base.pyx", line 350, in rasterio._base.DatasetBase._set_attrs_from_dataset_handle
  File "rasterio\\_base.pyx", line 408, in rasterio._base.DatasetBase.read_crs
  File "rasterio\\_base.pyx", line 385, in rasterio._base.DatasetBase._handle_crswkt
  File "rasterio\\crs.pyx", line 732, in rasterio.crs.CRS.from_wkt
rasterio.errors.CRSError: The WKT could not be parsed. OGR Error code 5

Anything else we need to know?

Thank you for creating the HyRiver python tools and the tutorials! They are very helpful and I'm excited to use them more in my work. πŸ’¦πŸ

Environment

sheila_env_yaml.txt

cheginit commented 8 months ago

I couldn't reproduce your error. This works without any warning or error:

image

Can you please show the output of pygeohydro.show_versions().

sheilasaia commented 8 months ago

Thank you for looking into the error and my sincere appologies for the delayed reply. My output for pygeohydro.show_versions() is below:

>>> pygeohydro.show_versions()

SYS INFO
--------
commit: 2fde6ddc277ae97cfc921e07e8de0b691e84853c
python: 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:42:31) [MSC v.1937 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 11
machine: AMD64
processor: Intel64 Family 6 Model 186 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('English_United States', '1252')

PACKAGE                VERSION
-------------------------------
async-retriever        0.16.0
pygeoogc               0.16.1
pygeoutils             0.16.1
py3dep                 N/A
pynhd                  0.16.2
pygridmet              N/A
pydaymet               N/A
hydrosignatures        0.16.0
pynldas2               N/A
pygeohydro             0.16.0
aiohttp                3.9.3
aiohttp-client-cache   0.11.0
aiosqlite              0.18.0
cytoolz                0.12.2
ujson                  5.9.0
defusedxml             0.7.1
joblib                 1.3.2
multidict              6.0.4
owslib                 0.29.3
pyproj                 3.6.1
requests               2.31.0
requests-cache         1.2.0
shapely                2.0.3
url-normalize          1.4.3
urllib3                2.2.1
yarl                   1.9.3
geopandas              0.14.3
netcdf4                1.6.5
numpy                  1.26.4
rasterio               1.3.9
rioxarray              0.15.1
scipy                  1.12.0
xarray                 2023.6.0
click                  8.1.7
pyflwdir               N/A
networkx               3.2.1
pyarrow                11.0.0
folium                 0.15.1
h5netcdf               1.2.0
matplotlib             3.8.3
pandas                 2.2.0
numba                  0.59.0
bottleneck             N/A
py7zr                  0.20.8
pyogrio                0.7.2
-------------------------------
cheginit commented 8 months ago

I just created a new environment just to make sure, I am using the same version as yours, and it ran without any issue again.

This is how created the env:

micromamba create -n test pygeohydro ipykernel ipywidgets pystac-client planetary-computer
image

Can you try again, maybe at the time you were running your code, there was a server-side issue.

cheginit commented 5 months ago

Since there haven't been any updates on this, and I couldn't reproduce the error, I will close this. You are welcome to re-open if the issue persists.