hyriver / async-retriever

A part of HyRiver software stack for asynchronous requests with persistent caching
https://docs.hyriver.io

IPython crashes on call to `retrieve` #50

Open wmcaliley-usgs opened 4 days ago

wmcaliley-usgs commented 4 days ago

What happened?

I've been trying to use pynhd to download HUC 8 data from IPython, but IPython crashes with the message "RuntimeError: Event loop is closed". I only experience this bug in IPython - no problem at the Python prompt, or Jupyter, or even using ipdb within IPython.

What did you expect to happen?

I expected HUC 8 data to be downloaded without crashing IPython.

Minimal Complete Verifiable Example

# Run from IPython
from async_retriever import retrieve_text

urls = ['https://labs.waterdata.usgs.gov/geoserver/wmadata/ows']
payload = {
    'service': 'wfs',
    'version': '2.0.0',
    'outputFormat': 'text/xml',
    'request': 'GetFeature',
    'typeName': 'wmadata:huc08',
    'bbox': '45.947037,-115.06538,47.572536,-112.692334,EPSG:4326',
    'srsName': 'EPSG:4269',
    'resultType': 'hits',
}

resp = retrieve_text(urls, [{"params": payload}])

Relevant log output

Error in sys.excepthook:
Traceback (most recent call last):
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/IPython/core/application.py", line 288, in excepthook
    return self.crash_handler(etype, evalue, tb)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/IPython/core/crashhandler.py", line 163, in __call__
    if rptdir is None or not Path.is_dir(rptdir):
                             ^^^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/pathlib.py", line 876, in is_dir
    return S_ISDIR(self.stat().st_mode)
                   ^^^^^^^^^
AttributeError: 'str' object has no attribute 'stat'

Original exception was:
Traceback (most recent call last):
  File "/Users/wmcaliley/miniforge3/envs/conus404/bin/ipython", line 10, in <module>
    sys.exit(start_ipython())
             ^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/IPython/__init__.py", line 130, in start_ipython
    return launch_new_instance(argv=argv, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/traitlets/config/application.py", line 1075, in launch_instance
    app.start()
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/IPython/terminal/ipapp.py", line 317, in start
    self.shell.mainloop()
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/IPython/terminal/interactiveshell.py", line 917, in mainloop
    self.interact()
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/IPython/terminal/interactiveshell.py", line 902, in interact
    code = self.prompt_for_code()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/IPython/terminal/interactiveshell.py", line 845, in prompt_for_code
    text = self.pt_app.prompt(
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/prompt_toolkit/shortcuts/prompt.py", line 1035, in prompt
    return self.app.run(
           ^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/prompt_toolkit/application/application.py", line 1002, in run
    return asyncio.run(coro)
           ^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/site-packages/nest_asyncio.py", line 28, in run
    task = asyncio.ensure_future(main)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/asyncio/tasks.py", line 685, in ensure_future
    return loop.create_task(coro_or_future)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/asyncio/base_events.py", line 434, in create_task
    self._check_closed()
  File "/Users/wmcaliley/miniforge3/envs/conus404/lib/python3.12/asyncio/base_events.py", line 519, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

Anything else we need to know?

Here's another minimal example that better reflects my use case. Again, it's only a problem in IPython.

from pynhd import WaterData
feature = WaterData('huc08').bybox((-115.065380, 45.947037, -112.692334, 47.572536))

I do NOT understand how asyncio or event loops work, but I think that IPython crashes upon loop.close() here, meaning that new_loop is True. Could the problem be that a new loop is created in async_retriever._utils.get_event_loop() when it shouldn't be?
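The error message itself can be reproduced outside IPython, which may help in reasoning about the hypothesis above. The following is a minimal sketch using only the standard library, not async-retriever's actual code; the function name `closed_loop_error` and the `noop` coroutine are hypothetical. It shows that scheduling a task on a loop after `loop.close()` raises the same `RuntimeError` seen in the traceback.

```python
import asyncio


def closed_loop_error() -> str:
    """Schedule a task on a closed loop and return the resulting error message."""
    loop = asyncio.new_event_loop()
    loop.run_until_complete(asyncio.sleep(0))  # use the loop once
    loop.close()  # from here on, the loop rejects new work

    async def noop() -> None:
        return None

    coro = noop()
    try:
        # create_task checks whether the loop is closed before scheduling
        loop.create_task(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


print(closed_loop_error())  # prints "Event loop is closed"
```

If a library closes a loop that the surrounding shell still intends to reuse, the shell's next attempt to schedule work on it would fail in this way.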

Environment

SYS INFO
--------
commit: None
python: 3.12.7 | packaged by conda-forge | (main, Oct 4 2024, 15:57:01) [Clang 17.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')

PACKAGE VERSION
-------------------------------
async-retriever 0.18.0
pygeoogc N/A
pygeoutils N/A
py3dep N/A
pynhd N/A
pygridmet N/A
pydaymet N/A
hydrosignatures N/A
pynldas2 N/A
pygeohydro N/A
aiohttp 3.10.10
aiohttp-client-cache 0.12.4
aiosqlite 0.20.0
cytoolz 1.0.0
ujson 5.10.0
defusedxml N/A
joblib N/A
multidict 6.1.0
owslib N/A
pyproj N/A
requests N/A
requests-cache N/A
shapely N/A
url-normalize 1.4.3
urllib3 N/A
yarl 1.17.1
geopandas N/A
netcdf4 N/A
numpy N/A
rasterio N/A
rioxarray N/A
scipy N/A
xarray N/A
click N/A
pyflwdir N/A
networkx N/A
pyarrow N/A
folium N/A
h5netcdf N/A
matplotlib N/A
pandas N/A
numba N/A
bottleneck N/A
py7zr N/A
pyogrio N/A
-------------------------------

cheginit commented 4 days ago

Thanks for reporting the issue! I just committed a fix. Please give it a try and let me know if this works by installing async-retriever from Git. I will release a new version soon.

The issue only occurs when running this in IPython, so notebooks are not affected. It appears that IPython does not work with the nest-asyncio package. I also realized that the maintainer of nest-asyncio sadly passed away, so the package is no longer maintained.
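For readers wondering how a synchronous API can run coroutines without nest-asyncio: one common approach is to keep a private event loop running in a background thread and submit coroutines to it. The sketch below illustrates that general technique with the standard library only. It is not necessarily what the committed fix does, and `BackgroundLoop`, `fetch_stub`, and `call_from_running_loop` are hypothetical names.

```python
import asyncio
import threading
from collections.abc import Coroutine
from typing import Any, TypeVar

T = TypeVar("T")


class BackgroundLoop:
    """Hypothetical helper that owns a private event loop in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro: Coroutine[Any, Any, T]) -> T:
        # Submit the coroutine to the private loop and block until it finishes.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result()

    def close(self) -> None:
        # Stop the private loop from its own thread, then release it.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fetch_stub(value: int) -> int:
    """Stand-in for a real network request."""
    await asyncio.sleep(0)
    return value * 2


runner = BackgroundLoop()

# Called from plain synchronous code.
result = runner.run(fetch_stub(21))


async def call_from_running_loop() -> int:
    # Called while another event loop is already running in this thread,
    # which is the situation in IPython and Jupyter.
    return runner.run(fetch_stub(21))


nested = asyncio.run(call_from_running_loop())
runner.close()
print(result, nested)  # prints "42 42"
```

Because the private loop lives in its own thread, the caller's loop is never patched or closed. The trade-off is that `run` blocks the calling thread until the coroutine completes.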

wmcaliley-usgs commented 4 days ago

Looks great! I tested both the MCVE that only uses async_retriever and the shorter pynhd minimal example in IPython, and both work now with your fix. I also tested the pynhd minimal example at the Python prompt and in JupyterLab, Jupyter console, and Jupyter Qt console. All good. Thanks so much!

cheginit commented 4 days ago

Awesome! Thanks for testing it comprehensively! I also tested on Google Colab and it worked.