hyriver / pygeohydro

A part of HyRiver software stack for accessing hydrology data through web services
https://docs.hyriver.io
Other
68 stars 23 forks source link

USGS STN Flood Event Data Retriever #108

Closed fernando-aristizabal closed 1 year ago

fernando-aristizabal commented 1 year ago

@cheginit,

This draft PR is in response to the discussion post in HyRiver.

I've updated to incorporate the suggested comments including:

I've also had to make a number of changes with respect to datums based on conversations with the USGS STN team.

Some more work is required including:

With respect to the URLs, I noticed the class RESTfulURLs in pygeoogc/pygeoogc.py has many RESTful URLs for use across the HyRiver ecosystem. Would this be the best location for the URLs proposed in this PR?

Lastly, any advice or information you have on documentation and testing would be helpful.

TO DO

cheginit commented 1 year ago

Thanks for the PR, appreciate it. Overall, it looks great!

I added a couple of comments.

Regarding the testing, if you run nox -s tests -- -k TestSTNFloodEventData you should be able to get a report of the lines that haven't been tested. The closer to 100% the better, but generally more than 80% should be good.

fernando-aristizabal commented 1 year ago

Hello @cheginit, I've made quite a few changes to this still draft PR especially with respect to testing.

  1. I've fully implemented TestSTNFloodEventData within tests/test_pygeohydro.py. All tests are passing and the command pytest --cov=pygeohydro --cov-report term-missing tests/test_pygeohydro.py::TestSTNFloodEventData reveals 38 passed individual tests with coverage at 99% (pygeohydro/stnfloodevents.py 106 2 48 0 99% 26-27).

    • The only uncovered lines are with respect to a relative import of a custom exception within the exception.py module. I've noticed that other modules within pygeohydro, such as waterdata.py, import this exception simply as from pygeohydro.exceptions import InputValueError. However, when I try it within the package and within stnfloodevents.py it yields an import error.
      ~/projects/pygeohydro/pygeohydro$ python3
      Python 3.11.4 | packaged by conda-forge | (main, Jun 10 2023, 18:08:17) [GCC 12.2.0] on linux
      Type "help", "copyright", "credits" or "license" for more information.
      >>> from pygeohydro.exceptions import InputValueError
      Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/fernandoa/projects/pygeohydro/pygeohydro/pygeohydro.py", line 27, in <module>
      from pygeohydro import helpers
      ImportError: cannot import name 'helpers' from partially initialized module 'pygeohydro' (most likely due to a circular import) (/home/fernandoa/projects/pygeohydro/pygeohydro/pygeohydro.py)
  2. For another naive admission, I've struggled a bit with your suggestion to use nox. I had to install nox first via python3 -m pip install nox then install python3.8 on my system. I'm currently at the following error:

$ nox -s tests -- -k TestSTNFloodEventData
nox > Running session tests-3.8
nox > Creating virtual environment (virtualenv) using python3.8 in .nox/tests-3-8
nox > python -m pip install '.[test,stac]' git+https://github.com/hyriver/async-retriever.git git+https://github.com/hyriver/pygeoogc.git git+https://github.com/hyriver/pygeoutils.git git+https://github.com/hyriver/pynhd.git git+https://github.com/hyriver/hydrosignatures.git
nox > Command python -m pip install '.[test,stac]' git+https://github.com/hyriver/async-retriever.git git+https://github.com/hyriver/pygeoogc.git git+https://github.com/hyriver/pygeoutils.git git+https://github.com/hyriver/pynhd.git git+https://github.com/hyriver/hydrosignatures.git failed with exit code 1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/__main__.py", line 22, in <module>
    from pip._internal.cli.main import main as _main
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/cli/main.py", line 10, in <module>
    from pip._internal.cli.autocompletion import autocomplete
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/cli/autocompletion.py", line 10, in <module>
    from pip._internal.cli.main_parser import create_main_parser
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/cli/main_parser.py", line 9, in <module>
    from pip._internal.build_env import get_runnable_pip
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/build_env.py", line 19, in <module>
    from pip._internal.cli.spinners import open_spinner
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/cli/spinners.py", line 9, in <module>
    from pip._internal.utils.logging import get_indentation
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/utils/logging.py", line 29, in <module>
    from pip._internal.utils.misc import ensure_dir
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/utils/misc.py", line 41, in <module>
    from pip._internal.locations import get_major_minor_version
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/locations/__init__.py", line 66, in <module>
    from . import _distutils
  File "/home/fernandoa/projects/pygeohydro/.nox/tests-3-8/lib/python3.8/site-packages/pip/_internal/locations/_distutils.py", line 20, in <module>
    from distutils.cmd import Command as DistutilsCommand
ModuleNotFoundError: No module named 'distutils.cmd'
nox > Session tests-3.8 failed.
  1. In your previous message, it stated that you "added some comments". For some reason, I'm not seeing those on my end.
  2. My next focus will be to implement some documentation for this specifically the reference API and possibly an example notebook. Any advice on doing this? Additionally, I plan to move some of the URLs to pygeoogc to model some of your architecture.

Thanks for taking the time to review and consider. Hopefully we are getting closer to releasing.

cheginit commented 1 year ago

Fernando,

Thanks for improving the code and adding more tests, they look great to me!

Regarding my comments, if you scroll up here or go to the Files Changed tab, you should be able to see all my comments that I added to specific lines of your codes. For example:

image

I should have provided more info about working with nox, I thought you've used it before. My development workflow for using nox is, first, creating a new env using mamba like this:

mamba create -n py38 python=3.8 nox tomli codespell

Then, I activate the environment and run the nox command:

mamba activate py38
nox -s tests -- -k TestSTNFloodEventData

Using nox prevents issues such as the import error that you're having.

Regarding documentation, including all the required info in the docstrings is enough. Sometimes, when I add a new service, I demonstrate its capabilities through an example Jupyter notebook in the Examples repo. This is optional, those notebooks are a way of communicating the users HyRiver's functionalities. Some examples are elaborate some are very simple, so there's no specific format. All notebooks in the repo will be rendered and shown here.

cheginit commented 1 year ago

Here are the comments that I mentioned earlier:

if as_list:
    return ...
if ...:
    return ...
x_column, ...
return ...
fernando-aristizabal commented 1 year ago

@cheginit

Thanks for all the feedback and for walking me through contributing to HyRiver. I've addressed several of the issues you've raised in addition to other changes as well. Please see my items below:

  • [ ] To be consistent with typing in HyRiver, please add from __future__ import annotations at the top of the file, and use Python 3.10 typing style. So instead of importing Dict, List, Optional, and Union you can use dict, list, | None, and |, in the code, respectively.
  • [ ] For types in docstrings, please use numpy style to make sure it gets rendered correctly for the website. So, for example, instead of Set[str] you use set of str. Please follow the same approach for the rest.
  1. All done.
  • [ ] Instead of urljoin, it'd better to use either a f-string (f"{base_url}/STNServices") or use yarl package if you want to do some URL-specific operations (this package is already a dependency). For yarl, you can simply use the f-string as an input to URL class from yarl. It offers some efficient URL operations such as adding or modifying key-params.
  1. This has been changed to the f-string method where applicable. However, the service_url and data_dictionary_url class attributes are now accessible via pygeoogc so I've removed the url operations for those. Instead I've added the following lines:
# will uncomment when v0.15.1 pygeoogc is released
# service_url = ServiceURL().restful.stnflood
# data_dictionary_url = ServiceURL().restful.stnflood_dd

These lines are commented out because I believe they won't work on the user side since user installs are from package releases on pypi. As the latest pygeoogc release doesn't include the relevant urls I've tentatively kept the local url definition in place. However, these pygeoogc urls do pass nox tests as nox appears to pull dependencies from the latest GitHub commit.

  • [ ] For documentation, you can add an Examples section to docstrings of public methods and functions. It's not necessary, so you don't have to add it, just something to keep in mind. You check this for an example.
  1. Great! Thanks for the tip. I'm currently working on an example notebook in this location. I would like to submit a draft PR on that prior to having this PR accepted. It will give me an opportunity to deep dive the usage and make any changes prior to merging.
  • [ ] When you're using a if-return block (e.g., Lines 356 and 496) there is no need for elif, you can use another if-return block. Also, there's no need for else. So, you will end up with something like:
if as_list:
    return ...
if ...:
    return ...
x_column, ...
return ...
  1. Done

So in summary, these are the following pending items.

Thanks again! Let me know if there is anything else.

cheginit commented 1 year ago

Changes look great to me!

Regarding pygeoogc, you don't have to wait for a new PyPi release. I set up nox in a way that it installs pygeoogc (and all the other HyRiver libraries) from its repo, so technically tests that you run with nox should pass, unless there's something wrong. Even for your own development you can install the latest version from source using pip in a conda env like so pip install --no-deps git+https://github.com/hyriver/pygeoogc.

fernando-aristizabal commented 1 year ago

Thanks for the tip! I've updated the urls to depend on pygeoogc. I'll get some of the examples up in a few days hopefully in a draft PR and go from there.

fernando-aristizabal commented 1 year ago

@cheginit

Hyriver PR #21 includes a draft PR for some examples using the functionality introduced in this PR. Unless you have any major points of feedback requiring a change in this PR, I believe this PR is ready to merge on my end.

cheginit commented 1 year ago

Thanks for your substantial efforts!