Scripts to explore and process global hydrography (stream lines and basin boundaries) for Model My Watershed.
TDX-Hydro is the best available global hydrographic data suite, first released to the public in the summer of 2023 by the US National Geospatial-Intelligence Agency (NGA) in collaboration with USACE ERDC and NASA, and derived from the 12 m resolution TanDEM-X elevation data.
The GEOGLOWS ECMWF Streamflow Model project is building its v2.0 release around a modified version of TDX-Hydro with added attributes (i.e. topological order) and slightly simplified headwater streamlines for improved modeling and mapping. The GEOGLOWS v2 Data Guide provides useful information and tutorials relevant to using TDX-Hydro data.
TDX-Hydro was built around HydroSHEDS v1 HydroBASINS Level 2 boundaries (continental sub-units). HydroSHEDS v2 will be developed from the same TanDEM-X elevation data used by TDX-Hydro.
Project Objectives: Develop Model My Watershed hydrographic capabilities over most of the world to:
Objectives for this repo:
This repo is still under development and has not yet been packaged for widespread use.
Follow these steps to install using the conda package manager.
We recommend installing the lightweight Miniconda distribution, which includes Python, the conda environment and package management system, and their dependencies.
If you have already installed the Anaconda Distribution, you can use it to complete the next steps, but you may need to update to the latest version.
From this GitHub page, click on the green "Code" dropdown button near the upper right. Select either "Open in GitHub Desktop" (i.e. git clone) or "Download ZIP". We recommend using GitHub Desktop, which makes it easiest to receive updates.
Place your copy of this repo in any convenient location on your computer.
We recommend creating a custom virtual environment with the same software dependencies that we've used in development and testing, as listed in the environment.yml file.
Create a project-specific environment using this conda command in your terminal or Anaconda Prompt console. If necessary, replace environment.yml with the full path to the environment.yml file in your local clone of the repository.
conda env create --file environment.yml
Alternatively, use the faster libmamba solver with:
conda env create -f environment.yml --solver=libmamba
Activate the environment using the instructions printed by conda after the environment is created successfully.
To update your environment, run the following command:
conda env update -f environment.yml --solver=libmamba --prune
To use this repository's modules from your Python environment, the path to your copy of this repo must be saved in the conda.pth file in the environment's site-packages directory (for example, <$HOME>/anaconda/lib/pythonX.X/site-packages/conda.pth or <$HOME>/miniconda3/envs/drwi_pa/lib/python3.11/site-packages/conda.pth), where <$HOME> refers to the full path of the user directory, such as /home/username on Linux/Mac.
The easiest way to do this is to run the conda develop command in the console or terminal, replacing /path/to/module/ with the full path to your local clone of this repository:
conda develop /path/to/module/
You should now be able to run the examples and create your own Jupyter Notebooks!
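To confirm that the path was registered, a quick check from Python can help. The snippet below is a minimal sketch, not part of this repo: the helper name repo_on_path is hypothetical, /path/to/module/ is the same placeholder used above, and it assumes conda.pth lives in the directory reported by sysconfig for the active environment.

```python
import sys
import sysconfig
from pathlib import Path


def repo_on_path(repo_dir):
    """Return True if repo_dir is one of the entries on sys.path."""
    target = Path(repo_dir).resolve()
    return any(Path(p).resolve() == target for p in sys.path if p)


# Directory where the active environment keeps site-packages (and conda.pth, if any).
site_packages = Path(sysconfig.get_paths()["purelib"])
pth_file = site_packages / "conda.pth"

print("site-packages:", site_packages)
print("conda.pth exists:", pth_file.exists())
# Replace the placeholder with the path you passed to conda develop.
print("repo on sys.path:", repo_on_path("/path/to/module/"))
```

Run this with the project environment activated, in a Python session started after the conda develop step. If the last line prints False, the path was likely not written to conda.pth for that environment.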
The repository provides an implementation of a modified nested set index algorithm based on the work of Haag and Shokoufandeh (2019). The algorithm is a modified depth-first search that visits each node twice, recording both the discover time (the step count when the node is first visited) and the finish time (the step count when the node is visited the second time). Together, these two values make it possible to select every element upstream of a target node: select the nodes whose discover time >= the target's discover time and whose finish time <= the target's finish time.
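The idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation shipped in this repo: the function names, the toy network, and the representation of the network as a dict mapping each node to its downstream neighbor are all assumptions. It uses classic depth-first timestamps (one counter incremented on both discovery and finish); the repo's implementation may number the steps differently, but the nesting property that the selection rule relies on is the same.

```python
from collections import defaultdict


def nested_set_index(downstream, outlet):
    """Return {node: (discover, finish)} for a tree of node -> downstream links."""
    # Invert the links so we can walk from the outlet toward the headwaters.
    upstream = defaultdict(list)
    for node, down in downstream.items():
        upstream[down].append(node)

    discover, times = {}, {}
    clock = 0
    stack = [(outlet, False)]  # (node, True once all upstream nodes are done)
    while stack:
        node, done = stack.pop()
        clock += 1
        if done:
            times[node] = (discover[node], clock)  # second visit: finish time
        else:
            discover[node] = clock  # first visit: discover time
            stack.append((node, True))
            for up in sorted(upstream[node], reverse=True):
                stack.append((up, False))
    return times


def select_upstream(times, target):
    """Select the target and everything upstream of it using the (d, f) rule."""
    d0, f0 = times[target]
    return sorted(n for n, (d, f) in times.items() if d >= d0 and f <= f0)


# Hypothetical toy network: D and E drain to B, F drains to C, B and C drain to A.
downstream = {"B": "A", "C": "A", "D": "B", "E": "B", "F": "C"}
times = nested_set_index(downstream, outlet="A")
print(times["B"])                     # (2, 7)
print(select_upstream(times, "B"))    # ['B', 'D', 'E']
```

Because the comparisons are inclusive, the target node itself is part of the selection. Once the index is built, each upstream query is a simple filter on two columns and needs no graph traversal.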
Consider the following graph representing a watershed, to which the modified nested set algorithm has been applied. The discover time (d) and finish time (f) values are reported next to the nodes.
To select the elements upstream of the node boxed in red, we can use its discover and finish time values. The elements in blue satisfy the conditions d>=5 (with 5 being the d value of the selected node) and f<=9 (with 9 being the f value of the selected node).