DOI-USGS / hyRefactor

https://code.usgs.gov/wma/nhgf/reference-fabric/hyrefactor
Creative Commons Zero v1.0 Universal

Improve performance of aggregate_network - support additional use cases. #23

Closed dblodgett-usgs closed 2 years ago

dblodgett-usgs commented 3 years ago

Been pondering this one for a while.

The aggregate_network function takes two inputs: a flowpath network and a set of locations on the network. It returns a list with one element per aggregate catchment that minimally satisfies the set of locations provided. Each element of the list contains two elements: the collection of flowpaths that join the inflow and outflow network locations and the collection of all flowpaths in the aggregate catchment bounded by the inflow and outflow network locations.
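To make that return shape concrete, here is a minimal sketch in Python rather than the package's R. Everything in it is an assumption for illustration: the `toid` dictionary (flowpath id to downstream id, `None` at the terminal), the `aggregate_network_sketch` name, and the `path`/`set` keys. It also simplifies the headwater case: with no upstream location, the "path" is just the outlet flowpath itself.

```python
from collections import defaultdict


def aggregate_network_sketch(toid, outlets):
    """Return {outlet: {"path": [...], "set": [...]}} for a toy network.

    toid    -- hypothetical dict: flowpath id -> downstream id (None at terminal)
    outlets -- set of flowpath ids marking the requested network locations
    """
    # Invert the downstream pointers so we can walk upstream.
    upstream = defaultdict(list)
    for fid, down in toid.items():
        if down is not None:
            upstream[down].append(fid)

    result = {}
    for outlet in outlets:
        members, inflows, stack = [], [], [outlet]
        while stack:
            fid = stack.pop()
            members.append(fid)
            for up in upstream[fid]:
                if up in outlets:
                    inflows.append(up)  # another location bounds this aggregate
                else:
                    stack.append(up)

        # Flowpaths joining each inflow location to the outlet.
        path = {outlet}
        for inflow in inflows:
            fid = toid[inflow]
            while fid != outlet:
                path.add(fid)
                fid = toid[fid]

        result[outlet] = {"path": sorted(path), "set": sorted(members)}
    return result


# Toy network: 1 and 2 join at 3; 3 and 4 join at 5; 5 flows to 6 (terminal).
toid = {1: 3, 2: 3, 3: 5, 4: 5, 5: 6, 6: None}
aggregates = aggregate_network_sketch(toid, {3, 6})
```

With locations at 3 and 6, the aggregate for 6 contains flowpaths 4, 5 and 6, and its path is 5 then 6.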

Side note: the aggregate_catchment function adds a third element -- the catchment boundary that encompasses the aggregate catchment.

In conversation with @cheginit, it seems there are opportunities to improve performance and refine the internals of aggregate_network. @mikejohnson51 has worked up some improved catchment aggregation functionality that is already in the package, but further refinement of the aggregate_catchment function may be possible.

The major opportunity here is to eliminate the looping BFS searches here: https://github.com/dblodgett-usgs/hyRefactor/blob/master/R/aggregate_network.R#L157
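One generic way to avoid a separate search per location is to label every flowpath in a single pass from the terminals upstream. The sketch below is not the linked R code and not necessarily what the package ended up doing; the `toid` mapping, the `label_aggregates` name, and the convention that terminals have a downstream id of `None` are all assumptions.

```python
from collections import defaultdict, deque


def label_aggregates(toid, outlets):
    """Assign each flowpath the id of the nearest outlet at or below it.

    toid    -- hypothetical dict: flowpath id -> downstream id (None at terminal)
    outlets -- set of flowpath ids marking the requested network locations
    Flowpaths with no outlet at or below them are labeled None.
    """
    upstream = defaultdict(list)
    terminals = []
    for fid, down in toid.items():
        if down is None:
            terminals.append(fid)
        else:
            upstream[down].append(fid)

    label = {}
    queue = deque(terminals)
    while queue:
        fid = queue.popleft()
        if fid in outlets:
            label[fid] = fid  # an outlet starts a new aggregate
        else:
            label[fid] = label.get(toid[fid])  # inherit from downstream
        queue.extend(upstream[fid])  # each flowpath is visited exactly once
    return label


toid = {1: 3, 2: 3, 3: 5, 4: 5, 5: 6, 6: None}
labels = label_aggregates(toid, {3, 6})
```

Because a flowpath is only queued after its downstream neighbor has been labeled, the whole network is handled in one traversal regardless of how many locations are supplied.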

A second opportunity would be to use a more direct shortest path search over the whole network to identify main-stem paths independent of the full aggregate sub-graphs. This would allow the algorithm to return either main paths or aggregate collections in a more flexible workflow.
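As a rough illustration of finding a main path without building the aggregate sub-graph first: if the network is dendritic (each flowpath has at most one downstream neighbor, which is an assumption here), the path between two locations is simply the chain of downstream pointers. The `toid` mapping and the `main_path` name are hypothetical; a network with diversions would need a real shortest-path search instead.

```python
def main_path(toid, start, end):
    """Follow downstream pointers from start until end is reached.

    toid -- hypothetical dict: flowpath id -> downstream id (None at terminal)
    Returns the list of flowpath ids from start to end inclusive,
    or None if end is not downstream of start.
    """
    path, seen = [], set()
    fid = start
    while fid is not None and fid not in seen:  # 'seen' guards against loops
        path.append(fid)
        if fid == end:
            return path
        seen.add(fid)
        fid = toid.get(fid)
    return None


toid = {1: 3, 2: 3, 3: 5, 4: 5, 5: 6, 6: None}
connected = main_path(toid, 1, 6)    # [1, 3, 5, 6]
unconnected = main_path(toid, 4, 3)  # None: 3 is not downstream of 4
```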

The two use cases here are a minimal connected graph (say, a set of dams where we just want to know how they relate to each other) and catchment aggregation (where we just want the catchments that need to combine together for a set of nexus locations).
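For the first use case, a minimal sketch of what "how they relate to each other" could mean: for each location, find the next location downstream. This assumes an acyclic, dendritic network, and the `toid` mapping and `downstream_neighbors` name are made up for the example.

```python
def downstream_neighbors(toid, locations):
    """Map each location to the next location downstream of it (or None).

    toid      -- hypothetical dict: flowpath id -> downstream id (None at terminal)
    locations -- set of flowpath ids, e.g. flowpaths that have a dam on them
    Assumes the network has no cycles.
    """
    result = {}
    for loc in locations:
        fid = toid.get(loc)
        # Skip over flowpaths that carry no location of interest.
        while fid is not None and fid not in locations:
            fid = toid.get(fid)
        result[loc] = fid
    return result


toid = {1: 3, 2: 3, 3: 5, 4: 5, 5: 6, 6: None}
dam_graph = downstream_neighbors(toid, {1, 4, 5})
```

Here the dams on 1 and 4 both drain to the dam on 5, which has nothing below it. The catchment aggregation use case would instead keep every flowpath between those locations.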

to be continued...

dblodgett-usgs commented 2 years ago

Done in #25