SouthForkResearch / pyGNAT

Geomorphic Network and Analysis Toolbox, redesigned using FOSS python libraries.
MIT License

Find duplicate reaches #29

Closed by jesselangdon 7 years ago

jesselangdon commented 7 years ago

pyGNAT needs to be able to find duplicate stream reaches in a network (which can happen occasionally). Not sure if this could be done with NetworkX or PyQGIS.

MattReimer commented 7 years ago

So NetworkX's DiGraph filters out duplicates (it keeps only one edge per ordered pair of nodes) unless they are flowing in opposite directions.

I think we'll probably need to use Shapely for this. Can we loop over all line segments and see if their start and end points match any other line segments?

Then you could just use a Python list trick to find the duplicates.

Something like this (sorry, it's just pseudocode):

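from collections import Counter
# Assumes `lines` is an iterable of Shapely LineStrings read from the shapefile
# Get the (start, end) coordinate pair of every line in one big list
endbits = [(line.coords[0], line.coords[-1]) for line in lines]
# Any (start, end) pair that occurs more than once is a candidate duplicate
duplicates = [k for k, v in Counter(endbits).items() if v > 1]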
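Matching endpoints only flags candidates, since two different channels can share both a start and an end point. A rough, standard-library-only sketch of how the check might be extended to compare full geometry is below. The `find_duplicates` function, the `reaches` dictionary layout, and the `precision` rounding are all assumptions made for illustration, not anything that exists in pyGNAT.

```python
from collections import defaultdict


def rounded(coords, precision=6):
    """Round vertices so tiny floating-point differences do not hide a match."""
    return tuple((round(x, precision), round(y, precision)) for x, y in coords)


def find_duplicates(reaches, precision=6):
    """Group reaches that share endpoints, then compare their full geometry.

    `reaches` (hypothetical layout) maps a reach id to its ordered (x, y) vertices.
    Returns (exact, same_endpoints_only):
      exact               - groups of ids whose vertices are identical
      same_endpoints_only - groups that share endpoints but contain differing paths
    """
    by_endpoints = defaultdict(list)
    for reach_id, coords in reaches.items():
        pts = rounded(coords, precision)
        by_endpoints[(pts[0], pts[-1])].append((reach_id, pts))

    exact, same_endpoints_only = [], []
    for members in by_endpoints.values():
        if len(members) < 2:
            continue  # nothing else starts and ends at these points
        by_geometry = defaultdict(list)
        for reach_id, pts in members:
            by_geometry[pts].append(reach_id)
        for ids in by_geometry.values():
            if len(ids) > 1:
                exact.append(sorted(ids))
        if len(by_geometry) > 1:
            same_endpoints_only.append(sorted(rid for rid, _ in members))
    return exact, same_endpoints_only


# Made-up sample data
reaches = {
    1: [(0, 0), (1, 0), (2, 0)],
    2: [(0, 0), (1, 0), (2, 0)],  # exact copy of reach 1
    3: [(0, 0), (1, 1), (2, 0)],  # same endpoints, different path
    4: [(2, 0), (3, 0)],
}
exact, same_endpoints_only = find_duplicates(reaches)
print(exact)                # [[1, 2]]
print(same_endpoints_only)  # [[1, 2, 3]]
```

With Shapely geometries, `list(line.coords)` could supply the vertex lists. Note that this sketch treats a reach digitized in the opposite direction as a different reach.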

jesselangdon commented 7 years ago

I'm thinking that since duplicates are filtered out, this kind of solves our problem. Why do we need to identify them?

MattReimer commented 7 years ago

@jesselangdon We can talk about this when I come down, but there is a good case for it when considering braids. What you consider a duplicate reach might not match what NetworkX considers one.

I've started using MultiDiGraphs for everything for this very reason. It adds a little extra complexity but I think there's a payoff in the end.
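
To illustrate the difference, here is a minimal sketch with made-up node names: a `DiGraph` silently keeps one edge per ordered node pair, while a `MultiDiGraph` keeps every parallel edge so they can be inspected afterwards.

```python
import networkx as nx

# Hypothetical network: two reaches from A to B, plus one from B back to A
edges = [("A", "B"), ("A", "B"), ("B", "A")]

dg = nx.DiGraph()
dg.add_edges_from(edges)
print(dg.number_of_edges())   # 2: the second A->B edge was collapsed

mdg = nx.MultiDiGraph()
mdg.add_edges_from(edges)
print(mdg.number_of_edges())  # 3: both A->B edges are kept

# Node pairs joined by more than one edge in the same direction
parallel = sorted({(u, v) for u, v in mdg.edges() if mdg.number_of_edges(u, v) > 1})
print(parallel)               # [('A', 'B')]
```

Parallel edges found this way are only candidates. In a braid they can be legitimately distinct channels, so the geometry would still need to be compared before calling them duplicates.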

jesselangdon commented 7 years ago

Find duplicate errors method written. See commit https://github.com/SouthForkResearch/pyGNAT/commit/a245880a3ce0710b9fec22255ae1c6626e725a08