VeinsOfTheEarth / RivGraph

Extracting and quantifying graphical representations of river and delta channel networks from binary masks
https://veinsoftheearth.github.io/RivGraph/
Other
84 stars 26 forks source link

remove_two_link_nodes encountering self loop or naming issue #7

Closed Lvulis closed 3 years ago

Lvulis commented 4 years ago

I am trying to prune the network on the Yukon data that I have been using and continuously get this error in the remove_two_link_nodes. To get here, I extract the network up to prune_network as in the example (same as all other comments this week). I then run the functions in remove_all_spurs in the global environment (i.e. not the function remove all spurs but the manual parts). Check 2nd code block.

 ct = 0
        # Remove spurs
        for nid, con in zip(nodes['id'], nodes['conn']):

            if len(con) == 1 and nid not in dontremove:
                print(nid)
                ct = ct + 1
                links, nodes = delete_link(links, nodes, con[0])

        # Remove self-looping links (a link that starts and ends at the same node)
        for nid, con in zip(nodes['id'], nodes['conn']):
            m = mode(con)
            if m.count[0] > 1:

                # Get link 
                looplink = m.mode[0]                

                # Delete link
                links, nodes = delete_link(links, nodes, looplink)
                ct = ct + 1             

            linkkeys = [lk for lk in links.keys() if type(links[lk]) is not int and len(links[lk]) == len(links['id'])]
ct = 0
        for nidx, nid in enumerate(nodes['id']):
            # Get connectivity of current node        
            conn = nodes['conn'][nidx][:]
            # We want to combine links where a node has only two connections
            if len(conn) == 2 and nid not in dontremove:
                print(nid)
                # Delete the node
                nodes = delete_node(nodes, nid, warn=False)

                # The first link in conn will be absorbed by the second
                lid_go = conn[0]
                lid_stay = conn[1]
                lgo_idx = links['id'].index(lid_go)
                lstay_idx = links['id'].index(lid_stay)

                # Update the connectivity of the node attached to the link being absorbed
                conn_go = links['conn'][lgo_idx]
                conn_stay = links['conn'][lstay_idx]

                # Update node connectivty of go link (stay link doesn't need updating)
                node_id_go = (set(conn_go) - set([nid])).pop()
                nodes['conn'][nodes['id'].index(node_id_go)].remove(lid_go)
                nodes['conn'][nodes['id'].index(node_id_go)].append(lid_stay)

                # Update link connectivity of stay link
                conn_go.remove(nid)
                conn_stay.remove(nid)
                # Add the "go" link connectivity to the "stay" link
                conn_stay.extend(conn_go)

                # Update the indices of the link
                idcs_go = links['idx'][lgo_idx]
                idcs_stay = links['idx'][lstay_idx]
                if idcs_go[0] == idcs_stay[-1]:
                    new_idcs = idcs_stay[:-1] + idcs_go
                elif idcs_go[-1] == idcs_stay[0]:
                    new_idcs = idcs_go[:-1] + idcs_stay
                elif idcs_go[0] ==  idcs_stay[0]:
                    new_idcs = idcs_stay[::-1][:-1] + idcs_go
                elif idcs_go[-1] == idcs_stay[-1]:
                    new_idcs = idcs_stay[:-1] + idcs_go[::-1]
                links['idx'][lstay_idx] = new_idcs

                # Delete the "go" link
                lidx_go = links['id'].index(lid_go)
                for lk in linkkeys:
                    if lk == 'id': # have to treat orderedset differently
                        links[lk].remove(lid_go)
                    elif type(links[lk]) is np.ndarray: # have to treat numpy arrays differently
                        links[lk] = np.delete(links[lk], lid_go)
                    else:
                        links[lk].pop(lidx_go)

                ct = ct + 1

Traceback (most recent call last):

  File "<ipython-input-297-4805c3adaf0e>", line 22, in <module>
    node_id_go = (set(conn_go) - set([nid])).pop()

KeyError: 'pop from an empty set'

This happens to node 29223:

nid
Out[309]: 29223

conn
Out[310]: [16301, 16301]

lid_go
Out[311]: 16301

lid_stay
Out[313]: 16301

It seems the prior call to remove self loops should have removed it. This seems like a RG bug, although not sure. Originally this corresponded to a node with 3 links, one of which is a spur... my python skills are not up there to get this taken care of quickly, and you may be able to get a solution to this faster.

I previously uploaded some links/nodes, not sure if those had been partially trimmed. I am going to upload again: links, nodes, skeleton (all before pruning): yukon_beforepruning.zip the mask used: compoun_binary.zip

jonschwenk commented 4 years ago

Please upload your shoreline and inlet nodes shapefiles as well.

Lvulis commented 4 years ago

Yukon_Shoreline_Inlets.zip

jonschwenk commented 4 years ago

Ok. The problem is that your mask is not rivgraphable as-is. Your mask should contain a single connected-component. You should also disconnect problematic pieces from the network. I included a number of tools in rivgraph to help you do this:

from rivgraph import im_utils as im
from rivgraph import io_utils as io
from matplotlib import pyplot as plt
import gdal

# Set the paths
unclean_mask = r"X:\RivGraph\Bugs\issue_7\compoun_binary.tif"
clean_mask = r"X:\RivGraph\Bugs\issue_7\Yukon_clean.tif"

# Clean the mask
gdobj = gdal.Open(unclean_mask) # creates a gdal object, we'll use this later when saving
I = gdobj.ReadAsArray() # load the uncleaned mask as a numpy array
I = im.largest_blobs(I, nlargest=1, action='keep') # Keep only the largest connected component
I = im.hand_clean(I, action='erase') # I ran this three times to disconnect portions of the network # This opens a plot window. You should draw a polygon by clicking points and pressing enter. Right-click to remove the last point you placed (this is important when you need to click somewhere to zoom in or pan). Press enter when done. You can only draw one polygon at a time. Re-run as many times as you need, and choose action='fill' to fill the polygon instead of erase it.
I = im.largest_blobs(I, nlargest=1, action='keep') # Rerun to keep the largest blob again after hand-cleaning.
I = im.fill_holes(I, maxholesize=20) # This is optional; fills islands of size 20 pixels or less. Results in a less-noisy network.
io.write_geotiff(I, gdobj.GetGeoTransform(), gdobj.GetProjection(), clean_mask, dtype=gdal.GDT_Byte) # Write the cleaned mask to disk

Using the shoreline and inlet nodes shapefile I uploaded (I did not use yours), I was able to run the remaining RivGraph functions (all the way through setting directionality). There is a cycle in the graph that RivGraph could not resolve.

Download all the results here.

Thanks for bringing this up--this is useful information that should be included in the mask preparation documentation.

The reason you're getting the "bug" you report is that your mask results in many disconnected networks, since each disconnected blog is its own network. Some of these are single-link or single-pixel networks, which result in the error you see. One idea would be to have rivgraph throw a warning when multiple networks are present in the mask.

I also note that there are a number of channels that are barely disconnected in your mask. These will get pruned if you don't connect them. You can join them using the hand_clean function.

Lvulis commented 4 years ago

OK so first order processing will be to actually remove any lakes from the scene before processing? So disconnected reaches will lead to running into this issue also?

As a feature request having biggest blob removal on the mask going into RG would be super helpful (I see all the functions are already there, so maybe just documenting that more clearly). For example the hand cleaning and fill holes function.

Good to know on the last point you just added - will result in a much better DCN. Is there any hope in getting those connected automatically, or need to go to higher quality channel masks?

Lvulis commented 4 years ago

As a comment (or can make another issue) I get an error on the hand clean which again seems to arise from the package discrepancies :(

I = im.hand_clean(I, action='erase')
C:\Anaconda3\envs\geo_py37\lib\site-packages\rivgraph-0.2-py3.7.egg\rivgraph\im_utils.py:1121: UserWarning: Matplotlib is currently using module://ipykernel.pylab.backend_inline, which is a non-GUI backend, so cannot show the figure.
jonschwenk commented 4 years ago

That error is not a package problem. I assume you're using Spyder. Go To Tools->Preferences->IPython Console and click the "Graphics" tab. Change the Graphics Backend to "Automatic". This will make plots in a separate window that you can interact with.

Note that you can also edit the mask in Photoshop or any other program that supports .tif files, but you'll need to re-append the georeferencing information when you're finished. QGIS has a plugin that allows for single-pixel-at-a-time editing, so it's a little time-consuming, but it works (and retains georeferencing info). I don't recall its name.

Lvulis commented 4 years ago

This worked! Thanks for the cleaning code & the technical support :)

jonschwenk commented 4 years ago

OK so first order processing will be to actually remove any lakes from the scene before processing? So disconnected reaches will lead to running into this issue also?

As a feature request having biggest blob removal on the mask going into RG would be super helpful (I see all the functions are already there, so maybe just documenting that more clearly). For example the hand cleaning and fill holes function.

Good to know on the last point you just added - will result in a much better DCN. Is there any hope in getting those connected automatically, or need to go to higher quality channel masks?

I just realized I skipped this. So originally, I had RivGraph automatically consider the largest blob in the mask to be the channel network. After some testing, it occurred to me that the tools for mask preparation should be included but not automated--cases started popping up where the automation didn't produce the desired results. So yes, this is another example to add to the documentation.

Lvulis commented 4 years ago

Awesome!

jonschwenk commented 3 years ago

This has been addressed in the docs now: https://jonschwenk.github.io/RivGraph/maskmaking/index.html