jkibele / pyriv

Python library for analysis of minimum aquatic distance across rivers and coasts. It's like Google Maps for anadromous fish.
MIT License
1 stars 1 forks source link

Clean up and merge CoastDist branch #15

Closed jkibele closed 6 years ago

jkibele commented 6 years ago

In my frantic effort to finish the river herring distance calculations for the east coast of the US, I've made a huge mess of the CoastDist branch. I've made some changes to pyriv.coastal and pyriv.river_graph, but mostly I've put stuff into the poorly named pyriv.rg_light module. This was sort of intended to be a stripped down (in that it doesn't really use the coastline the same way) version of river_graph, but it's kind of not that now. It's a mess.

I've got a couple of jupyter notebooks that demonstrate how the code was used for the east coast stuff. I think the goal for resolving this issue should be to (unfortunately) pretty much refactor the code so that there's a more sensible workflow possible. There are two main tasks here that pyriv is trying to accomplish:

  1. Build the network used to calculate distances. This includes the following types of distances: between sites on a single river network (i.e., no river mouth between sites); between sites on different river networks (i.e., from site to river mouth + ocean distance + upriver distance if site is up a different river); between sites on the coast.
  2. Actually calculate the distances using the built network.

Task 1 is going to be potentially complex and annoying. I don't think there's anyway to truly make that user friendly. There's too much variability in the potential input data sets (e.g., hydrographic network shapefiles, coastline features). Task 2, on the other hand, could be made fairly simple and could go into some sort of easy to use GUI (e.g., QGIS processing toolbox or ESRI Python toolbox).

I'm going to see if I can paste my jupyter notebooks for the east coast calculations up here.

jkibele commented 6 years ago

Couldn't paste them directly so here's a gist of network prep and here's one of error checking and transformation to matrix format.

jkibele commented 6 years ago

As part of this, I need to remember that, for some reason, I have node_rounding method in river_graph but not in rg_light. It seems that I need this method so I'll need to keep that in mind when cleaning up my mess.

jkibele commented 6 years ago

I've now merged CoastalDist back into master and updated the readme, but I haven't really done anything about cleaning up the messy code. In thinking about how to approach cleaning this stuff up, it seems to me that there are several tasks that are entwined with clean up:

  1. Committing some test data (i.e., simple river network shapefile and matching simple coastline shapefile)
  2. Writing tests as I build out a cleaned up version
  3. Better defining workflows to guide refactoring and code clean up.

The obvious top-level categories of the workflow are network creation and distance calculations. So, I guess my thinking on the workflow categorization looks something like this at the moment:

  1. Network Creation
    1. River Network
      1. Data import (shp or graphpickle for now)
      2. Addition of coastline (to identify river mouths)
      3. Identification of rivermouths and inland deadends
      4. Pruning useless bits off the graph
      5. Fixing unintended breaks
      6. Node coordinate rounding
    2. Coast Network
      1. Options for handling lines or polygons (multipart and single part)
      2. Simplification and node count metrics for prediction of processing time
      3. Edge creation (multiprocessor command line scripts?)
    3. Network Join
  2. Distance Calculations
    1. MAD to river mouth (using just the river network)
    2. MAD matrix (full network)
    3. Tabular results formatting and saving stuff to files.

That's all I have time for at the moment. I'll need to continue thinking and then break this into smaller tasks.

jkibele commented 6 years ago

Since I've already merged the CoastalDist branch and started this other project about the re-organization, I'm going to close this issue.