Open arottersman opened 6 years ago
NHD
40.59147012466978/-111.91257476806639
36.243882/-89.624598
40.273292/-76.237493
35.218156/-101.719055
???/???
DRB
40.288818/-75.868027
40.273292/-76.237493
39.990189/-75.199185
???/???
???/???
Update from @dtarb shows that this is performing as expected with RWD's current capabilities:
I looked into the quirks that I reported myself, as I was worried they were related to my watershed delineation code.
My short answer is that the functionality is fine as is for the current release.
The long answer gets a bit technical, so apologies in advance. I found that the shapefile and the corresponding JSON file that MMW provides are consistent with the output of RWD. These holes are real internally draining watersheds. RWD removes most internally draining watersheds through a simplification step in preprocessing that uses an ArcGIS function. However, where internally draining watersheds occur at the boundary between the locally delineated watershed and the upstream watersheds being merged, they are not eliminated by RWD, because the ArcGIS function cannot be used in online processing. We therefore decided to accept these results in the output that RWD delivers to MMW. There was an option to use an OGR function to remove these holes, but it produced edge slivers, which were also unsatisfying, and it slowed down the code.
MMW post-processes the files that RWD produces. This includes (1) generalization to reduce the complexity of the edges and (2) some sort of simplification that eliminates these internal holes in the display.
I think we have the following options:
1. Leave as is.
2. Produce a JSON file and shapefile for the user to download that are consistent with the display. I do not actually know how to do this, as it would depend on what MMW is doing internally at (2) above.
3. Provide the ability for a user to download the non-generalized shapefile, which retains the complexity of the edges. This would have greater detail than is displayed.
I actually prefer 3. For the Mississippi the shapefile is 51 MB, which in today’s terms is very small and could be easily delivered as a download.
I think it is fine to leave what we have for the current release. If it is easy to do 3 I would push for doing it.
There may be a programming option where I could improve the hole removal by tracing around the outside perimeter of the shapefile, similar to some of the logic used in Scott Haag’s nested set approach, but that is a solution that would require programming.
Closing, since the resolutions proposed do not involve dissolving the holes.
Here is a patch of the beginning of this work (NHD working, DRB in progress). The solution involves reading the shapefile in via fiona and removing each polygon's internal rings via shapely — manipulating the shapefiles via fiona was part of an earlier approach replaced by the work here. I didn't find it to have too heavy of a performance toll, though more extensive testing would be needed to know for sure.
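A minimal sketch of the ring-removal idea described above, not the patch itself. It assumes the geometry is available as a GeoJSON-like mapping (the form in which fiona commonly exposes a record's geometry) and avoids shapely so that it has no dependencies; `drop_holes` is a hypothetical name. With shapely, the equivalent step for a single polygon would be to build a new `Polygon` from `poly.exterior`.

```python
def drop_holes(geometry):
    """Return a copy of a GeoJSON-like Polygon or MultiPolygon
    with every interior ring (hole) removed.

    Hypothetical helper for illustration; not part of RWD or MMW.
    """
    kind = geometry["type"]
    coords = geometry["coordinates"]

    if kind == "Polygon":
        # In GeoJSON, the first ring is the exterior; any further rings are holes.
        return {"type": "Polygon", "coordinates": [coords[0]]}

    if kind == "MultiPolygon":
        # Keep only the exterior ring of each member polygon.
        return {
            "type": "MultiPolygon",
            "coordinates": [[polygon[0]] for polygon in coords],
        }

    raise ValueError("Unsupported geometry type: %s" % kind)


# Example: a square watershed with one square hole in the middle.
outer = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6), (4, 4)]
watershed = {"type": "Polygon", "coordinates": [outer, hole]}

filled = drop_holes(watershed)
```

This drops every interior ring unconditionally. It does not touch the exterior boundary, so it should not introduce edge slivers, but its cost on very large shapefiles would still need to be measured.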
@dtarb this could be the basis for a future fix via RWD ^
This certainly looks promising. I implemented this patch in a test branch and it resolved the hole problem at the location
-111.923661 40.622373
However, it was unacceptably slow for the larger watersheds I test with:
-90.144049 29.954983
Mississippi: this took about 10 min
-89.624598 36.243882
Upper Mississippi: this took 5 to 7 min
Recognizing that poor performance is a problem, here is one option towards a solution.
The first step above will need to be done outside of RWD, as the simplification is done outside of RWD (as the code is now structured).
The second step would only be needed if/when we implement full resolution downloads (see https://github.com/WikiWatershed/model-my-watershed/issues/2717). The time consuming full resolution hole removal would only run on a very small number of cases. There is only one test case I have where this occurs. There is a risk that the simplification process may remove holes and thus the approach of identifying watersheds with holes from the simplified shapes may not be 100% reliable. Given that this would be a hypothetical small fraction of a small fraction of cases, I would suggest not concerning ourselves with this at this point.
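The screening idea in the paragraph above, checking the simplified shape for holes before paying for full-resolution hole removal, could look roughly like this. The function names and the GeoJSON-like input are assumptions for illustration only. As noted above, simplification may already have removed small holes, so a negative result is not conclusive.

```python
def count_holes(geometry):
    """Count interior rings in a GeoJSON-like Polygon or MultiPolygon.

    Hypothetical helper for illustration; not part of RWD or MMW.
    """
    kind = geometry["type"]
    coords = geometry["coordinates"]

    if kind == "Polygon":
        polygons = [coords]
    elif kind == "MultiPolygon":
        polygons = coords
    else:
        raise ValueError("Unsupported geometry type: %s" % kind)

    # Each polygon is a list of rings: one exterior, then zero or more holes.
    return sum(max(len(rings) - 1, 0) for rings in polygons)


def needs_full_resolution_hole_removal(simplified_geometry):
    """Decide from the simplified shape whether the slow,
    full-resolution hole removal is worth running."""
    return count_holes(simplified_geometry) > 0
```

Only watersheds flagged this way would go through the slow path, which keeps the common case fast.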
Additionally, hole removal is only needed for the NHD implementation. The Delaware implementation does not have the internally draining areas that cause these holes, because data preparation for the Delaware used pit filling to remove sinks as part of preprocessing. (Including it for completeness may still be worthwhile in case new, non-pit-filled data is ever used in the Delaware.)
My work on this is in https://github.com/WikiWatershed/rapid-watershed-delineation/tree/fixHoles
Re-opening for future work, currently this is an enhancement.
Holes in a delineated watershed should be dissolved out. Description and test location here: https://github.com/WikiWatershed/model-my-watershed/issues/2703