Closed helfy18 closed 2 years ago
This script is returning results different than I expected - every watershed has the same mouth. I'm not sure if this is because
I'm hoping it's the second option, otherwise we won't be able to build a connection between the Freshwater Atlas polygons and our flow model this way. I need to look into this farther.
I do have a small change to request now: could you please change the third column of the CSV to have the heading WTRSHDGRPC
and the value for just that attribute, rather than the entire set of properties values?
@corviday I just changed it to only return the code.
Here's my thinking behind using centroid()
. With graph paper!
Say we have two watersheds that, in the actual world, look like this. If we had perfect and complete knowledge and no computation limitations, we'd have this extra-squiggly data:
We don't have perfect information; we have two sets of imperfect information, and we're trying to translate from one to the other. The first imperfect description of the watershed is a set of polygons from the BC Freshwater Atlas. These were pretty high resolution once, but I've had to lower their resolution because they were computationally implausible. Let's say the Frreshwater Atlas says the watersheds look like this, described as polygons:
Separately, we also have a imperfect modeled description of the watershed used by PCIC to calculate how water moves across the ground. In this description, land is divided up into grid cells, and water from every grid cell flows downstream to one other grid cell. We can think of the "watershed" as a particular grid cell (the outlet or mouth), plus all the other grid squares who, if we follow the water that rains onto them, eventually that water would pass through the outlet. I've drawn a circle where the mouth of each watershed is, plus outlined the grid cells PCIC understands to be part of this watershed. (And also drawn in the arrows)
So what we're trying to solve with this script is to be able to translate from the Freshwater Atlas polygons in the first diagram to the PCIC gridded watersheds in the second diagram. The way we're doing this is:
Here are the two different imperfect representations of the watershed superimposed. They generally agree about where the watersheds are, but neither of them is perfect, and they have simplified the map in different ways, so they do disagree a little on which points are in which watersheds. I've coloured the points they don't agree about green.
What happens if we pick one of the disputed green points? Suppose we're trying to determine the translation for Watershed A in the Freshwater Atlas into the PCIC model. We pick a point0
that is inside Watershed A according to the Freshwater Atlas, but it happens to be inside Watershed B according to the PCIC model. We begin trying to trace the water flowing through our point downstream - it flows deeper into Watershed B, and reaches point1
.
Now we check and see that point1
is outside Watershed A in the Freshwater Atlas. So we're done tracing the water flow to the mouth. The mouth of Watershed A is set to point0
. This is an incorrect answer.
I think all the disputed points will be around the edges of the watershed; I wanted to use the centroid to reduce the chances of this happening.
That was my logic for using the centroid.
Though I didn't think of the fact that for some watersheds the centroid will be outside the watershed entirely, and we'll have the exact same problem. Maybe check if the centroid is outside the watershed, and if it is, use representative_point()
?
That said, this may not matter in the real world; @helfy18 reports that all the same mouths were calculated using centroid()
and representative_point()
.
We could perhaps use representative_point()
for all watersheds but output an error message if point0
is chosen as the mouth - we shouldn't have any single grid cell watersheds in this dataset, so that would be a good indicator something has gone wrong.
That is a truly awesome exposition!
I think I have a more robust solution:
shapely.object.buffer
with a negative distance
parameter).representative_point()
of the eroded polygon. Given that the eroded polygon is inside the flow matrix watershed, this has to be too.Oh hey, yeah! That sounds perfect! Avoids the dodgy edges of the watershed and the issues of the U-shaped watersheds like the Lower Halfway River!
Details on using the suggested solution. Instead of using
center = polygon.centroid
use
center = polygon.buffer(distance).representative_point()
where distance is a negative value whose magnitude somewhat exceeds the (physical distance) size of a flow grid cell.
I ran the new version, and it does work better. We now have all 20 watersheds that were being displayed with centroid
, and we also have it so that the center point is always contained within the watershed (previously WTRSHDGRPD 171 was having a center point that was not contained in the watershed)
Add a script that takes outputs the mouths of watersheds used by the web app by using the flow direction from a netCDF file. Resolves #58