gravitystorm / openstreetmap-carto

A general-purpose OpenStreetMap mapnik style, in CartoCSS

Consider "width" for rendering waterways if specified #3795

Open RicozOSM opened 5 years ago

RicozOSM commented 5 years ago

Surprised I can't find any issue for this.. sorry if I missed it in my search.

Expected behavior:

matkoniecz commented 5 years ago

See #1853 for similar work that ended up being rejected because it required very complex code without a gain significant enough to justify it, though that was in 2016 and concerned a different feature.

imagico commented 5 years ago

In addition to what was discussed in #1853 there is the added complexity of processing the waterways and water polygons in combination.

Adamant36 commented 5 years ago

Would it be possible to get CartoCSS and Mapnik to support it so we wouldn't have to use the complex code? Or is that dead in the water, considering their lack of development?

jeisenbe commented 5 years ago

Analysis of Waterway Width

I've used overpass-turbo.eu to download all of the combinations of waterway=* with width=* from <1 m to 15 meters, plus samples of greater widths, and made a spreadsheet of the data:

Raw Data

Zip with .csv files waterway-width-analysis.zip

Zip with .xlsx file including calculations and charts Waterways-width.xlsx.zip

Percentages of each type of feature below a certain width

This is probably the most important table. It shows what percentage of each feature is tagged with a width less than a certain number of meters.

           
| Width | stream | river | drain | ditch | canal |
|---|---|---|---|---|---|
| Number with width | 74652 | 24830 | 13933 | 11012 | 15907 |
| % < 1 m | 14% | 0.4% | 38% | 20% | 2.8% |
| % ≤ 1 m | 42% | 3.4% | 65% | 59% | 13% |
| % < 2 m | 46% | 5.5% | 74% | 63% | 15% |
| % ≤ 2 m | 81% | 12% | 83% | 84% | 29% |
| % < 3 m | 82% | 15% | 84% | 85% | 30% |
| % < 5 m | 92% | 36% | 91% | 93% | 52% |
| % ≤ 5 m | 94% | 45% | 93% | 95% | 61% |
| % < 10 m | 96% | 55% | 94% | 97% | 74% |
| % width ≥ 10 m | 4.5% | 45% | 5.8% | 3.1% | 26% |

In bold are the points where a majority of features of that type are included; e.g. over two thirds of waterway=stream features with a width=* tag are less than or equal to 2 meters in width, while almost half of rivers are greater than or equal to 10 meters in width.

In italics is the 90th percentile, both at the larger and smaller end, when available.

Note that over 10% of streams, drains and ditches are less than 1 meter in width (0.9* and less), and in fact 38% of drains and 20% of ditches are tagged with a width this small!

In contrast, only 5.5% of rivers are less than 2 meters in width, and less than 13% of canals are 1 meter wide or less.

However, in contrast to rivers, there are a small but significant number of canals with 1 meter width and even almost 3% with sub-meter width tagging.

On the larger end, streams, ditches and drains all have 91% to 93% of widths less than 5 meters. (I've checked some of the small number of ditches and drains tagged with width > 10, and many appear to be mistakes where the unit is centimeters or perhaps inches rather than meters.)

Both canals and rivers are commonly found in much larger widths: 26% of canals and 45% of rivers with a width tag are 10 meters wide or more.

So it appears that ditches and drains are features with a similar size to waterway=stream features, although on the small end there are many more sub-meter width drains and ditches with a width tag than streams (probably since it is hard to measure the width of a natural stream to decimeter precision!)

But canals are not entirely like rivers, because while rivers are never used for tiny waterways, there are some tiny artificial waterways tagged as canals

Compare these two charts:

Natural-Waterways-100-bars

Artificial-waterway-100-percent

On the one hand, mappers are more consistent in tagging a 10 meter wide artificial waterway as a canal instead of a ditch or drain - at this width there are still a surprisingly high number of streams instead of rivers (even though the world long jump record is less than 9 meters).

But on the other end of the scale, there are still a number of tiny canals.

Conclusions:

This data suggests that

This suggests that canals would benefit most from a width-specific rendering, perhaps with ranges of: - < 1 m (renders very thin, same width as current drains)

Also see this page I've made on the openstreetmap.org wiki: https://wiki.openstreetmap.org/wiki/User:Jeisenbe/Waterway_Widths

jeisenbe commented 5 years ago

I forgot to mention irrigation. I specifically checked the features tagged with irrigation=yes, service=irrigation or usage=irrigation - the tagging is not yet well established.

There are 10519 waterway=canal, 3673 waterway=ditch and 1300 waterway=drain with one of these tags, but only a small percentage also have a width=* tag:

1241 waterway=canal, 140 waterway=ditch and only 3 waterway=drain

Unsurprisingly, with those topline numbers there are more waterway=canal at each width, though waterway=ditch gets close at the smaller widths. Considering that the ratio is 3:1 for all canals to ditches with irrigation tags, there might be nearly equal numbers of narrow ditch and canal features if they were tagged consistently:

Irrigation-canal-ditch-with-width-100-totals

Irrigation-canals-vs-ditches-cummulative

Certainly it looks like many mappers consider waterway=ditch appropriate for use with irrigation channels, and not only for drainage ditches.

But waterway=drain is clearly a very uncommon tag for irrigation features.

matkoniecz commented 5 years ago

Percentages of each type of feature below a certain width

Thanks for making this! Is the code used to do this published somewhere? Also, is it complicated to check whether counting length rather than feature count would change the results? Usually counting elements rather than length gives a massive boost to micromapped areas.

jeisenbe commented 5 years ago

Is the code used to do this published somewhere?

I did it the slow, hard way: I checked the count in overpass-turbo for each width value.

Eg to find the number of waterway=canal + width=<1 + irrigation: https://overpass-turbo.eu/s/JAu

And to find waterway=canal + width=1.*: https://overpass-turbo.eu/s/JAq

// Count waterway=canal ways whose width value starts with "1." (1.0 up to 1.9...)
[timeout:25];
(
  way["waterway"="canal"]["width"~"^1\\."];
);
out count;

I entered each value from ".*" and "0.*" to "11.*" in a spreadsheet, then checked the integer numbers up to 15, then every 5 meters up to 30, plus 50 and 100.

I intend to describe the methods at https://wiki.openstreetmap.org/wiki/User:Jeisenbe/Waterway_Widths but I had to take a break to cook dinner.
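The manual procedure above could be scripted. Below is a minimal sketch that only builds the query text for each width bucket and does not contact any server. The helper names `width_patterns` and `build_count_query` are hypothetical, and every regex other than `^1\\.` is an assumption about what was typed into overpass-turbo, not something confirmed in this thread.

```python
def width_patterns():
    """Return (label, regex) pairs for the width buckets described above."""
    # Values written without a leading digit (".5") and with a leading zero ("0.5").
    patterns = [(".*", r"^\\."), ("0.*", r"^0\\.")]
    for n in range(1, 12):  # 1 and 1.* up to 11 and 11.*
        patterns.append((str(n), f"^{n}$"))
        patterns.append((f"{n}.*", rf"^{n}\\."))
    # Integer-only checks: up to 15, then every 5 m up to 30, plus 50 and 100.
    for n in (12, 13, 14, 15, 20, 25, 30, 50, 100):
        patterns.append((str(n), f"^{n}$"))
    return patterns


def build_count_query(waterway, width_regex, timeout=25):
    """Build an Overpass QL query that counts ways of one type matching a width regex."""
    return (
        f"[timeout:{timeout}];\n"
        "(\n"
        f'  way["waterway"="{waterway}"]["width"~"{width_regex}"];\n'
        ");\n"
        "out count;"
    )


queries = {label: build_count_query("canal", regex) for label, regex in width_patterns()}
print(queries["1.*"])
```

Each generated query could then be pasted into overpass-turbo or sent to an Overpass API instance, one bucket at a time.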

Also, is it complicated to check whether counting length rather than feature count would change the results?

This would be better, but it's still just a small subset, because only about 5% of the features have a "width" tag. E.g., there are 349,761 waterway=canal ways according to TagInfo, but only 15,907 have a width tag.

I'm also not sure how to check the length of a feature with the Overpass API. I only know how to check the length of ways by downloading the data and checking in JOSM, since I don't have a full planet database.
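For anyone who does download the geometry, length weighting can be approximated without a planet database. This is a minimal sketch, assuming each way is available as a list of (lat, lon) pairs and using a spherical Earth with a mean radius of 6,371 km; the function names are hypothetical and the input format is an assumption, not something Overpass returns in exactly this shape.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def way_length_m(coords):
    """Sum of segment lengths for a way given as [(lat, lon), ...]."""
    return sum(haversine_m(*a, *b) for a, b in zip(coords, coords[1:]))


def length_share_by_width(ways):
    """ways: iterable of (width_label, coords). Returns {label: fraction of total length}."""
    totals = {}
    for label, coords in ways:
        totals[label] = totals.get(label, 0.0) + way_length_m(coords)
    grand_total = sum(totals.values())
    return {label: length / grand_total for label, length in totals.items()}
```

Comparing these length shares with the per-feature percentages would show how much micromapped areas skew the counts.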

jeisenbe commented 5 years ago

Here's a couple more charts which go along with the table above, showing the percentage of each type of feature that is below a certain width:

Natural waterways (streams and rivers): Cummulative-percent-natural-waterways-by-width-simple

Artificial waterway (ditches, drains, and canals): Cummulative-artificial-waterways-simpler

imagico commented 5 years ago

:-1: on making the assumption that the waterways with a width tag are representative for all waterways.

You should also keep in mind that in the past there were imports of waterway tags done with a width tag which might not always derive from a quantitative measurement but from a broad classification system so could significantly distort distributions of values.

From the quick look at the width values of waterway=canal I had in https://github.com/gravitystorm/openstreetmap-carto/issues/3354#issuecomment-496533776 it seems that the distribution of width values for waterway=canal has two maxima - one at 2-3m and another one at 10m with the minimum at 7m. If the distribution is not distorted by imports or systematic width tagging efforts, that is likely to represent use of waterway=canal for large, typically navigable canals on one hand and smaller features on the other hand. Concluding from this that use of waterway=canal in general follows the same pattern is very likely wrong though - in other words: mappers seem to add a width tag to small canals more often than to larger ones. The width tagging not being representative is very natural considering how small the fraction of waterways with a width tag is overall.

Regarding the possibility that ground unit rendering is added as a feature to Mapnik - that seems very unlikely. True innovation is hard and tends to be more likely in new, experimental software than in old legacy stuff.

jeisenbe commented 5 years ago

"You should also keep in mind that in the past there were imports of waterway tags done with a width tag which might not always derive from a quantitative measurement but from a broad classification system so could significantly distort distributions of values."

This might be the case, but if you download the .csv or .xlsx files you can see the distribution of the data.

My guess is that a significant number of the drains and ditches with decimal-place accuracy are imported, because I'm skeptical that many individual mappers are measuring these features to sub-meter precision. There are a large number of drains and ditches <1 m wide with such precision, but otherwise the integer numbers are much more common. Also, the "round numbers" ending in an even number, 0 and 5 are strongly preferred, suggesting that individual mappers are estimating to the nearest couple of meters.

Here's the full data table for waterway=canal:

| Tags | Width | Count | % of type | % of width |
|---|---|---|---|---|
| waterway=canal & width | total | 15907 | | 39% |
| waterway=canal & width | .* | 8 | 0.05% | 30% |
| waterway=canal & width | 0.* | 443 | 2.78% | 6% |
| waterway=canal & width | 1 | 1605 | 10.09% | 17% |
| waterway=canal & width | 1.* | 338 | 2.12% | 16% |
| waterway=canal & width | 2 | 2266 | 14.25% | 39% |
| waterway=canal & width | 2.* | 105 | 0.66% | 33% |
| waterway=canal & width | 3 | 2116 | 13.30% | 63% |
| waterway=canal & width | 3.* | 110 | 0.69% | 57% |
| waterway=canal & width | 4 | 1248 | 7.85% | 71% |
| waterway=canal & width | 4.* | 32 | 0.20% | 78% |
| waterway=canal & width | 5 | 1488 | 9.35% | 75% |
| waterway=canal & width | 5.* | 142 | 0.89% | 92% |
| waterway=canal & width | 6 | 600 | 3.77% | 77% |
| waterway=canal & width | 6.* | 30 | 0.19% | 83% |
| waterway=canal & width | 7 | 361 | 2.27% | 90% |
| waterway=canal & width | 7.* | 21 | 0.13% | 75% |
| waterway=canal & width | 8 | 681 | 4.28% | 85% |
| waterway=canal & width | 8.* | 19 | 0.12% | 90% |
| waterway=canal & width | 9 | 130 | 0.82% | 95% |
| waterway=canal & width | 9.* | 20 | 0.13% | 100% |
| waterway=canal & width | 10 | 1364 | 8.57% | 91% |
| waterway=canal & width | 10.0* | 4 | 0.03% | 100% |
| waterway=canal & width | 11 | 53 | 0.33% | 96% |
| waterway=canal & width | 11.0* | 17 | 0.11% | 100% |
| waterway=canal & width | 12 | 290 | 1.82% | 84% |
| waterway=canal & width | 13 | 68 | 0.43% | 94% |
| waterway=canal & width | 14 | 79 | 0.50% | 99% |
| waterway=canal & width | 15 | 537 | 3.38% | 93% |
| waterway=canal & width | 20 | 245 | 1.54% | 96% |
| waterway=canal & width | 25 | 68 | 0.43% | 91% |
| waterway=canal & width | 30 | 124 | 0.78% | 81% |
| waterway=canal & width | 50 | 109 | 0.69% | 58% |
| waterway=canal & width | 100 | 16 | 0.10% | 35% |
| Statistic | Value |
|---|---|
| Check total of checked values | 14737 |
| Canal with width | 15907 |
| % of all with width | 93% |
| Percentage < 1 m | 2.8% |
| % ≤ 1 m | 13% |
| Percentage < 2 m | 15% |
| % ≤ 2 m | 29% |
| Percentage < 3 m | 30% |
| Percentage < 5 m | 52% |
| % ≤ 5 m | 61% |
| Percentage < 10 m | 74% |
| % width ≥ 10 m | 26.1% |
| Canals, special: % ≥ 1 m | 97.2% |
| Canals, special: % ≥ 5 m | 48.0% |
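The summary percentages can be reproduced from the per-width counts. This is a minimal sketch with the waterway=canal counts copied from the table above; labels of the form `n.*` stand for decimal values between n and n+1, and the helper names `bucket_key` and `percent_below` are hypothetical.

```python
# (label, count) rows for waterway=canal, copied from the table above.
CANAL_COUNTS = [
    (".*", 8), ("0.*", 443), ("1", 1605), ("1.*", 338), ("2", 2266), ("2.*", 105),
    ("3", 2116), ("3.*", 110), ("4", 1248), ("4.*", 32), ("5", 1488), ("5.*", 142),
    ("6", 600), ("6.*", 30), ("7", 361), ("7.*", 21), ("8", 681), ("8.*", 19),
    ("9", 130), ("9.*", 20), ("10", 1364), ("10.0*", 4), ("11", 53), ("11.0*", 17),
    ("12", 290), ("13", 68), ("14", 79), ("15", 537), ("20", 245), ("25", 68),
    ("30", 124), ("50", 109), ("100", 16),
]
CANALS_WITH_WIDTH = 15907  # all canals with a width tag, including unchecked values


def bucket_key(label):
    """Map a bucket label to a sortable number: '3' -> 3.0, '3.*' -> 3.5, '.*' -> 0.5."""
    if label.endswith("*"):
        whole = label.split(".")[0]
        return (int(whole) if whole else 0) + 0.5
    return float(label)


def percent_below(limit, inclusive=False):
    """Percentage of canals with a width tag that are below (or up to) an integer limit."""
    matched = sum(
        count
        for label, count in CANAL_COUNTS
        if bucket_key(label) < limit or (inclusive and bucket_key(label) == limit)
    )
    return 100.0 * matched / CANALS_WITH_WIDTH


for limit in (1, 2, 3, 5, 10):
    print(f"< {limit} m: {percent_below(limit):.1f}%   <= {limit} m: {percent_below(limit, True):.1f}%")
```

The share of widths of 10 m or more is taken as 100 minus the "< 10 m" figure, so it also covers the roughly 7% of values that were not checked individually.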

It appears that if a canal or river is 4 to 6 meters wide, it gets measured as width=5 most of the time. Similarly, 9 and 11 meter wide canals are tagged width=10. I'm a little surprised that there are so many canals and rivers with width=8 and width=12, but perhaps these numbers are considered "round" enough?
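One rough way to put a number on this rounding effect is to compare each integer width with its two neighbours. This is a minimal sketch using the integer counts from the waterway=canal table above; the "heaping ratio" is an illustrative measure introduced here, not something used in the original analysis, and in a falling distribution a ratio of exactly 1 should not be expected even without rounding.

```python
# Counts of waterway=canal ways per integer width, copied from the table above.
CANAL_INTEGER_COUNTS = {
    3: 2116, 4: 1248, 5: 1488, 6: 600, 7: 361, 8: 681, 9: 130,
    10: 1364, 11: 53, 12: 290, 13: 68, 14: 79, 15: 537,
}


def heaping_ratio(width):
    """Count at `width` divided by the mean count of its two integer neighbours."""
    neighbours = (CANAL_INTEGER_COUNTS[width - 1] + CANAL_INTEGER_COUNTS[width + 1]) / 2
    return CANAL_INTEGER_COUNTS[width] / neighbours


for width in range(4, 15):
    print(width, round(heaping_ratio(width), 2))
```

Ratios well above 1 mark values that mappers prefer, and ratios well below 1 mark values they tend to round away from.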

Chart of "Percentage of type" column for all waterways (This is the percentage of each waterway type with a width tag that has the specific width):

Note how the numbers for river and canal are very close for all values over 4 meters. So the patterns are due to how mappers estimate the width to the nearest 1 or 2 meters (from 4 to 15 meters) or the nearest 5 meters (for values over 15): Waterways-3m-and-up-percentage-all

For completeness, here is all the data, including .* and 0.* meters (which are dominated by ditches and drains): All-waterways-all-widths-percentages

jeisenbe commented 5 years ago

If we assume that mappers are rounding to the nearest number that ends with 5 or 0 from 4m to 15m, and then to the nearest 10 meters after that, we can get this table and chart, where the rivers and canals follow a reasonable distribution with a single maximum at 2 to 10 meters:

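| Width | Stream | River | Drain | Ditch | Canal |
|---|---|---|---|---|---|
| <1 | 14% | 0.4% | 38% | 20% | 2.8% |
| 1* m | 32% | 5.1% | 36% | 43% | 12% |
| 2* m | 36% | 9.4% | 11% | 21% | 15% |
| 3* m | 7.3% | 14% | 5.0% | 5.7% | 14% |
| 4-7m | 5.3% | 22% | 4.4% | 5.9% | 25% |
| 8-11m | 1.6% | 13.9% | 0.75% | 1.43% | 14.4% |
| 11-14m | 0.2% | 3.3% | 0.24% | 0.26% | 3.2% |
| 15m | 0.1% | 3.7% | 0.14% | 0.18% | 3.4% |
| 20m | 0.1% | 5.7% | 0.06% | 0.03% | 1.5% |
| 30m | 0.0% | 3.5% | 0.14% | 0.09% | 0.78% |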

all-waterways-categorized-widths

So the apparent maxima at 2-3m, 5m and 10m are artifacts of mappers estimating many canals as "approximately 5 m" or "approximately 10 m".

Most likely the number of canals and rivers decreases gradually after a maximum at 2 to 3 meters (this is somewhat distorted by the non-linear x-axis in the chart above).

Overall the largest number of waterways are 1 m or less, steadily decreasing as width increases, as would be expected from how waterway networks are formed.

kocio-pl commented 5 years ago

Just some short remarks:

imagico commented 5 years ago

Thanks for the more elaborate numbers. These match my quick analysis quite closely. Note you can get a quick list of values with something like

grep "k=\"width\"" canals.osm | sed "s?.*v=\"??" | cut -d "\"" -f 1 | sort -n | uniq -c

once you have an OSM file with only the ways you want to look at.
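For anyone who prefers not to rely on the exact attribute layout of the XML, roughly the same tally can be made with a parser. This is a minimal sketch, assuming a standard OSM XML extract; the function name `count_width_values` and the inline sample data are hypothetical, and unlike the grep pipeline it only looks at `way` elements.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_width_values(osm_file):
    """Count width=* values on ways in an OSM XML file (path or binary file object)."""
    counts = Counter()
    for _, elem in ET.iterparse(osm_file, events=("end",)):
        if elem.tag == "way":
            for tag in elem.findall("tag"):
                if tag.get("k") == "width":
                    counts[tag.get("v")] += 1
            elem.clear()  # release children of processed ways to limit memory use
    return counts


# Tiny made-up sample standing in for a real canals.osm extract.
sample = io.BytesIO(b"""<osm version="0.6">
  <way id="1"><nd ref="1"/><nd ref="2"/><tag k="waterway" v="canal"/><tag k="width" v="5"/></way>
  <way id="2"><nd ref="2"/><nd ref="3"/><tag k="waterway" v="canal"/><tag k="width" v="5"/></way>
  <way id="3"><nd ref="3"/><nd ref="4"/><tag k="waterway" v="canal"/><tag k="width" v="2.5"/></way>
</osm>""")
width_counts = count_width_values(sample)
print(width_counts.most_common())
```

Passing a file path such as `canals.osm` instead of the in-memory sample works the same way.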

It appears that if a canal or river is 4 to 6 meters wide, it gets measured as width=5 most of the time. Similarly, 9 and 11 meter wide canals are tagged width=10. I'm a little surprised that there are so many canals and rivers with width=8 and width=12, but perhaps these numbers are considered "round" enough?

This is based on the assumptions that

Another reason why mappers might tag different width values with different likelihood could be our rendering rules. It seems a plausible hypothesis that mappers in particular tag width when the line width of the waterway in the map is off at certain zoom levels - as we have discussed for canals. The fact that, of all waterway types, canals have the largest fraction of features with a width tag speaks for this in a way.

Anyway - speculating in detail about the reasons for the existing distribution of values does not really bring us forward. That the current drawing width of waterway=canal is not suitable for the full range of features the tag is used for is quite clear. We have so far three suggestions how to address this:

jeisenbe commented 5 years ago

adjust the drawing width of waterways ... based on the specific width value tagged: ... this would be highly misleading and counterproductive especially at high zoom levels if not done based on ground units

I don't think it's a great idea to adjust the canal rendering precisely to the tagged width. Since most types of waterways have examples from 0.5m to 50 m, a rendering that mapped the aerial imagery at z19 would require a huge range of widths.

add a sub-classification for canals based on width tag or other tags either combining the small canals with the other small artificial waterways or creating a third class

I think this is the more reasonable idea. As you've suggested, the small canals seem to be a "different beast" than the large ones. I'd considered if OSM might even need a new tag for waterway=minor_canal (or something like waterway=aqueduct), but that's out of scope for this discussion.

A reasonable idea would be to render stream-sized canals (1 to 3 meters wide) at a thinner width than the average canal. We could also use the idea from #3354 to render untagged canals a little thinner too.

Ideally, it would be nice to have a subtle difference between canals, ditches and drains for mapper feedback at the higher zoom levels, if this can be managed

If this first PR works out, I would also be interested in introducing a thinner rendering for ditches, drains and canals under 1 meter in width: these can be stepped over, rather than jumped, so they are not nearly as significant to navigation as a 2 meter wide waterway.
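To make the proposed sub-classification concrete, here is a minimal sketch of how a tagged width could be mapped to a rendering class. Everything in it is an assumption for illustration: the class names, treating exactly 1 m and 3 m as "small", falling back to the default class for missing or unparseable values, and the function names. The style itself is written in CartoCSS and SQL, so this only illustrates the logic.

```python
import re

# Accepts plain numbers with an optional meter unit, e.g. "2", "2.5", ".8", "3 m".
_WIDTH_RE = re.compile(r"^\s*(\d+(?:\.\d+)?|\.\d+)\s*(?:m|meters?|metres?)?\s*$", re.IGNORECASE)


def parse_width_m(value):
    """Parse a width=* value given in meters; return None if missing or unparseable."""
    if value is None:
        return None
    match = _WIDTH_RE.match(value)
    return float(match.group(1)) if match else None


def canal_width_class(width_value):
    """Return a hypothetical rendering class for a canal based on its width tag."""
    width = parse_width_m(width_value)
    if width is None:
        return "default"  # assumption: untagged canals keep the current rendering
    if width < 1:
        return "minor"    # can be stepped over; thinnest line
    if width <= 3:
        return "small"    # stream-sized canals; thinner than the average canal
    return "default"


for value in ("0.5", "2", "3 m", "10", None, "6-17m"):
    print(value, "->", canal_width_class(value))
```

A range value such as width=6-17m is not parsed here and simply falls back to the default class.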

And perhaps even very thin streams could be shown thinner (though most of these are intermittent=yes which already is distinguished, this depends on climate).

But I'm trying to feel out whether some test code and images of this concept are worthwhile.

imagico commented 5 years ago

Since most types of waterways have examples from 0.5m to 50 m, a rendering that mapped the aerial imagery at z19 would require a huge range of widths.

Yes, that would be the general idea of this approach. So far this style has no specific design concept for the very high zoom levels beyond the extrapolation of what is being done at the lower zoom levels. There are very different cartographic approaches you could take at these scales and drawing waterways in their physical width is definitely a possibility. If we can and should do that given the constraints we have is a different question of course.

RicozOSM commented 5 years ago

Thanks jeisenbe for your analysis. One method to "postprocess" overpass queries that I use frequently is to analyze them in JOSM - the search function there allows numeric comparison, easy counting of way segments and downloading context information.

RicozOSM commented 5 years ago

@kocio-pl: regarding the popularity of width, I suspect that missing renderer support for it also plays a role. I also agree that tagging the area is a very good alternative for rivers and streams, but there is a considerable percentage of man-straightened rivers and streams which have an exactly constant width over a considerable length. Also, many smaller rivers and streams would get at least a halfway decent rendering in places where good aerial imagery isn't available.

Adamant36 commented 5 years ago

I don't know if rendering areas for rivers is always a solid thing to do because of erosion. It's impossible to tell what the true area of something is when it's always shifting. There should really be something like an "outer bank" tag and the area should be reserved for the true middle of the river, but even that can change. It's not as easy to adjust once it does if it's mapped as an area either.

RicozOSM commented 5 years ago

@Adamant36: not only erosion but some mountain streams look drastically different depending on season. I am working on a riverbed draft - https://wiki.openstreetmap.org/wiki/User:RicoZ/waterway%3Driverbed - but in some cases we must simply acknowledge that we can only map such features with very limited accuracy. Width won't be any better in such cases but also not worse than painting a fake geometry. Hm.. how about something like width=6-17m ?

RicozOSM commented 5 years ago

Apparently JOSM fixed this many years ago - https://josm.openstreetmap.de/ticket/6458

Adamant36 commented 5 years ago

not only erosion but some mountain streams look drastically different depending on season. I am working on a riverbed draft

Good point. I'll check out the draft. There's a big need for something like a riverbed tag. In America, NHD imports use a wash tag (I guess they are called arroyos in other places) to designate the river area of a riverbed. I don't know if it's exactly the correct usage though. But "riverbed" gives the impression it only applies to rivers, when streams can sometimes have the same problem. A riverbed tag would still be an improvement though. Oh yeah, there was a discussion a while back about using wadi for this. I can't remember where now though. Anyway, I don't want to get into a tagging discussion. Is there a draft page for it somewhere? I couldn't find one on the page or in the search. Maybe I just missed it.

in some cases we must simply acknowledge that we can only map such features with very limited accuracy. Width won't be any better in such cases but also not worse than painting a fake geometry.

I agree with that. Neither one is a great solution. Does anyone know if the ruler on the website is an accurate way to determine width? If not, encouraging people to use it by having width rendering might not be a good idea. I'm not sure how people are supposed to determine a waterway's width any other way.

RicozOSM commented 5 years ago

@Adamant36: the draft is "hidden" in my userspace - https://wiki.openstreetmap.org/wiki/User:RicoZ/waterway%3Driverbed . Getting that definition right is not trivial.

jragusa commented 5 years ago

The erosion of the river bed is not a good thing to consider. The path of a meandering river evolves through time due to a combination of deposition and erosion stages in each meander. It's the same problem with braided rivers. The geographic distribution of sand bars also moves through time, especially after a flood, and then affects the path of the river. And I'm not talking about anastomosed rivers in very flat areas, which can easily create new channels. So considering erosion to assess the width of a river is an endless problem, especially in alluvial plains.