Robinlovelace closed this issue 4 years ago.
The new gradient_segment column is in there now:
# Aim: test gradient calculations in CycleStreets
library(cyclestreets)
from = tmaptools::geocode_OSM("potternewton park")
to = tmaptools::geocode_OSM("university of leeds")
r = journey(from$coords, to$coords, cols = NULL, cols_extra = NULL)
mapview::mapview(r["gradient_segment"])
Created on 2020-04-17 by the reprex package (v0.3.0)
Problem: short segments still have an unwanted impact on the results, meaning they need to be smoothed to not distort the results. That can be a post-processing stage.
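To sketch the post-processing idea (this is only an illustration in base R, not the method later adopted in journey()), a centred rolling mean over segment gradients damps isolated spikes:

```r
# Hypothetical illustration: smooth per-segment gradients with a
# centred 3-segment rolling mean using base R only.
gradients = c(0.02, 0.03, 0.11, 0.02, 0.01)  # 0.11 is a suspect spike

# stats::filter() with equal weights gives a centred moving average;
# the first and last values have no full window, so they come back NA.
smoothed = as.numeric(stats::filter(gradients, rep(1/3, 3)))
smoothed[is.na(smoothed)] = gradients[is.na(smoothed)]  # keep ends unchanged

round(smoothed, 4)
#> [1] 0.0200 0.0533 0.0533 0.0467 0.0100
```

The spike at position 3 is pulled down from 0.11 to ~0.053 while the gentle segments barely move.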
Heads-up @joeytalbot here's the result with the smoothing method in there:
# Aim: test gradient calculations in CycleStreets
library(cyclestreets)
from = tmaptools::geocode_OSM("potternewton park")
to = tmaptools::geocode_OSM("university of leeds")
r = journey(from$coords, to$coords, cols = NULL, cols_extra = NULL)
mapview::mapview(r["gradient_segment"])
# smooth unwanted high gradients
summary(r$distances)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 9.0 48.5 122.0 197.1 186.2 771.0
summary(r$gradient_segment)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.000000 0.007194 0.019018 0.026387 0.037812 0.111111
plot(r$distances, r$gradient_segment)
distance_cutoff = 20
gradient_cutoff = 0.1
sel = r$gradient_segment > gradient_cutoff &
  r$distances <= distance_cutoff
summary(sel)
#> Mode FALSE TRUE
#> logical 17 1
r$gradient_segment_smooth = stplanr::route_rolling_average(r$gradient_segment)
r$gradient_segment[sel]
#> [1] 0.1111111
r$gradient_segment_smooth[sel]
#> [1] 0.05731033
r$gradient_segment[sel] = r$gradient_segment_smooth[sel]
plot(r$distances, r$gradient_segment)
mapview::mapview(r["gradient_segment"])
Created on 2020-04-17 by the reprex package (v0.3.0)
There should already be smoothing at the API data result end. If you have specific cases where things seem odd, @si-the-pie would like to know.
Thanks @mvl22, it seems the vertical resolution of your data is 1m. If you have segments of 10m or less, obviously this will lead to some segments with high gradients due to rounding up or down. See what I mean? The smoothed result below matches with local knowledge, it's my cycle to work!
BTW it's only smoothing 1 anomalous section that has a distance of 9m. So it's not really smoothing, it's identifying and fixing anomalies.
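To make the rounding point concrete (illustrative numbers, not taken from the API): with elevations reported to the nearest metre, a 9 m segment whose true rise is well under a metre can come back with a reported rise of 1 m, which is exactly the anomalous 1/9 = 0.111 gradient seen above:

```r
# Illustrative only: how 1 m vertical resolution distorts short segments
segment_length = 9       # metres, as in the anomalous segment above
true_rise = 0.6          # metres (hypothetical true value)
reported_rise = round(true_rise)  # elevations are rounded to whole metres

true_rise / segment_length      # the underlying gradient
#> [1] 0.06666667
reported_rise / segment_length  # the gradient implied by the rounded data
#> [1] 0.1111111
```

On a 500 m segment the same 1 m rounding error would shift the gradient by only 0.002, which is why only short segments need fixing.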
Heads-up @mvl22 and @joeytalbot, here is the resulting simple function in action. I plan to add this to the journey() function as an optional argument smooth_gradient, returning a new column to avoid data loss.
# Aim: test gradient calculations in CycleStreets
library(cyclestreets)
from = tmaptools::geocode_OSM("potternewton park")
to = tmaptools::geocode_OSM("university of leeds")
r = journey(from$coords, to$coords, cols = NULL, cols_extra = NULL)
mapview::mapview(r["gradient_segment"])
# smooth unwanted high gradients
summary(r$distances)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 9.0 48.5 122.0 197.1 186.2 771.0
summary(r$gradient_segment)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.000000 0.007194 0.019018 0.026387 0.037812 0.111111
plot(r$distances, r$gradient_segment)
distance_cutoff = 20
gradient_cutoff = 0.1
sel = r$gradient_segment > gradient_cutoff &
  r$distances <= distance_cutoff
summary(sel)
#> Mode FALSE TRUE
#> logical 17 1
r$gradient_segment_smooth = stplanr::route_rolling_average(r$gradient_segment)
r$gradient_segment[sel]
#> [1] 0.1111111
r$gradient_segment_smooth[sel]
#> [1] 0.05731033
# r$gradient_segment[sel] = r$gradient_segment_smooth[sel]
plot(r$distances, r$gradient_segment)
mapview::mapview(r["gradient_segment"])
smooth_with_cutoffs = function(
  gradient_segment,
  distances,
  distance_cutoff = 20,
  gradient_cutoff = 0.1
) {
  # identify short segments with anomalously high gradients
  sel = gradient_segment > gradient_cutoff &
    distances <= distance_cutoff
  # replace them with the rolling average of neighbouring gradients
  gradient_segment_smooth = stplanr::route_rolling_average(gradient_segment)
  gradient_segment[sel] = gradient_segment_smooth[sel]
  gradient_segment
}
r$gradient_segment
#> [1] 0.051282051 0.015564202 0.005128205 0.007797271 0.006993007 0.057142857
#> [7] 0.029569892 0.111111111 0.031250000 0.030769231 0.022471910 0.000000000
#> [13] 0.014388489 0.040000000 0.040000000 0.011494253 0.000000000 0.000000000
smooth_with_cutoffs(gradient_segment = r$gradient_segment, distances = r$distances)
#> [1] 0.051282051 0.015564202 0.005128205 0.007797271 0.006993007 0.057142857
#> [7] 0.029569892 0.057310335 0.031250000 0.030769231 0.022471910 0.000000000
#> [13] 0.014388489 0.040000000 0.040000000 0.011494253 0.000000000 0.000000000
Created on 2020-04-17 by the reprex package (v0.3.0)
This is the line to watch
#> [7] 0.029569892 0.111111111 0.031250000 0.030769231 0.022471910 0.000000000
Heads-up @joeytalbot and @mvl22, latest version has the following example:
library(cyclestreets)
r7 = journey(c(-1.524, 53.819), c(-1.556, 53.806), smooth_gradient = TRUE)
plot(r7["gradient_segment"])
plot(r7["gradient_smooth"])
Created on 2020-04-17 by the reprex package (v0.3.0)
Yes the smoothing makes a definite improvement @Robinlovelace
I'd suggest renaming or adding comments explaining n and lag, to make it clear that in these functions n refers to distances and lag refers to elevations.
With the existing approach, as I understand it, the change in elevation across a segment is calculated as the difference between the mean elevation of that segment and the mean elevation of the following segment. This means the natural option for every single segment (not just the ones with outlying gradients) is lag = 1 and n = 2. Since lag = 1 already calculates the change in mean elevation between a segment and the following one, it is only right for the distance to be calculated similarly, as the mean of the lengths of that segment and the following one.
n=2 and lag=1 are essentially both referring to the same thing (for distances and elevations respectively).
Of course, if we adapt the way elevation is calculated so the functions use the difference between elevation at the start of a segment and elevation at its end, the meaning of lag will change again.
I'd suggest renaming or adding comments explaining n and lag to make it clear that in these functions n refers to distances and lag refers to elevations.
The latest approach does not use n and lag. route_rolling_average uses a different function that has only one parameter: https://docs.ropensci.org/stplanr/reference/index.html#section-work-with-and-analyse-routes
I agree in general with the suggestion to change the parameter names. Pull requests to stplanr welcome, but don't think this is a priority.
Hi, I'm chipping in with the following comment, hope it's useful. Simon
Elevation data in CycleStreets comes from a variety of surveys, but in GB from Ordnance Survey Terrain 50, where the resolution of the pixels is about 36 metres by 61 metres in Leeds. The elevation of the pixel covering the bit of Woodhouse Lane between Cookridge Street and Clay Pit lane is 57.9 metres:
gdallocationinfo -valonly osTerrain50_wgs84.tiff 14087 12905
# Output
57.9000015258789
Within the CycleStreets routing algorithm bilinear interpolation of elevations from the neighbouring four pixels is used, and there we go to millimetre level of accuracy, and this helps to smooth over the pixellation.
But elevation values in the output routes are only given rounded to the nearest metre. There are no plans to change this as adding that extra detail is not normally needed.
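For reference, bilinear interpolation from the four neighbouring pixel centres can be sketched in a few lines of base R (illustrative values and a hypothetical helper, not the CycleStreets implementation):

```r
# Bilinear interpolation between four neighbouring pixel values.
# fx, fy give the fractional position of the query point within the
# grid cell (0 = left/bottom pixel centre, 1 = right/top).
bilinear = function(z00, z10, z01, z11, fx, fy) {
  bottom = z00 * (1 - fx) + z10 * fx  # interpolate along x at lower y
  top    = z01 * (1 - fx) + z11 * fx  # interpolate along x at upper y
  bottom * (1 - fy) + top * fy        # interpolate along y
}

# A point a quarter of the way across and halfway up the cell:
bilinear(z00 = 57, z10 = 59, z01 = 58, z11 = 62, fx = 0.25, fy = 0.5)
#> [1] 58.25
```

This is why interpolated elevations can carry sub-metre detail even when the source pixels hold whole-number values.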
Thanks for the explanation. Sounds like you calculate gradients internally. Any plans to release a 'gradient' output? For now the estimates from the 1m outputs are fine for our needs.
No, there are no plans on gradient, but maybe we should because it is used to create a map tile layer in CycleStreets like the following.
The idea of this rendering is that flat streets are almost invisible. A gray scale indicates increasing gradient: the bluey-gray becomes darker as the streets become steeper, up to 5%. From 5% lines are shown as blue, from 10% as amber, and above 20% as red. Note that anything above 5% is quite steep for many cyclists.
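The banding just described could be expressed as a small lookup (a sketch of the stated rules, not the actual CycleStreets tile code; the boundary handling at exactly 20% is an assumption):

```r
# Sketch of the gradient-to-colour banding described above
gradient_band = function(gradient) {
  cut(gradient,
      breaks = c(-Inf, 0.05, 0.10, 0.20, Inf),
      labels = c("gray", "blue", "amber", "red"),
      right = FALSE)  # "from 5%" etc. treated as inclusive lower bounds
}

gradient_band(c(0.01, 0.07, 0.15, 0.25))
#> [1] gray  blue  amber red
#> Levels: gray blue amber red
```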
An article I came across that may be of general interest: CTC CycleDigest #58, 2009.
If you're happy to reconsider this, I think it could be better to smooth the gradients differently.
Given that the segments we're smoothing are <= 20m in length, a 1 or 2m change in elevation across a segment can produce an anomalously high gradient. The latest method takes the mean of the gradient within the segment itself and the gradient of the two adjoining segments. Therefore this can still be heavily influenced by the high calculated gradient of a very short segment.
Instead we could take the mean distance and mean elevation change across the three nearest segments, and use these to calculate the gradient. See below for how it would change the Leeds example.
library(cyclestreets)
library(dplyr)
library(stplanr)
from = tmaptools::geocode_OSM("potternewton park")
to = tmaptools::geocode_OSM("university of leeds")
r = journey(from$coords, to$coords, smooth_gradient = TRUE)
r = r %>%
  mutate(elev_change = gradient_segment * distances,
         m_dist = route_rolling_average(distances),
         m_elev_c = route_rolling_average(elev_change)
  )
r$elev_change
r$m_dist
r$m_elev_c
r = r %>%
  mutate(gradient_smooth_method2 = ifelse(distances <= 20 & gradient_segment > 0.1,
                                          m_elev_c / m_dist, gradient_segment))
r$gradient_segment
r$gradient_smooth
r$gradient_smooth_method2
mapview::mapview(r["gradient_smooth"])
mapview::mapview(r["gradient_smooth_method2"])
And the existing method for comparison:
Agree, that looks better.
Re-opening this based on discussion here: https://github.com/Robinlovelace/cyclestreets/pull/17
I suggest we need an evidence-based assessment of gradient smoothing. We could add a function to stplanr that calculates gradients per segment based on the linestring and a raster dataset.
Demo of how this works written by @Nowosad : https://geocompr.robinlovelace.net/geometric-operations.html#raster-extraction
An example of something in our book that I'd never tested proving useful for our own research!
2 m resolution data associated with the study area:
An example where the routing goes over a big hill (I think) https://www.cyclestreets.net/journey/68884578/
In terms of getting all CS values in there, it's worth allowing the user to store the raw elevations I think, which could be extracted as follows:
library(tidyverse)
df <- tibble(
  x = 1:3,
  y = c("a", "d,e,f", "g,h")
)
df %>%
  mutate(y = strsplit(y, ",")) %>%
  unnest(y)
#> # A tibble: 6 x 2
#> x y
#> <int> <chr>
#> 1 1 a
#> 2 2 d
#> 3 2 e
#> 4 2 f
#> 5 3 g
#> 6 3 h
Created on 2020-04-24 by the reprex package (v0.3.0)
Heads-up @joeytalbot and @mvl22 I've gone back to the drawing board with this and have written a function that can calculate the gradients directly from the point data. Because the elevations are only provided to the nearest metre we will still have to think about smoothing these results. I guess a weighted mean is the way to go on this, but comparisons with 'ground truth' data are definitely still needed. As it happens, the new function will also help generate ground truth data.
New function: https://docs.ropensci.org/stplanr/reference/route_slope_matrix.html
Another question, mainly for @joeytalbot and @mem48, is it worth splitting these new slope functions out into a new package, e.g. called slopes?
Cc @si-the-pie. Robin it’s best to cc him as well as me.
Sounds like you want an elevations=floats option.
This smoothing stuff should really be upstream in the API generally.
Sounds like you want an elevations=floats option.
True that, I think it will improve the utility of CS results.
Thanks for the response in the PR @Nowosad, realise that bilinear interpolation is needed. I think GDAL can extract raster values from XY values, right? Similar question: https://gis.stackexchange.com/questions/269603/extract-raster-values-from-point-using-gdal
That's what we're doing in the source processing that leads to the API output. You'll be doing this twice over.
Yes but it provides accurate gradient estimates that we need for estimating cycling potential. I've just cracked it and can now provide accurate gradients for any number of linestrings based on bilinear interpolation of a raster DEM. Not yet optimised, but the code, which I plan to make available via stplanr, looks a bit like this:
rg3d = function(x, elevation = NULL) {
  m = sf::st_coordinates(x)
  x_sfc = sf::st_sfc(x)
  if (!is.null(elevation)) {
    # extract elevations from the raster DEM with bilinear interpolation
    e = raster::extract(elevation, sf::st_sf(sf::st_cast(x_sfc, "POINT")),
                        method = "bilinear")
  } else {
    # fall back to the Z coordinate of an XYZ linestring
    e = m[, 3]
  }
  g1 = stplanr::route_slope_matrix(m, e = e, lonlat = FALSE)
  d = stplanr::route_sequential_dist(m = m, lonlat = FALSE)
  # distance-weighted mean of absolute segment slopes
  weighted.mean(abs(g1), d, na.rm = TRUE)
}
route_gradient_3dline = function(r, e = NULL) {
  res_list =
    if (requireNamespace("pbapply", quietly = TRUE)) {
      pbapply::pblapply(sf::st_geometry(r), rg3d, elevation = e)
    } else {
      lapply(sf::st_geometry(r), rg3d, elevation = e)
    }
  unlist(res_list)
}
Have you explored shipping the data as 3D polylines? I think that is the way things are moving, and it makes extracting the gradient trivial.
I'm tracking progress on the general function here: https://github.com/ropensci/stplanr/issues/392
A package would be a good idea. I did a bunch of work on route profiling for optitruck, working out slopes and radius of curvature, that might be useful too.
Looks good @mem48 here is the function in stplanr that could also go in a new R package: https://docs.ropensci.org/stplanr/reference/route_slope_matrix.html
New package: https://itsleeds.github.io/slopes/
Update: this is now included in the latest version (not yet on CRAN):
library(cyclestreets)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stplanr)
from = tmaptools::geocode_OSM("potternewton park")
to = tmaptools::geocode_OSM("university of leeds")
r = journey(from$coords, to$coords, smooth_gradient = TRUE)
names(r)
#> [1] "name" "distances" "time"
#> [4] "busynance" "elevations" "start_longitude"
#> [7] "start_latitude" "finish_longitude" "finish_latitude"
#> [10] "gradient_segment" "elevation_change" "provisionName"
#> [13] "quietness" "quietness_segment" "geometry"
#> [16] "gradient_smooth"
r = r %>%
mutate(elev_change = gradient_segment*distances,
m_dist = route_rolling_average(distances),
m_elev_c = route_rolling_average(elevation_change)
)
r$elev_change
#> [1] 8 0 12 1 4 1 6 22 1 5 2 6 0 2 1 3 1 0 0
r$m_dist
#> [1] NA 310.00000 324.00000 493.33333 283.66667 253.33333 330.33333
#> [8] 286.00000 304.00000 77.66667 163.66667 122.66667 147.33333 66.66667
#> [15] 79.66667 62.33333 59.00000 48.33333 NA
r$m_elev_c
#> [1] NA 6.6666667 4.3333333 5.6666667 2.0000000 3.6666667 9.6666667
#> [8] 9.6666667 9.3333333 2.6666667 4.3333333 2.6666667 2.6666667 1.0000000
#> [15] 2.0000000 1.6666667 1.3333333 0.3333333 NA
r = r %>%
mutate(gradient_smooth_method2 = ifelse(distances <= 20 & gradient_segment > 0.1, m_elev_c/m_dist, gradient_segment))
plot(r$gradient_segment, r$gradient_smooth)
r$gradient_smooth_method2
#> [1] 0.051948052 0.000000000 0.015564202 0.005102041 0.007797271 0.007042254
#> [7] 0.057142857 0.029569892 0.030701754 0.031446541 0.030769231 0.022471910
#> [13] 0.000000000 0.014388489 0.040000000 0.040000000 0.011494253 0.000000000
#> [19] 0.000000000
plot(r$gradient_smooth, r$gradient_smooth_method2)
Created on 2020-09-25 by the reprex package (v0.3.0)
Closing because smoothed gradients are returned by default now, as shown above. Not perfect but good enough for now.
Opening this issue as a place to properly document efforts to generate accurate gradient estimates from cyclestreets elevation data.