Road Segment Prioritization for Bicycle Infrastructure

Hussein Mahfouz 18/8/2020

This repository contains the code used for the dissertation of my MSc in Smart Cities and Urban Analytics at CASA UCL. Below is an explanation of the scripts used, and how the analysis can be reproduced.

Paper

The paper can be found in the repo, or through this link

Missing Data

There are a couple of files that cannot be synced to GitHub due to their size. These files are necessary for the scripts to run. Below are links to where you can download them, and instructions on where to place them in the repo file structure.

Flow Data (2011 Census Origin-Destination Data):

Source: https://www.nomisweb.co.uk/census/2011/bulk/rOD1 -> Choose File “WU03EW”
Location in Repo: data-raw/flow_data.csv

Middle Layer Super Output Areas (December 2011) Boundaries:

Source: http://geoportal.statistics.gov.uk/datasets/826dc85fb600440889480f4d9dbb1a24_0
Location in Repo: data-raw/MSOA_2011_Boundaries/[Add files here]

Scripts

The scripts should be run in the order in which they are numbered (which is also the order in which they are listed here). The only exception is _x_dodgr_weighting_profiles.R.

__1.0_get_flow_data.R

This script matches MSOAs to major towns and cities using data from here. It then matches the results to the Census flow data so that all OD pairs have an Origin City and Destination City (MSOAs in rural areas are not matched to a city).

In this script, you choose the city you wish to run the analysis on from this list of available towns and cities:

##   [1] "Barnsley"             "Basildon"             "Basingstoke"         
##   [4] "Bath"                 "Bedford"              "Birkenhead"          
##   [7] "Birmingham"           "Blackburn"            "Blackpool"           
##  [10] "Bolton"               "Bournemouth"          "Bracknell"           
##  [13] "Bradford"             "Brighton and Hove"    "Bristol"             
##  [16] "Burnley"              "Burton upon Trent"    "Bury"                
##  [19] "Cambridge"            "Cardiff"              "Carlisle"            
##  [22] "Chatham"              "Chelmsford"           "Cheltenham"          
##  [25] "Chester"              "Chesterfield"         "Colchester"          
##  [28] "Coventry"             "Crawley"              "Darlington"          
##  [31] "Derby"                "Doncaster"            "Dudley"              
##  [34] "Eastbourne"           "Exeter"               "Gateshead"           
##  [37] "Gillingham"           "Gloucester"           "Grimsby"             
##  [40] "Guildford"            "Halifax"              "Harlow"              
##  [43] "Harrogate"            "Hartlepool"           "Hastings"            
##  [46] "Hemel Hempstead"      "High Wycombe"         "Huddersfield"        
##  [49] "Ipswich"              "Kingston upon Hull"   "Leeds"               
##  [52] "Leicester"            "Lincoln"              "Liverpool"           
##  [55] "London"               "Luton"                "Maidstone"           
##  [58] "Manchester"           "Mansfield"            "Middlesbrough"       
##  [61] "Milton Keynes"        "Newcastle upon Tyne"  "Newcastle-under-Lyme"
##  [64] "Newport"              "Northampton"          "Norwich"             
##  [67] "Nottingham"           "Nuneaton"             "Oldham"              
##  [70] "Oxford"               "Peterborough"         "Plymouth"            
##  [73] "Poole"                "Portsmouth"           "Preston"             
##  [76] "Reading"              "Redditch"             "Rochdale"            
##  [79] "Rotherham"            "Salford"              "Scunthorpe"          
##  [82] "Sheffield"            "Shrewsbury"           "Slough"              
##  [85] "Solihull"             "South Shields"        "Southampton"         
##  [88] "Southend-on-Sea"      "Southport"            "St Albans"           
##  [91] "St Helens"            "Stevenage"            "Stockport"           
##  [94] "Stockton-on-Tees"     "Stoke-on-Trent"       "Sunderland"          
##  [97] "Sutton Coldfield"     "Swansea"              "Swindon"             
## [100] "Telford"              "Wakefield"            "Walsall"             
## [103] "Warrington"           "Watford"              "West Bromwich"       
## [106] "Weston-Super-Mare"    "Wigan"                "Woking"              
## [109] "Wolverhampton"        "Worcester"            "Worthing"            
## [112] "York"

This is done in line 17. For example:

chosen_city <- "Manchester"

The script then keeps only the flows where both the origin MSOA and the destination MSOA are in the chosen city.
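A minimal sketch of this filtering step in base R is shown here. The data frame, its column names, and the MSOA codes are made up for illustration; the script's actual object and column names may differ.

```r
# Toy OD table (hypothetical column names and MSOA codes)
flows <- data.frame(
  origin_msoa      = c("M1", "M2", "L1", "M1"),
  destination_msoa = c("M2", "L1", "M1", "R1"),
  origin_city      = c("Manchester", "Manchester", "London", "Manchester"),
  destination_city = c("Manchester", "London", "Manchester", NA),
  commuters        = c(120, 15, 8, 40),
  stringsAsFactors = FALSE
)

chosen_city <- "Manchester"

# Keep rows where BOTH ends of the OD pair are in the chosen city.
# %in% returns FALSE for NA, so MSOAs not matched to any city are dropped.
city_flows <- flows[flows$origin_city %in% chosen_city &
                      flows$destination_city %in% chosen_city, ]
```

Using `%in%` rather than `==` avoids rows of `NA` appearing in the result when one end of the pair is a rural MSOA with no city.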

If you wish to run the analysis on London, then make sure you have a computer that is up to the task. I didn’t :(

__2.0_distance_and_elevation.R

This script is used to get the distance and slope between each OD pair

Distance: The routed distance using the dodgr package.
Slope: The average slope along the route separating the OD pair, using the slopes package

__3.0_potential_demand.R

This script is used to estimate where additional cycling demand will come from. Say the target for Manchester is a 10% increase in cycling mode share: how many additional cyclists do we assign to each OD pair to reach that target mode share?

Predict Probability of Cycling Between Each OD Pair Based On Geography
1. A glm is used to predict probability of cycling based on distance and slope
Accounting for Existing Mode Share
1. Look at performance of each OD pair and assign additional cyclists accordingly. OD pairs that have a low cycling mode share are allocated more cyclists than OD pairs that already have a high cycling mode share. This is because OD pairs with low cycling mode share have more potential (latent demand) than OD pairs with high cycling mode share
Scaling Results To Match Mode Share Target
1. Specify target mode share increase (default is 10% but this is unreasonable for a city like Cambridge that already has a cycling mode share of 40%)
2. Scale potential cycling demand up or down so that it matches the target % increase
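The three steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the script's actual code: the toy data, the column names, the binomial family, and the weighting formula `expected / (1 + ratio)` are all hypothetical, and the 10% target is interpreted here as percentage points of all commuters.

```r
# Toy OD data (hypothetical values and column names)
od <- data.frame(
  dist_km = c(1, 2, 3, 5, 8, 12, 15, 20),
  slope   = c(0.5, 1.0, 0.8, 1.5, 2.0, 1.2, 2.5, 3.0),
  total   = c(200, 300, 250, 400, 350, 150, 100, 80),
  cycling = c(10, 30, 25, 32, 14, 3, 1, 0)
)

# Step 1: glm for the probability of cycling from distance and slope.
# A binomial family on (cyclists, non-cyclists) counts is an assumption.
fit <- glm(cbind(cycling, total - cycling) ~ dist_km + sqrt(dist_km) + slope,
           family = binomial, data = od)
od$p_cycle  <- predict(fit, type = "response")
od$expected <- od$p_cycle * od$total

# Step 2: account for existing mode share. OD pairs cycling below their
# expected level (ratio < 1) get a larger weight. Illustrative formula only.
ratio    <- od$cycling / od$expected
unscaled <- od$expected / (1 + ratio)

# Step 3: scale so the additional cyclists sum to the target increase
target_increase  <- 0.10
additional_total <- target_increase * sum(od$total)
od$additional    <- unscaled * additional_total / sum(unscaled)
```

Because the last step is a simple rescaling, changing `target_increase` (for example, lowering it for a city that already has a high cycling mode share) changes the totals but not the relative allocation between OD pairs.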

__3.1_plot_mode_shares.R, __3.2_plot_od_comparison.R, __3.3_plot_desire_lines_current_vs_potential.R

These three scripts plot the results of __3.0_potential_demand.R.

Compare the distance distribution of existing cycling mode share and potential cycling mode share:

Visualize Existing and Potential Cycling Flow as Desire Lines

Figure: desire lines

Examine where potential cycling demand is assigned.

The methodology in __3.0_potential_demand.R ensures that OD pairs that have a low cycling mode share are allocated more cyclists than OD pairs that already have a high cycling mode share. In the figure below, the x axis is the ratio of the cycling mode share of the OD pair to its expected cycling mode share. The expected cycling mode share is obtained from a glm where distance, sqrt(distance), and slope are used as predictors. Looking at the resulting cycling mode share, we see that OD pairs between 2 and 8 km have the highest mode share (consistent with the bell-shaped distribution of cycling against distance), and that the mode share increase is highest for OD pairs that have lower than expected cycling mode shares.

__x_dodgr_weighting_profiles.R

The dodgr package is used to route the cycling demand (flow) onto the road network. This is done using different weighting profiles, as explained in the documentation of the package. This script is used to download a json file of the weight profile and edit the ‘bicycle’ entries. Weights are assigned to all OSM road types (for example, assigning a road type a weight of 0 ensures that no cycling routes use it). The weighting profiles used are explained in the methodology.

The weighting profiles used in the analysis are in the data folder of the repo. These are:

* weight_profile_unweighted.json: unweighted shortest paths
* weight_profile_weighted.json: weighted shortest paths (weighting profile explained in methodology)
* weight_profile_no_primary_trunk.json: weighted shortest paths with cycling banned on primary and trunk roads
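As a rough illustration of what banning cycling on a road type means, the sketch below edits a small table of weights in base R. The table only loosely mimics a weighting profile (one row per mode and road type); its columns, its values, and the helper `ban_ways()` are hypothetical and are not taken from dodgr or from the script.

```r
# Toy weighting table: one row per (mode, OSM road type). Values are made up.
profile <- data.frame(
  name  = "bicycle",
  way   = c("cycleway", "residential", "primary", "trunk", "motorway"),
  value = c(1.0, 0.9, 0.6, 0.4, 0.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: set the weight of the given road types to 0 for one
# mode, so that no route for that mode can use them.
ban_ways <- function(profile, ways, mode = "bicycle") {
  hit <- profile$name == mode & profile$way %in% ways
  profile$value[hit] <- 0
  profile
}

no_primary_trunk <- ban_ways(profile, c("primary", "trunk"))
```

The original `profile` is left unchanged, which makes it easy to keep several variants (weighted, no primary or trunk roads, and so on) side by side.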

These weighting profiles are used in script __4.0_aggregating_flows.R

__4.0_aggregating_flows.R

This script uses the dodgr package to route potential cycling demand onto the road network. This is done for each of the weighting profiles used in the analysis.

__5.0_identifying_cycle_infastructure_from_osm_tags.R

This script is used to identify all road segments that have segregated cycling lanes. This includes all roads that match any of the following three tags:

highway = cycleway
cycleway = track
bicycle = designated
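A minimal sketch of this tag filter in base R, assuming the OSM tags are available as columns of a data frame. The toy data and column layout are illustrative and may not match how the script stores the network.

```r
# Toy road table with OSM tags as columns (NA = tag not present)
roads <- data.frame(
  osm_id   = 1:5,
  highway  = c("cycleway", "primary", "residential", "path", "tertiary"),
  cycleway = c(NA, "track", "lane", NA, NA),
  bicycle  = c(NA, NA, "yes", "designated", NA),
  stringsAsFactors = FALSE
)

# A segment counts as segregated if it matches ANY of the three tags.
# %in% treats a missing tag (NA) as "no match" instead of propagating NA.
is_segregated <- roads$highway %in% "cycleway" |
  roads$cycleway %in% "track" |
  roads$bicycle %in% "designated"

segregated <- roads[is_segregated, ]
```

Note that a painted lane (`cycleway = lane`) is not selected, consistent with the focus on segregated infrastructure.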

__6.0_comparing_weighting_profiles.R

Here we analyze the street network configuration of the city by comparing the unweighted shortest paths to the weighted shortest paths (check methodology for explanation of weighted shortest paths). The aggregated flow shows us which road types are used, and it is clear that cycleways are not utilized unless the road network is weighted to create a hierarchy of road type preference.

Figure: Unweighted Routing

Figure: Weighted Routing

__7.0_community_detection.R

Using the potential cycling demand between OD pairs, we are able to define communities in the network. The nodes are the population-weighted MSOA centroids (location obtained from the pct package) and the links between them are weighted by the potential cycling demand between them. The Louvain algorithm is used to assign each MSOA centroid to a community, and then each road segment on the network is assigned to the same community as the MSOA centroid closest to it. The results for Manchester are shown below.

Figure: Community Detection
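The second half of this step, giving each road segment the community of its nearest MSOA centroid, can be sketched in base R as below. The community labels are taken as given (the Louvain step itself is not shown), and the coordinates are made-up planar values; the sketch assumes projected coordinates where Euclidean distance is meaningful.

```r
# Toy centroids with an already-assigned community (hypothetical values)
centroids <- data.frame(
  msoa      = c("A", "B", "C"),
  x         = c(0, 10, 0),
  y         = c(0, 0, 10),
  community = c(1, 2, 1),
  stringsAsFactors = FALSE
)

# Toy road segments, represented here by a single point each
segments <- data.frame(id = 1:4, x = c(1, 9, 2, 6), y = c(1, 1, 8, 0))

# For each segment, find the index of the closest centroid
nearest <- vapply(seq_len(nrow(segments)), function(i) {
  d2 <- (centroids$x - segments$x[i])^2 + (centroids$y - segments$y[i])^2
  which.min(d2)  # squared distance is enough to find the minimum
}, integer(1))

# Segments inherit the community of their nearest centroid
segments$community <- centroids$community[nearest]
```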

__8.0_growing_a_network.R

This script contains all the functions for prioritizing road segments for dedicated infrastructure. It is necessary to run this script before 8.1 and 8.2. The running time of the functions increases with the size of the city being analyzed (Script 8.2 takes almost 2 hours for Birmingham on a 2.7 GHz Intel Core i5 laptop with 8GB of RAM).

__8.1_plot_network_growth.R

Here we obtain results for the utilitarian growth functions (Algorithms 1 and 2 in the paper)

Algorithm 1: Growth from One Origin

Logic:

1. Identify link with highest flow and use it as a starting point for the solution
2. Identify all links that neighbor links in the current solution
3. Select neighboring link with highest flow and add it to the solution
4. Repeat steps 2 & 3 until all flow is satisfied or investment threshold is met
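The logic above can be sketched in base R as a greedy loop. This is a simplified illustration, not the function from __8.0_growing_a_network.R: the function name, the column names, and the treatment of the investment threshold as a length budget in km are assumptions, and two links are taken to be neighbors when they share a node.

```r
# Hypothetical sketch of growth from one origin.
# links: data frame with columns id, from, to, flow, length_km
grow_from_one_origin <- function(links, budget_km = Inf) {
  in_solution <- rep(FALSE, nrow(links))

  # Step 1: start from the link with the highest flow
  first <- which.max(links$flow)
  in_solution[first] <- TRUE
  order_added <- first
  built_km <- links$length_km[first]

  repeat {
    # Step 2: links that share a node with the current solution
    nodes <- unique(c(links$from[in_solution], links$to[in_solution]))
    candidates <- which(!in_solution &
                          (links$from %in% nodes | links$to %in% nodes))
    if (length(candidates) == 0) break  # nothing left to connect

    # Step 3: neighboring link with the highest flow
    best <- candidates[which.max(links$flow[candidates])]
    if (built_km + links$length_km[best] > budget_km) break  # threshold met

    in_solution[best] <- TRUE
    order_added <- c(order_added, best)
    built_km <- built_km + links$length_km[best]
  }
  links$id[order_added]  # link ids in order of priority
}

# Toy network
links <- data.frame(
  id        = 1:5,
  from      = c("a", "b", "c", "b", "e"),
  to        = c("b", "c", "d", "e", "f"),
  flow      = c(50, 90, 20, 60, 80),
  length_km = 1,
  stringsAsFactors = FALSE
)
priority <- grow_from_one_origin(links)
```

The returned vector is the order in which links would be built, so the solution stays connected at every step by construction.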

Results:

Algorithm 2 (Utilitarian Growth)

Logic:

1. Identify all links that have dedicated cycling infrastructure and add them to the initial solution
2. Identify all links that neighbor links in the current solution
3. Select neighboring link with highest flow and add it to the solution
4. Repeat steps 2 & 3 until all flow is satisfied or investment threshold is met

Results:

The results show the priority of each road segment (roads are grouped into 100km groups for visualization purposes).

__8.2_plot_network_growth_community.R

Here we obtain results for the egalitarian growth function (Algorithm 3 in the paper). We also compare the connectivity of the networks proposed by Algorithms 2 and 3.

Algorithm 3 (Egalitarian Growth)

Logic:

1. Identify all links that have dedicated cycling infrastructure and add them to the initial solution
2. Identify all links that neighbor links in the current solution
3. Select from each community one neighboring link with highest flow and add it to the solution
4. If there are no more neighboring links in a community, select the link with the highest flow in the community, regardless of connectivity, and add it to the solution
5. Repeat steps 2, 3 & 4 until all flow is satisfied or investment threshold is met
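A base R sketch of the egalitarian logic above is given here. It is a simplified illustration rather than the function from __8.0_growing_a_network.R: the function and column names are hypothetical, neighbors are links sharing a node with the solution, neighbors are recomputed once per round, and the investment threshold is assumed to be a length budget that counts newly added links only.

```r
# Hypothetical sketch of community-based growth.
# links: data frame with columns id, from, to, flow, length_km,
#        community, has_infra (logical)
grow_egalitarian <- function(links, budget_km = Inf) {
  in_solution <- links$has_infra  # Step 1: existing infrastructure
  order_added <- integer(0)
  built_km <- 0

  repeat {
    if (all(in_solution)) break

    # Step 2: links that share a node with the current solution
    nodes <- unique(c(links$from[in_solution], links$to[in_solution]))
    is_neighbor <- !in_solution &
      (links$from %in% nodes | links$to %in% nodes)

    picks <- integer(0)
    for (comm in sort(unique(links$community))) {
      remaining <- links$community == comm & !in_solution
      cand <- which(remaining & is_neighbor)            # Step 3
      if (length(cand) == 0) cand <- which(remaining)   # Step 4: fallback
      if (length(cand) == 0) next                       # community finished
      picks <- c(picks, cand[which.max(links$flow[cand])])
    }

    # Add one link per community, stopping if the budget would be exceeded
    for (p in picks) {
      if (built_km + links$length_km[p] > budget_km) {
        return(links$id[order_added])
      }
      in_solution[p] <- TRUE
      order_added <- c(order_added, p)
      built_km <- built_km + links$length_km[p]
    }
  }
  links$id[order_added]
}

# Toy network with two communities
links <- data.frame(
  id        = 1:6,
  from      = c("a", "b", "c", "e", "f", "d"),
  to        = c("b", "c", "d", "f", "g", "e"),
  flow      = c(50, 90, 20, 70, 30, 10),
  length_km = 1,
  community = c(1, 1, 1, 2, 2, 2),
  has_infra = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)
priority <- grow_egalitarian(links)
```

In the toy network, community 2 has no link touching the existing infrastructure in the first round, so the fallback in step 4 selects its highest-flow link even though it is disconnected from the rest of the solution.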

Results:

Priority of each road segment, and utilization of different OSM road types:

Comparing Connectivity of Algorithm 2 and 3

We check the number of connected components and the size of the largest connected component as road segments are added to the solution (the components are made up of road segments). The initial number of components depends on the existing bicycle network of the city. For Manchester, we can see that the existing bicycle network has over 120 disconnected components (remember that we are only looking at segregated bicycle infrastructure, not painted bicycle lanes).
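The two connectivity measures can be computed with a small union-find routine in base R, sketched below. The function name is hypothetical, segments are treated as connected when they share a node, and the size of a component is measured as its number of segments (measuring by length would be an equally plausible choice).

```r
# Hypothetical helper: connectivity of a set of road segments,
# each given by its two end nodes
component_stats <- function(from, to) {
  nodes  <- unique(c(from, to))
  parent <- seq_along(nodes)  # each node starts as its own root

  # Follow parent pointers up to the root of node i
  find <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  a <- match(from, nodes)
  b <- match(to, nodes)
  for (k in seq_along(a)) {
    ra <- find(a[k])
    rb <- find(b[k])
    if (ra != rb) parent[rb] <- ra  # merge the two components
  }

  # Component of each segment = root of one of its end nodes
  roots <- vapply(a, find, integer(1))
  sizes <- table(roots)
  list(n_components = length(sizes), largest = as.integer(max(sizes)))
}

# Toy solution: three disconnected pieces
from <- c("a", "b", "d", "f", "g", "h")
to   <- c("b", "c", "e", "g", "h", "i")
before <- component_stats(from, to)

# Adding segment c-d joins two of the pieces
after <- component_stats(c(from, "c"), c(to, "d"))
```

Calling the helper after each segment is added gives the two curves (number of components and largest component size) against the length of network built.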

The algorithms seem to provide comparable connectivity gains.
