ATFutures / upthat

Urban planning and transport health assessment tool
4 stars 1 forks source link

Spatial interaction approach #49

Open Robinlovelace opened 4 years ago

Robinlovelace commented 4 years ago

The latest version of the reproducible spatial interaction model (SIM) testing script, using only the Euclidean distance of the desire line and distance of origin and destination from the city as predictors, all of which can be calculated for any city when divided into an appropriate zoning system (more on that soon in relation to work with @mtennekes), can explain ~50% of the variability in commute numbers at the OD level based on example data from Bristol. Results of the actual data (left), unconstrained SIM (middle) and 'production constrained' SIM (right) shown below.

image

We can estimate mode shift under different scenarios at the OD level. The question is, how to integrate this with the SNA approach? Calculating the route for each OD pair for each mode of transport could give an indication of physical activity and, provided an air pollution exposure layer is available, air pollution, for all travellers travelling along each desire line.

An idea I'm keen on is treating 'working from home' as a mode of travel, with resulting reduction in energy use and exposure for long trips that are more likely to switch to telecommuting, although appreciate that at present this is more relevant for high income nations at the moment (although that may change).

Heads-up @mpadge and @thiagoherickdesa - this is work in progress but runs super fast, is easy to generalise, has minimal data requirements and should generate plausible, testable, geographically differentiated estimates of health impacts under a range of scenarios. My aim is to generate results for Bristol, Accra and (for sanity checking) Leeds.

Robinlovelace commented 4 years ago

Update: here are the model results from a simple linear model based on a small dataset for Bristol. I envisage adding more predictors such as number of shops in the destination zone and number of buildings in the origin zones will greatly improve model fit for journeys. The mode split component can be tackled by the Dirichlet regression approach outlined in the scenarios section of the adaptation manual.

summary(m9)

Call:
lm(formula = all ~ log(distance) + log(distance_to_centre) + 
    distance * distance_to_centre + log(distance) * log(distance_to_centre) + 
    distance_to_centre * distance_to_centre_o, data = l, weights = all)

Weighted Residuals:
    Min      1Q  Median      3Q     Max 
-7203.0  -604.8  -316.0    45.1 18924.4 

Coefficients:
                                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)                              4.176e+03  6.430e+01  64.943  < 2e-16 ***
log(distance)                           -1.559e+03  4.165e+01 -37.422  < 2e-16 ***
log(distance_to_centre)                 -5.363e+02  9.212e+00 -58.213  < 2e-16 ***
distance                                 7.034e+00  2.689e+00   2.616  0.00895 ** 
distance_to_centre                       8.716e-02  2.440e-03  35.725  < 2e-16 ***
distance_to_centre_o                     2.518e-02  1.453e-03  17.324  < 2e-16 ***
distance:distance_to_centre             -4.222e-03  2.213e-04 -19.076  < 2e-16 ***
log(distance):log(distance_to_centre)    1.826e+02  5.067e+00  36.029  < 2e-16 ***
distance_to_centre:distance_to_centre_o -1.959e-06  9.171e-08 -21.365  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1308 on 2799 degrees of freedom
Multiple R-squared:  0.6797,    Adjusted R-squared:  0.6788 
F-statistic: 742.6 on 8 and 2799 DF,  p-value: < 2.2e-16
Robinlovelace commented 4 years ago

Zoning system that could be used alongside this:

image

Robinlovelace commented 4 years ago

Latest on this approach for Accra:


# packages ----------------------------------------------------------------

remotes::install_github("zonebuilders/zonebuilder")
#> Skipping install of 'zonebuilder' from a github remote, the SHA1 (239cede3) has not changed since last install.
#>   Use `force = TRUE` to force installation
remotes::install_github("ITSLeeds/od")
#> Skipping install of 'od' from a github remote, the SHA1 (6cb500ea) has not changed since last install.
#>   Use `force = TRUE` to force installation

library(zonebuilder)
library(osmdata)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright

# basic parameters --------------------------------------------------------

# idea: fundamental attributes per city

city_name = "accra"
has_od_data = FALSE
has_zone_data = FALSE

# get city data -----------------------------------------------------------

ne_cities = rnaturalearth::ne_download(type = "populated_places", returnclass = "sf", scale = 50)
#> OGR data source with driver: ESRI Shapefile 
#> Source: "/tmp/RtmpMBkTHs", layer: "ne_50m_populated_places"
#> with 1249 features
#> It has 119 fields
#> Integer64 fields read as strings:  wof_id ne_id
nrow(ne_cities)
#> [1] 1249
sum(ne_cities$POP_MAX) / 7e9 # around 20% of world's population
#> [1] 0.2108177

# for smaller cities
# ne_cities_10 = rnaturalearth::ne_download(type = "populated_places", returnclass = "sf", scale = 10)
# nrow(ne_cities_10)
# sum(ne_cities_10$POP_MAX, na.rm = TRUE) / 7e9 # around 33% of world's population

cities_match = grepl(pattern = city_name, x = ne_cities$NAME, ignore.case = TRUE)
n_city_name_in_city_data = sum(cities_match)

if(n_city_name_in_city_data == 1) {
  message("The city is in the city database" )
  city = ne_cities[cities_match, ]
}
#> The city is in the city database

region = osmdata::getbb(place_name = city_name, format_out = "sf_polygon")
#> No polygonal boundary for accra
if(is.null(region)) {
  message("Could not find boundary data from OSM. Getting data from another source.")
  if(city_name == "accra") {
    region = sf::read_sf("~/atfutures/who-data/accra/AccraMetroAdmin/AccraMetroAdmin2.shp")
  }
  # mapview::mapview(region)
}
#> Could not find boundary data from OSM. Getting data from another source.

city_centre = tmaptools::geocode_OSM(q = city_name, as.sf = TRUE)
mapview::mapview(region) + mapview::mapview(city_centre)

if(!has_zone_data) {
  city_zones = zonebuilder::zb_zone(x = city_centre, area = region)["label"]
  city_zone_centroids = sf::st_centroid(city_zones)
  zonebuilder::zb_plot(city_zones)
  plot(city_zone_centroids, add = TRUE, col = "grey")
}
#> although coordinates are longitude/latitude, st_contains assumes that they are planar
#> Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
#> although coordinates are longitude/latitude, st_within assumes that they are planar
#> Warning in st_centroid.sf(city_zones): st_centroid assumes attributes are
#> constant over geometries of x
#> Warning in st_centroid.sfc(st_geometry(x), of_largest_polygon =
#> of_largest_polygon): st_centroid does not give correct centroids for longitude/
#> latitude data
#> Error in `[[<-.data.frame`(`*tmp*`, i, value = character(0)): replacement has 0 rows, data has 40

# generate od data --------------------------------------------------------

od_region = od::points_to_od(city_zone_centroids, interzone_only = TRUE)
nrow(od_region) # 1560
#> [1] 1560
od_region = od::od_oneway(od_region)
nrow(od_region) # 780
#> [1] 780
head(od_region)
#>     o   d
#> 1   A B01
#> 2   A B02
#> 3 B01 B02
#> 4   A B03
#> 5 B01 B03
#> 6 B02 B03
od_sf = od::od_to_sf(x = od_region, z = city_zone_centroids)
plot(od_sf)

Created on 2020-03-09 by the reprex package (v0.3.0)

Robinlovelace commented 4 years ago

Zones used for the OD matrices above:

image

Robinlovelace commented 4 years ago

Distances from city centre:

image

Robinlovelace commented 4 years ago

Resulting OD data

image

mpadge commented 4 years ago

That's great stuff Robin - I'll go through your code this week, and also plan on getting back to inputs from my side. Hopefully we'll now be able to pull this together fairly quickly.

Robinlovelace commented 4 years ago

Result from applying the simple model in sim-test.R to this data, using only distances to estimate flow (this will be much better when explanatory variables such as number of shops/buildings from OSM are added):

image

Robinlovelace commented 4 years ago

Keeping only flows with more than 100 movements estimated:

image

Robinlovelace commented 4 years ago

This shows the importance of sampling on the road network for each OD pair:

image

Robinlovelace commented 4 years ago

Acrra network:

image

Robinlovelace commented 4 years ago

Interesting issue around selecting the centroid. Realising that a single centroid per zone makes little sense with routing and that you can sample from the network.

Next step is to sample a different point in each origin and destination zone for each OD pair and re-calculate the routes, then we can use the PCT's Go Dutch scenario from the pct R package and get close to health estimates of mode shift, at least for 1 mode and 1 spatially-disaggregated scenario of change.

The things that need doing for this to work well, with approximate timings are:

Recalibrate spatial interaction model using variables derived from OSM data such as number of shops, schools, buildings per zone, for more accurate measure of flow per zone (1 day)

Calculate OD pairs going not to zone centroids but points in each zone sampled from the network (0.5 day)

Route on the network for each OD pair for different modes of transport, currently using ORS but Google alway works fine in Accra (1 day)
Robinlovelace commented 4 years ago

Heads-up @mpadge this is the route network that results from running the approach on only the top 35 OD pairs using the approach outlined above, under the Go Dutch scenario parameterised based on Dutch data in the PCT:

image

Although the network is bitty, I'm confident that if we increase the number of OD pairs and/or tweak the zoning system, this approach will yield the spatially disaggregated health impact results, under different scenarios of change, that we need for this project. Thoughts?

To reproduce, try running the code in the pct-approach.R file in upthat, will push the code now.

mpadge commented 4 years ago

That looks fabulous @Robinlovelace, thanks loads for the great input. I'll re-run it, and work out how easy it is to disaggregate back to the segment level. I expect that should be relatively straightforward. Definitely expect to deliver this coming week