
ATFutures / who3

Third phase of WHO prototype


Scenario development #2

Closed mpadge closed 4 years ago

mpadge commented 5 years ago

GanttStart: 2019-07-29 GanttDue: 2019-09-28

Robinlovelace commented 5 years ago

Plan: create a .Rmd file (or a repo called transportScenarios) that first describes in general terms what each of the scenarios in the spec will look like, and then quantifies them in terms of crude percentages, primarily the change in % walking, cycling and potentially driving by distance band (allowing the scenarios to be agnostic to local transport patterns and therefore 'global').
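As a rough illustration of what "quantifying a scenario by distance band" could look like, here is a minimal base R sketch. The distance bands, the baseline shares and the percentage-point shifts are all hypothetical placeholders, not values from the spec, and `apply_scenario()` is an illustrative helper rather than an existing function.

```r
# Hypothetical baseline mode shares (%) by distance band.
# All numbers are placeholders for illustration only.
baseline = data.frame(
  band    = c("0-2 km", "2-5 km", "5-10 km"),
  walking = c(60, 20, 5),
  cycling = c(5, 5, 2),
  driving = c(35, 75, 93)
)

# Hypothetical scenario: percentage-point shifts from driving
# to walking and cycling in each distance band.
shift = data.frame(
  band    = baseline$band,
  walking = c(5, 2, 0),
  cycling = c(10, 10, 5)
)

# Apply the shifts, taking the gains in walking and cycling out of driving
apply_scenario = function(baseline, shift) {
  stopifnot(identical(baseline$band, shift$band))
  out = baseline
  out$walking = baseline$walking + shift$walking
  out$cycling = baseline$cycling + shift$cycling
  out$driving = baseline$driving - shift$walking - shift$cycling
  stopifnot(all(out$driving >= 0)) # shares cannot go negative
  out
}

scenario = apply_scenario(baseline, shift)
scenario
```

Because the scenario is expressed only as shifts within distance bands, the same `shift` table could in principle be applied to any city's baseline, which is the sense in which such scenarios are 'global'.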

Note: this would benefit from a working multinomial logistic regression model. Any updates on that?

mpadge commented 5 years ago

How about a sub-dir here (transportScenarios) with a simple README.(R)md to describe them?

And no, no update on multinomial logistic model, but could rush that analysis reasonably quickly via bikedata. I need to do that for moveability, to quantify dependence of distance distribution on local infrastructure density - to reflect the effect of trips becoming longer further out from urban centres and vice-versa.

mpadge commented 5 years ago

Comments from WHO side:

On the scenarios: I would separate walking and cycling in the scenarios, not only because the interventions to promote them are different but also because the situation on the ground for these two modes is different. For instance, in both Accra and Kathmandu, estimates of walking flows alone may already suffice to improve the business case, as we see high levels of walking but very low levels of walking infrastructure (in Accra, less than 5% of roads have a sidewalk…). Conversely, cycling is virtually absent, so perhaps a spatialized “propensity to cycle” scenario would make more sense. Furthermore, the beauty of the tool is precisely its spatial nature, so I would try to have, to the extent possible, scenarios with a spatial component. Having said that, I also see value in global scenarios and we may work a bit on refining those later in the process.

Robinlovelace commented 5 years ago

Got that.

mpadge commented 4 years ago

Refs for effects of vehicle speed: I got it from this blog entry, and the paper is "Is there such a thing as a ‘fair’ distribution of road space?" (Journal of Urban Design, 2019).

Robinlovelace commented 4 years ago

I can envisage a list of parameters for each city including:

% walking
% cycling (default: 1%)
% driving
% scooter
% minivan / informal shared
% public transport
% micromobility
% other
Average hilliness (% is easiest measure)
Distance of cycleways
Average speed of roads weighted by centrality of walking
Average speed of roads weighted by centrality of cycling
% of desire lines with 'good' public transport access
car parking spaces in city centre

I recall you had a place for such numbers, averaged. Of course it's imperfect because it takes no account of distance, but I think that is a realistic ballpark starting point for most cities.
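One way such a per-city parameter list could be stored is as a one-row data frame per city, with a check that the mode split sums to 100%. The sketch below covers only a subset of the parameters listed above; the constructor `new_city()`, the column names and all example values are hypothetical (only the 1% cycling default comes from the list).

```r
# Mode categories taken from the list above (names are illustrative)
modes = c("walking", "cycling", "driving", "scooter",
          "informal_shared", "public_transport", "micromobility", "other")

# Hypothetical constructor for a one-row table of city parameters
new_city = function(name, mode_split, hilliness_pct = NA_real_,
                    cycleway_km = NA_real_, parking_spaces = NA_real_) {
  stopifnot(setequal(names(mode_split), modes))       # every mode present
  stopifnot(abs(sum(mode_split) - 100) < 1e-6)        # shares sum to 100%
  cbind(
    data.frame(name = name),
    as.data.frame(as.list(mode_split[modes])),
    data.frame(hilliness_pct = hilliness_pct,
               cycleway_km = cycleway_km,
               parking_spaces = parking_spaces)
  )
}

# Placeholder values for an imaginary city, NOT real data
example_city = new_city(
  "Example City",
  c(walking = 40, cycling = 1, driving = 20, scooter = 5,
    informal_shared = 20, public_transport = 10, micromobility = 0, other = 4),
  hilliness_pct = 2
)
example_city
```

Rows built this way can be stacked with `rbind()` to give one table across cities, with `NA` marking parameters that are not yet known.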

Robinlovelace commented 4 years ago

Looking for something like this but with mode split:

Robinlovelace commented 4 years ago

And with intervals based on this, still deciding between mlogit and nnet pkgs. Thoughts @mpadge ?

library(nnet)
library(foreign)

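# stringsAsFactors = TRUE is needed in R >= 4.0 so that relevel() below receives a factor
ml = read.csv("https://github.com/rlowrance/re/raw/master/hsbdemo.csv", stringsAsFactors = TRUE)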
ml$prog2 = relevel(ml$prog, ref = "academic")
test = multinom(prog2 ~ ses + write, data = ml)
#> # weights:  15 (8 variable)
#> initial  value 219.722458 
#> iter  10 value 179.983731
#> final  value 179.981726 
#> converged

require(effects)
#> Loading required package: effects
#> Loading required package: carData
#> lattice theme set by effectsTheme()
#> See ?effectsTheme for details.

fit.eff = Effect("ses", test, given.values = c("write" = mean(ml$write)))

data.frame(fit.eff$prob, fit.eff$lower.prob, fit.eff$upper.prob)
#>   prob.academic prob.general prob.vocation L.prob.academic L.prob.general
#> 1     0.7009046    0.1784928     0.1206026       0.5576700     0.09543306
#> 2     0.4396813    0.3581915     0.2021272       0.2967261     0.23102276
#> 3     0.4777451    0.2283359     0.2939190       0.3721127     0.15192409
#>   L.prob.vocation U.prob.academic U.prob.general U.prob.vocation
#> 1      0.05495259       0.8132864      0.3091378       0.2443995
#> 2      0.10891993       0.5933969      0.5090243       0.3442789
#> 3      0.20553482       0.5854062      0.3283022       0.4011208

Created on 2019-10-25 by the reprex package (v0.3.0)
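For readers less familiar with how a multinomial logit model turns linear predictors into the probabilities shown above: the reference category has its linear predictor fixed at zero, and the probabilities are the softmax of all linear predictors. A minimal base R sketch follows; the function name `softmax_probs()` and the two input values are hypothetical and are not coefficients from the fitted model.

```r
# Convert linear predictors of the non-reference categories into
# probabilities; the reference category is fixed at 0.
softmax_probs = function(eta) {
  z = c(reference = 0, eta)
  exp(z) / sum(exp(z))
}

# Hypothetical linear predictors, for illustration only
p = softmax_probs(c(general = -1.0, vocation = -1.5))
p
sum(p) # probabilities across categories sum to 1
```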

Robinlovelace commented 4 years ago

A multinomial probit model (fitted with the mlogit package, probit = TRUE) in action:

# vignette("e4mprobit")

library("mlogit")
#> Loading required package: Formula
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
#> Loading required package: lmtest
data("Mode", package="mlogit")
head(Mode, 2)
#>   choice cost.car cost.carpool cost.bus cost.rail time.car time.carpool
#> 1    car 1.507010     2.335612 1.800512   2.35892 18.50320     26.33823
#> 2   rail 6.056998     2.896919 2.237128   1.85545 31.31111     34.25696
#>   time.bus time.rail
#> 1 20.86779  30.03347
#> 2 67.18189  60.29313
Mo = mlogit.data(Mode, choice = 'choice', shape = 'wide', varying = c(2:9))
p1 = mlogit(choice ~ cost + time, Mo, seed = 20, R = 100, probit = TRUE)
p1
#> 
#> Call:
#> mlogit(formula = choice ~ cost + time, data = Mo, probit = TRUE,     R = 100, seed = 20)
#> 
#> Coefficients:
#>     car:(intercept)  carpool:(intercept)     rail:(intercept)  
#>            1.830866            -1.281682             0.309351  
#>                cost                 time          car.carpool  
#>           -0.413440            -0.046655             0.259972  
#>            car.rail      carpool.carpool         carpool.rail  
#>            0.736487             1.307895            -0.798184  
#>           rail.rail  
#>            0.430130
m_test = Mode[1, ]
m_test$cost.car = 9
m_test$cost.carpool = 0.1

Mo1 = mlogit.data(m_test, choice = 'choice', shape = 'wide', varying = c(2:9))
predict(p1, Mo1)
#>        bus        car    carpool       rail 
#> 0.42916819 0.03140022 0.23911232 0.34734329

m_test$cost.car = 2
m_test$cost.carpool = 5

Mo1 = mlogit.data(m_test, choice = 'choice', shape = 'wide', varying = c(2:9))
predict(p1, Mo1)
#>          bus          car      carpool         rail 
#> 0.0310215956 0.9613644635 0.0007631103 0.0110329054

Created on 2019-10-25 by the reprex package (v0.3.0)

Robinlovelace commented 4 years ago

From Wikipedia:

# Aim: get mode split for cities

library(tidyverse)
d1 = htmltab::htmltab("https://en.wikipedia.org/wiki/Modal_share", which = 1)
#> Warning: Columns [Survey Area] seem to have no data and are removed. Use
#> rm_nodata_cols = F to suppress this behavior
d2 = htmltab::htmltab("https://en.wikipedia.org/wiki/Modal_share", which = 2)

names(d1)
#> [1] "City"                  "walking"               "cycling"              
#> [4] "public transport"      "private motor vehicle" "year"
names(d2)
#> [1] "City"                  "walking"               "cycling"              
#> [4] "public transport"      "private motor vehicle" "year"

d = bind_rows(d1, d2)
summary(d)
#>      City             walking            cycling         
#>  Length:123         Length:123         Length:123        
#>  Class :character   Class :character   Class :character  
#>  Mode  :character   Mode  :character   Mode  :character  
#>  public transport   private motor vehicle     year          
#>  Length:123         Length:123            Length:123        
#>  Class :character   Class :character      Class :character  
#>  Mode  :character   Mode  :character      Mode  :character
str_replace(d$walking[1], "%", "")
#> [1] "3"
dc = d %>% # clean data
  mutate_all(str_replace, "\\%", "") %>% 
  mutate_at(vars(-1), as.numeric) %>% 
  rename(car = `private motor vehicle`) %>% 
  arrange(car) %>% 
  mutate(n = 1:nrow(d))
#> Warning: NAs introduced by coercion
#> Warning: NAs introduced by coercion

#> Warning: NAs introduced by coercion

#> Warning: NAs introduced by coercion

#> Warning: NAs introduced by coercion
dt = dc %>% pivot_longer(cols = 2:(ncol(dc) - 2))

ggplot(dt) + geom_area(aes(n, value, fill = name))
#> Warning: Removed 5 rows containing missing values (position_stack).


# now: get predictors that policy makers control
# - speeds
# - car parking spaces

Created on 2019-10-26 by the reprex package (v0.3.0)

Robinlovelace commented 4 years ago

See here for the document that reports this. Getting close to MVP I think. Please take a look + comment when you get a chance @mpadge.

I was thinking of getting explanatory variables, e.g. using data from here https://data.london.gov.uk/download/global-city-data/ffcefcba-829c-4220-911f-d4bf17ef75d6/global-city-indicators.xlsx and creating a simple multinomial regression model that allows 'top down' models of change. Next steps to suggest in the discussion will be estimating rates of change based on historic data.
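A 'top down' model of this kind could be sketched with `nnet::multinom()`, which accepts a matrix response giving the weight of each category per row. The example below uses a tiny synthetic table (invented numbers, not taken from the London data store file or from Wikipedia) with a single hypothetical predictor, `mean_speed_kph`, standing in for a variable that policy makers control.

```r
library(nnet)

# Synthetic city-level data for illustration only (NOT real observations)
cities = data.frame(
  mean_speed_kph = c(30, 40, 50, 60, 35, 55),
  walking = c(40, 30, 20, 10, 35, 15),
  cycling = c(20, 15, 10, 5, 15, 5),
  car     = c(40, 55, 70, 85, 50, 80)
)

# Matrix response: one column per mode, one row per city
shares = as.matrix(cities[, c("walking", "cycling", "car")])

# Multinomial logit of mode share on the policy variable
fit = multinom(shares ~ mean_speed_kph, data = cities, trace = FALSE)

# Predicted mode split under two hypothetical speed scenarios
scenario = data.frame(mean_speed_kph = c(30, 50))
pred = predict(fit, newdata = scenario, type = "probs")
pred
```

With real data the predictors would come from the city indicators, and a sample of six cities is far too small for inference; the point is only the shape of the workflow: fit on observed mode splits, then predict the split under changed policy variables.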

Robinlovelace commented 4 years ago

Here's where I'm documenting this: https://github.com/ATFutures/who3/tree/master/scenarios

mpadge commented 4 years ago

That is all absolute gold - thanks so much for digging into all this. Hopefully we can pull things together fairly quickly from here.