OpenDataDayZurich2016 / ideas

Collection of ideas for the ODD

Bus Bunching #14

Open DominikGroegler opened 7 years ago

DominikGroegler commented 7 years ago

Bus bunching can be a consequence of traffic congestion, non-functioning prioritisation of public transport vehicles, or construction work on lines with a small headway. Because the headway is small, vehicle delays influence each other: a delay of one vehicle is likely to delay the vehicles that follow. This mechanism leads to an ongoing accumulation of delays. Where, at what time of day, and why does bus bunching occur? Can bus bunching be predicted?

HeidiSeibold commented 7 years ago

This is an extremely complex question. There is a spatial component but also a time component.

Maybe it would be easiest to start with one bus line and one bus stop. Is there one line and one stop that is particularly interesting?

DominikGroegler commented 7 years ago

Bus bunching occurs primarily during peak hours, 07:00 to 09:00 and 16:00 to 19:00. A good place to start is these two bus lines:
Line 31: Hardplatz --> Farbhof and HB --> Hegibachplatz
Line 80: Bahnhof Altstetten --> Oerlikon Nord
In November 2016 there were a lot of occasions of bus bunching, so it would be good to start analysing in that time period.
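
To restrict an analysis to the peak windows named above, something like the following could be used. It is only a sketch: `is_peak` is a hypothetical helper, the timestamps are made up, and treating the windows as half-open (start included, end excluded) is an assumption.

```r
## Hypothetical helper: flag timestamps inside the peak windows
## 07:00-09:00 and 16:00-19:00 (local Zurich time, end excluded).
is_peak <- function(x, tz = "Europe/Zurich") {
  h <- as.numeric(format(x, "%H", tz = tz)) +
    as.numeric(format(x, "%M", tz = tz)) / 60
  (h >= 7 & h < 9) | (h >= 16 & h < 19)
}

## Made-up timestamps, for illustration only (not VBZ data)
departures <- as.POSIXct(c("2016-01-04 06:55:00", "2016-01-04 07:30:00",
                           "2016-01-04 12:00:00", "2016-01-04 18:59:00",
                           "2016-01-04 19:00:00"), tz = "Europe/Zurich")
peak <- is_peak(departures)
```

The resulting logical vector can be passed to `subset()` or used as a grouping variable when comparing peak and off-peak headways.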

HeidiSeibold commented 7 years ago

Any stop that is particularly interesting?

DominikGroegler commented 7 years ago

As a starting point, any stop on the lines mentioned is as good as any other. However, bus bunching is a phenomenon that affects the complete line and not one particular stop. Our aim (VBZ) is to find a way to control the traffic flow of a complete line, to make the service more regular. Therefore, we should not focus on one particular stop; rather, we have to look at all the stops, e.g. from HB --> Hegibachplatz.
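
One possible way to look at all stops of a line at once is to summarise the headways per stop. The sketch below uses made-up departure times and placeholder stop names; `headway_summary` is a hypothetical helper, the 60-second threshold is an assumption, and only the column names `halt_kurz_von1` and `datetime_ist_ab_von` are taken from the VBZ data.

```r
## Made-up departures of four consecutive buses at three stops (not VBZ data)
start <- as.POSIXct("2016-01-04 07:00:00", tz = "Europe/Zurich")
toy <- data.frame(
  halt_kurz_von1 = rep(c("A", "B", "C"), each = 4),
  datetime_ist_ab_von = start +
    c(0, 450, 900, 1350,      # stop A: even 7.5 min headways
      120, 640, 980, 1470,    # stop B: headways start to drift
      240, 900, 940, 1590)    # stop C: second and third bus 40 s apart
)

## Hypothetical helper: mean and sd of the headway and the share of
## headways below `threshold` seconds, one row per stop.
headway_summary <- function(d, threshold = 60) {
  per_stop <- split(d$datetime_ist_ab_von, d$halt_kurz_von1)
  res <- lapply(per_stop, function(t) {
    h <- as.numeric(diff(sort(t)), units = "secs")
    c(mean_headway = mean(h), sd_headway = sd(h),
      share_bunched = mean(h < threshold))
  })
  do.call(rbind, res)
}

res <- headway_summary(toy)
```

In this toy example the spread of the headways grows from stop to stop, which is the pattern one would expect if delays accumulate along the line. Note that the rows come out in alphabetical order of the stop code, not in route order.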

krlmlr commented 7 years ago

I remember an article stating that lines 32/61/62 are most susceptible to bus bunching.

What data are we going to use to understand why bus bunching occurs?

HeidiSeibold commented 7 years ago

https://data.stadt-zuerich.ch/dataset/vbz-fahrzeiten-ogd

3lainess commented 7 years ago

Are there data on construction projects, or traffic accidents? If we had these data, we might be able to examine possible relationships. It may be that bus bunching has other causes as well, but this could be helpful.

HeidiSeibold commented 7 years ago

https://data.stadt-zuerich.ch/dataset/tiefbaustelle maybe?

3lainess commented 7 years ago

Ok. Well that's cool. That could be a start. We could identify construction sites along the route of a particular bus, and then see if construction contributes to bunching. (Gross simplification, I know.) Use a hold-out sample to test the model. I guess I'd start with just one bus route to simplify things.
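
A rough sketch of what the hold-out idea could look like. Everything here is an assumption for illustration: the data are simulated, the `construction` indicator and the logistic regression are just one possible choice of predictor and model, and the 25% hold-out share is arbitrary.

```r
set.seed(1)
n <- 400

## Simulated data, one row per trip: an indicator for an active construction
## site on the route, and a bunching outcome that is more likely when it is 1.
trips <- data.frame(construction = rbinom(n, 1, 0.3))
trips$bunched <- rbinom(n, 1, ifelse(trips$construction == 1, 0.4, 0.1))

## Hold out 25% of the trips for testing
test_idx <- sample(n, size = n / 4)
train <- trips[-test_idx, ]
test  <- trips[test_idx, ]

## Fit on the training part only, predict bunching probability on the hold-out
fit  <- glm(bunched ~ construction, family = binomial, data = train)
pred <- predict(fit, newdata = test, type = "response")

## Brier score on the hold-out sample, next to a base-rate-only prediction
brier_model <- mean((pred - test$bunched)^2)
brier_base  <- mean((mean(train$bunched) - test$bunched)^2)
```

With real trips a random split ignores that neighbouring trips are dependent, so holding out whole days or weeks might be the more honest test.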

(I apologize for any lack of skill in githubbing - I'm just learning it). 🙄

3lainess commented 7 years ago

I downloaded the data. It covers 49 large building projects, most running over several months or years. There's a shapefile that creates a layer to superimpose on a Zurich street map.

HeidiSeibold commented 7 years ago

A starting point:

library("ggplot2")

source("https://raw.githubusercontent.com/OpenDataDayZurich2016/download_vbz_data/master/clean_delay_data.R")
source("https://raw.githubusercontent.com/OpenDataDayZurich2016/download_vbz_data/master/download_delay_data.R")

## load timeslots data.frame
timeslots <- read.csv(file = "data/delay_data/timeslots_files.csv", 
                      colClasses = c("factor", "POSIXct", "POSIXct", 
                                     "character", "character"))

## Download all available files !!! takes very long !!!
## If possible, take from USB stick
# source("download_delay_data.R")
# lapply(timeslots$from, download_delay_data, timeslots = timeslots)

## load data of given times (mid-January is CET, not CEST, so the zone is set explicitly)
mydat <- import_clean_delaydata(from = as.POSIXlt("2016-01-03 00:00:00", tz = "Europe/Zurich"),
                                to = as.POSIXlt("2016-01-15 24:00:00", tz = "Europe/Zurich"),
                                which_info = "datetime_soll_an_von",
                                timeslots = timeslots)

## bus bunching line 80 at Glaubtenstrasse
bus80glau0 <- subset(mydat, linie == 80 & halt_kurz_von1 == "GLAU" & seq_von == 24)

## sort by ist_ab_von
bus80glau <- bus80glau0[order(bus80glau0$datetime_ist_ab_von), ]
bunched <- diff(bus80glau$datetime_ist_ab_von, differences = 1) < as.difftime(60, units = "secs")
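
To see what the `diff()` comparison above produces without downloading anything, here is the same check on made-up departure times. The 60-second threshold is taken from the snippet above; the timestamps and the `bunched_pairs` table are assumptions added for illustration.

```r
## Made-up actual departure times at one stop (not VBZ data)
dep <- as.POSIXct("2016-01-04 07:00:00", tz = "Europe/Zurich") +
  c(0, 450, 480, 930, 1400, 1430)
dep <- sort(dep)

headway <- diff(dep)   # difftime; its unit is chosen automatically
bunched <- headway < as.difftime(60, units = "secs")

## diff() returns one value fewer than its input: element i describes the
## gap between departure i and departure i + 1.
bunched_pairs <- data.frame(
  leader       = dep[-length(dep)][bunched],
  follower     = dep[-1][bunched],
  headway_secs = as.numeric(headway[bunched], units = "secs")
)
```

Each row of `bunched_pairs` is one pair of buses that left the stop less than a minute apart.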

3lainess commented 7 years ago

Thanks! So, for example, you can generate a list of stops and compute the deviation from the scheduled time to see where bunching occurs.
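
A minimal sketch of that idea on made-up rows. The column `datetime_soll_ab_von` (scheduled departure) is an assumption and may be named differently in the cleaned data; `halt_kurz_von1` and `datetime_ist_ab_von` appear in the code above, and the stop names are placeholders.

```r
## Made-up scheduled and actual departures of two buses at three stops
sched <- as.POSIXct("2016-01-04 07:00:00", tz = "Europe/Zurich")
toy <- data.frame(
  halt_kurz_von1       = c("A", "A", "B", "B", "C", "C"),
  datetime_soll_ab_von = sched + c(0, 450, 120, 570, 240, 690),  # assumed column
  datetime_ist_ab_von  = sched + c(10, 455, 200, 580, 420, 700)
)

## Deviation from schedule in seconds (positive = late)
toy$delay_secs <- as.numeric(difftime(toy$datetime_ist_ab_von,
                                      toy$datetime_soll_ab_von,
                                      units = "secs"))

## Mean deviation per stop
delay_by_stop <- aggregate(delay_secs ~ halt_kurz_von1, data = toy, FUN = mean)
```

In the toy rows the first bus falls further behind at every stop while the second stays close to schedule, so the gap between them shrinks along the line.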