Kalman Filter for Dwell Times

This framework describes the steps to improve dwell time prediction accuracy in TheTransitClock. The assumptions and trade-offs built into the Kalman Filter for Dwell Times introduced in Farhan and Shalaby (2004) are discussed. Relevant studies regarding the impact of these decisions on prediction accuracy are cited.

Introduction

Predicting the arrival time of transit vehicles requires estimating the successive travel and dwell times that make up the trip. In the latest version of TheTransitClock, travel times are estimated with a Kalman Filter, which calibrates historical data from the last week with immediate data from the last vehicle. Dwell times, by contrast, are currently predicted with the Historical Average method, which does not consider the headway. However, on high-frequency routes, boarding times are known to increase with longer headways because more passengers have time to arrive at stops (Levinson, 1983). Farhan and Shalaby (2004) introduced a Kalman Filter for travel and dwell times, which calculates passenger arrival rates in real time based on historical and immediate data.

The flowchart below describes the dwell time component of their method. Using Automated Vehicle Location (AVL) and Automated Passenger Count (APC) data, historical and real-time passenger arrival rates are estimated. The Kalman Filter weights the two inputs based on the live error to produce a calibrated passenger arrival rate. This rate is multiplied by the vehicle’s headway to obtain the expected number of boardings, which is converted into a predicted dwell time using the boarding time per passenger. Finally, the dwell and travel time estimates are added to predict the arrival time. The assumptions built into each step, from the prediction back to the data, are described below.
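The chain from arrival rate to predicted arrival time can be sketched as follows. This is a minimal illustration, not TheTransitClock's code: the function and parameter names are hypothetical, and the only value taken from the literature is the 2.5 seconds per boarding assumed in Farhan and Shalaby (2004).

```python
# Assumed boarding time per passenger, from Farhan and Shalaby (2004).
BOARDING_SECONDS_PER_PASSENGER = 2.5


def predict_dwell_seconds(arrival_rate_per_second, headway_seconds,
                          boarding_seconds=BOARDING_SECONDS_PER_PASSENGER):
    """Headway-based dwell time: rate x headway gives expected boardings,
    which are converted to seconds with a per-passenger boarding time."""
    if arrival_rate_per_second < 0 or headway_seconds < 0:
        raise ValueError("rate and headway must be non-negative")
    expected_boardings = arrival_rate_per_second * headway_seconds
    return expected_boardings * boarding_seconds


def predict_arrival_seconds(now_seconds, travel_seconds, dwell_seconds):
    """Arrival time is the current time plus the successive travel and
    dwell time estimates (hypothetical helper, lists of seconds)."""
    return now_seconds + sum(travel_seconds) + sum(dwell_seconds)


# Illustrative numbers only: one passenger per minute, 10-minute headway.
dwell = predict_dwell_seconds(1 / 60, 600)
eta = predict_arrival_seconds(0.0, [120.0, 90.0], [dwell])
```

With these illustrative inputs, about ten passengers are expected to board, giving a dwell of roughly 25 seconds.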

Prediction

The capacity to improve prediction accuracy with a more sophisticated dwell time model hinges on the modeling assumptions highlighted above. While some studies, such as Hans et al. (2015), have found a significant improvement in prediction accuracy when incorporating a dwell time model, others have not. For example, Cats and Loutos (2016) compared a Historical Average prediction method with the Headway-Based dwell time model. The authors found no improvement in prediction accuracy, possibly due to the assumptions that passenger arrivals at stops were Poisson distributed and that no other factors affected dwell times.

Recommendation - Quantify the proportion of prediction error that is due to dwell times. This new benchmark will then be used to debug and test different model configurations in playback mode.
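One possible way to compute such a benchmark is sketched below. It assumes that playback mode can produce, for each stop-to-stop segment, the predicted and observed travel and dwell times; the tuple layout and the function name are hypothetical. Absolute errors are used so that travel and dwell errors of opposite sign do not cancel, which is a design choice rather than an established definition.

```python
def dwell_error_share(segments):
    """Share of total absolute prediction error attributable to dwell times.

    segments: iterable of tuples
        (travel_predicted, travel_actual, dwell_predicted, dwell_actual),
    all in seconds. Returns a value between 0.0 and 1.0.
    """
    travel_error = 0.0
    dwell_error = 0.0
    for travel_pred, travel_actual, dwell_pred, dwell_actual in segments:
        travel_error += abs(travel_pred - travel_actual)
        dwell_error += abs(dwell_pred - dwell_actual)
    total_error = travel_error + dwell_error
    if total_error == 0:
        return 0.0  # perfect predictions: nothing to attribute
    return dwell_error / total_error


# Illustrative data: 20 s of travel error and 10 s of dwell error.
share = dwell_error_share([(100, 110, 20, 30), (200, 190, 15, 15)])
```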

Run Time Model

The Run Time Model is composed of a travel time and a dwell time model. Dwell time can be split into several components:

Boarding time - assumed to be 2.5 seconds per passenger in Farhan and Shalaby (2004). Sun et al. (2014) find that the time per boarding is influenced by the number of boarding passengers, the vehicle occupancy, and the vehicle type.
Alighting time - can be modeled with a binomial distribution parameterized by the passenger load and the probability of alighting at each stop (Hans et al., 2015).
Adherence dwell - depends on on-time performance and whether the stop is a time point.
Stop-service time - fixed dwell time for passenger activity, regardless of the number of boarding and alighting passengers. Note that there is no stop-service time if no passengers board or alight. Therefore, modeling the probability that at least one passenger boards or alights may help prediction accuracy on low-frequency routes.
Stop zone time - obligatory dwell time, which takes place even if there are no boardings or alightings.
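The components listed above could be combined as in the sketch below. Everything here is an assumption made for illustration: the class and field names are hypothetical, boarding and alighting times are simply added (as if both used the same door), and the parameter values in the usage lines are placeholders rather than calibrated figures.

```python
from dataclasses import dataclass


@dataclass
class DwellParameters:
    """Hypothetical per-stop parameters, all in seconds."""
    boarding_s: float      # time per boarding passenger
    alighting_s: float     # time per alighting passenger
    stop_service_s: float  # fixed time incurred only if passengers are served
    stop_zone_s: float     # obligatory time, incurred at every stop


def dwell_seconds(params, boardings, alightings, adherence_hold_s=0.0):
    """Sum the dwell components; the stop-service and per-passenger times
    apply only when at least one passenger boards or alights."""
    passenger_s = 0.0
    if boardings > 0 or alightings > 0:
        passenger_s = (params.stop_service_s
                       + boardings * params.boarding_s
                       + alightings * params.alighting_s)
    return params.stop_zone_s + passenger_s + adherence_hold_s


# Placeholder values for illustration only.
params = DwellParameters(boarding_s=2.5, alighting_s=1.0,
                         stop_service_s=4.0, stop_zone_s=3.0)
busy_stop = dwell_seconds(params, boardings=4, alightings=2)
empty_stop = dwell_seconds(params, boardings=0, alightings=0)
```

The expected number of alightings could be supplied as the passenger load multiplied by the alighting probability, which is the mean of the binomial model mentioned above.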

Recommendation - Ensure that the different components of dwell times are accounted for in the prediction algorithm.

Kalman Filter

The Kalman Filter algorithm for travel time trades off the stability of historical data from the last week against the immediacy of real-time data from the last vehicle. When a disruption happens, data from the immediate past can better represent current operating conditions, even if based on a sample size of one.
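The trade-off can be illustrated with a textbook scalar Kalman measurement update, shown below. This is a generic sketch and not necessarily the exact formulation used in Farhan and Shalaby (2004) or in TheTransitClock; the function and argument names are hypothetical.

```python
def kalman_blend(historical_mean, historical_variance,
                 last_observation, observation_variance):
    """Blend a historical estimate with the latest observation.

    The gain is the share of total uncertainty carried by the historical
    estimate: the noisier the history, the more weight the last vehicle gets.
    Returns (estimate, variance_of_estimate, gain).
    """
    if historical_variance < 0 or observation_variance < 0:
        raise ValueError("variances must be non-negative")
    total_variance = historical_variance + observation_variance
    if total_variance == 0:
        raise ValueError("at least one variance must be positive")
    gain = historical_variance / total_variance
    estimate = historical_mean + gain * (last_observation - historical_mean)
    variance = (1 - gain) * historical_variance
    return estimate, variance, gain


# Equal uncertainty on both inputs: the estimate lands halfway between them.
estimate, variance, gain = kalman_blend(100.0, 25.0, 120.0, 25.0)
# estimate == 110.0, variance == 12.5, gain == 0.5
```

A gain close to 1 means the filter follows the last vehicle almost entirely, which is the situation in which a single unusual observation can dominate the prediction.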

In the Kalman Filter for dwell times, an added step consists of dividing the passenger boardings by the headway of the last vehicle at the current stop. If that headway is short, a few passenger boardings can have an outsized effect on the estimated passenger arrival rate, because of both the inherent variability and the potential biases of the Poisson model. Under the assumption of Poisson arrivals with rate λ, the number of boardings over a headway h has a variance equal to its mean, λh, so the rate estimate (boardings divided by h) has a variance of λ/h, which grows as the headway shrinks. Furthermore, if the Poisson assumption is violated, the immediate data may produce a biased estimate. Consider, for example, passengers boarding the second of two bunched vehicles to avoid crowding. The Kalman Filter, which gives more weight to the immediate data when it differs from historical data, could then pass these biases on to the calibration and prediction.

Recommendation - Instead of estimating the passenger arrival rates based on a single headway, the total passenger boardings could be divided by a longer horizon encompassing several successive headways. Analysis of autocorrelation in passenger arrival rates would be required to find the horizon that makes the best trade-off. A horizon that is too short may be unstable, while a horizon that is too long may no longer represent current operating conditions.
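A minimal sketch of the pooled estimate is shown below. The data layout, a list of (boardings, headway) pairs ordered from oldest to newest, and the function name are assumptions made for illustration; the horizon is expressed here as a number of vehicles, although a time window would work equally well.

```python
def pooled_arrival_rate(observations, horizon):
    """Arrival rate (passengers per second) pooled over the last
    `horizon` vehicles at a stop.

    observations: list of (boardings, headway_seconds), oldest first.
    """
    if horizon < 1:
        raise ValueError("horizon must cover at least one vehicle")
    recent = observations[-horizon:]
    total_headway = sum(headway for _, headway in recent)
    if total_headway <= 0:
        raise ValueError("total headway must be positive")
    total_boardings = sum(boardings for boardings, _ in recent)
    return total_boardings / total_headway


# Illustrative data: the last vehicle is bunched 60 s behind its leader.
history = [(6, 600), (5, 540), (3, 60)]
single = pooled_arrival_rate(history, horizon=1)  # 3 / 60 = 0.05
pooled = pooled_arrival_rate(history, horizon=3)  # 14 / 1200, about 0.0117
```

In this made-up case, the single-headway estimate is more than four times the pooled one, even though only three passengers boarded the bunched vehicle.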

Arrival Rate

In Farhan and Shalaby (2004), passenger arrival rates at a particular stop are calculated as the number of passenger boardings divided by the headway. The method assumes that passengers arrive at stops independently of the schedule, according to a Poisson process. Fan and Machemehl (2009) found this assumption to hold when headways are less than 12 minutes. Farhan and Shalaby (2004) tested their method during the peak hour on a route with 12-minute headways. However, on low-frequency routes and during off-peak hours, when passengers tend to coordinate their arrivals with the schedule, the method may not be appropriate. Furthermore, with the availability of real-time information, passengers are even less likely to arrive randomly at stops (Watkins et al., 2011). Empirical studies have shown that on routes with longer headways, passenger arrivals tend to follow a beta distribution (Ingvardson et al., 2018).

Recommendation - Test different models of passenger arrival rates based on headways and times-of-day.
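As a first screening step before comparing arrival models, the Poisson assumption can be checked with the index of dispersion (the variance-to-mean ratio), which is close to 1 for Poisson counts. The sketch below assumes boarding counts collected over intervals of equal length for one stop and time-of-day; the function name is hypothetical, and this is only a rough diagnostic, not a formal test.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean of passenger counts.

    Values near 1 are consistent with Poisson arrivals; values well above 1
    suggest clustered arrivals, and values well below 1 suggest arrivals
    that are more regular than random.
    """
    if len(counts) < 2:
        raise ValueError("need at least two counts")
    average = mean(counts)
    if average == 0:
        raise ValueError("mean count is zero; index is undefined")
    return variance(counts) / average


# Illustrative counts per equal-length interval.
ratio = dispersion_index([2, 4, 6])  # variance 4, mean 4
```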

Data

Recommendation - The components described in this framework can be implemented sequentially. In the first stage, the dwell time of the previous vehicle could be used to infer the passenger arrival rate. The components and interfaces built in the first stage, including the caching system and the models for dwell time and passenger arrival rates, could be leveraged in the second stage once real-time APC data become available.
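The first-stage inference could look like the sketch below, which inverts the headway-based dwell time model. It assumes the 2.5 seconds per boarding from Farhan and Shalaby (2004) and a hypothetical fixed dwell component that is subtracted before converting seconds into passengers; the function and parameter names are illustrative.

```python
def infer_arrival_rate(previous_dwell_seconds, headway_seconds,
                       boarding_seconds=2.5, fixed_dwell_seconds=0.0):
    """Infer passengers per second from the previous vehicle's dwell time.

    The dwell time in excess of the fixed component is attributed to
    boardings, then divided by the headway over which they accumulated.
    """
    if headway_seconds <= 0 or boarding_seconds <= 0:
        raise ValueError("headway and boarding time must be positive")
    passenger_seconds = max(0.0, previous_dwell_seconds - fixed_dwell_seconds)
    implied_boardings = passenger_seconds / boarding_seconds
    return implied_boardings / headway_seconds


# Illustrative numbers: a 25 s dwell after a 10-minute headway implies
# about ten boardings, or one passenger per minute.
rate = infer_arrival_rate(25.0, 600.0)
```

Once APC data become available, the implied boardings would simply be replaced by the counted boardings, leaving the rest of the pipeline unchanged.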

References

Cats, O., & Loutos, G. (2016). Evaluating the added-value of online bus arrival prediction schemes. Transportation Research Part A: Policy and Practice, 86, 35-55.
Fan, W., & Machemehl, R. B. (2009). Do transit users just wait for buses or wait with strategies? Some numerical results that transit planners should see. Transportation Research Record, 2111(1), 169-176.
Hans, E., Chiabaut, N., Leclercq, L., & Bertini, R. L. (2015). Real-time bus route state forecasting using particle filter and mesoscopic modeling. Transportation Research Part C: Emerging Technologies, 61, 121-140.
Ingvardson, J. B., Nielsen, O. A., Raveau, S., & Nielsen, B. F. (2018). Passenger arrival and waiting time distributions dependent on train service frequency and station characteristics: A smart card data analysis. Transportation Research Part C: Emerging Technologies, 90, 292-306.
Levinson, H. S. (1983). Analyzing transit travel time performance (No. 915).
Shalaby, A., & Farhan, A. (2004). Prediction model of bus arrival and departure times using AVL and APC data. Journal of Public Transportation, 7(1), 3.
Sun, L., Tirachini, A., Axhausen, K. W., Erath, A., & Lee, D. H. (2014). Models of bus boarding and alighting dynamics. Transportation Research Part A: Policy and Practice, 69, 447-460.
Watkins, K. E., Ferris, B., Borning, A., Rutherford, G. S., & Layton, D. (2011). Where Is My Bus? Impact of mobile real-time information on the perceived and actual wait time of transit riders. Transportation Research Part A: Policy and Practice, 45(8), 839-848.

TheTransitClock / transitime

A framework to improve dwell time predictions #236