r-transit / tidytransit

R package for working with GTFS data
https://r-transit.github.io/tidytransit/

Travel time with GTFS #187

Closed orlando-sabogal closed 2 years ago

orlando-sabogal commented 2 years ago

I have a GTFS file for Mexico City downloaded from transitland.

I want to use the GTFS to calculate travel times. However, the feed does not seem to work with either of the R-based libraries I tried (gtfsrouter and tidytransit). I have not been able to get travel times between two locations, between two stations, or from one transit station to any other station.

I originally asked the question on Stack Overflow, where you can see the issue.

polettif commented 2 years ago

Thanks for bringing this up. I think the problem is that your feed doesn't contain a transfers table. While a feed without transfers is unlikely to provide good routing results, travel_times should still work. So this is an issue we should fix.

In the meantime you can use this fix:

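library(tidytransit)

gtfs <- read_gtfs("MexicoCity_TransitLand.zip")

# Workaround: the feed has no transfers table, so add an empty one
gtfs$transfers <- data.frame()

# Keep only the stop times that are valid on the chosen date
stop_times <- filter_stop_times(gtfs, "2021-05-01")

# stop_dist_check = FALSE skips the check on distances between stops sharing a name
travel_times(stop_times, "Las Torres", return_coords = TRUE, stop_dist_check = FALSE)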
#> # A tibble: 97 × 12
#>    from_stop_name to_stop_name     travel_time journey_departu… journey_arrival…
#>    <chr>          <chr>                  <dbl> <time>           <time>          
#>  1 Las Torres     Las Torres                 0 00'00"           00'00"          
#>  2 Las Torres     Ciudad Jardín             70 01'22"           02'32"          
#>  3 Las Torres     Tasqueña                  82 23'35"           24'57"          
#>  4 Las Torres     Plaza Ermita              93 05'25"           06'58"          
#>  5 Las Torres     La Virgen                125 01'22"           03'27"          
#>  6 Las Torres     San Jerónimo - …         133 47'42"           49'55"          
#>  7 Las Torres     Penitenciaría            177 18'30"           21'27"          
#>  8 Las Torres     Xotepingo                183 01'22"           04'25"          
#>  9 Las Torres     Las Minas                200 05'25"           08'45"          
#> 10 Las Torres     Nezahualpilli            234 01'22"           05'16"          
#> # … with 87 more rows, and 7 more variables: transfers <dbl>,
#> #   from_stop_id <chr>, to_stop_id <chr>, from_stop_lon <dbl>,
#> #   from_stop_lat <dbl>, to_stop_lon <dbl>, to_stop_lat <dbl>
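The workaround above overwrites gtfs$transfers unconditionally. If the same script is reused on feeds that do ship a transfers table, a guard may be safer. This is only a sketch: ensure_transfers is a hypothetical helper, not part of tidytransit, and the plain list below stands in for an object returned by read_gtfs.

```r
# Hypothetical helper (not part of tidytransit): add an empty transfers
# table only when the feed has none, and leave an existing one untouched.
ensure_transfers <- function(feed) {
  # [[ ]] matches the name exactly, whereas $ on a list allows partial matches
  if (is.null(feed[["transfers"]])) {
    feed[["transfers"]] <- data.frame()
  }
  feed
}

# Stand-in for a feed without transfers: a plain list of tables
feed_without <- list(
  stops = data.frame(stop_id = "s1", stop_name = "Las Torres")
)
feed_fixed <- ensure_transfers(feed_without)

# Stand-in for a feed that already has a transfers table
feed_with <- list(
  stops = data.frame(stop_id = "s1", stop_name = "Las Torres"),
  transfers = data.frame(from_stop_id = "s1", to_stop_id = "s2")
)
feed_kept <- ensure_transfers(feed_with)
```

With the real feed this would presumably be used as gtfs <- ensure_transfers(gtfs) right after read_gtfs, in place of the unconditional assignment.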

Another note: cluster_stops does not actually fix anything in a feed; it's just a tool to find clusters. That's why stop_dist_check is set to FALSE in the fix above. I haven't looked at the feed further, so I don't know whether the missing transfers are an issue.

polettif commented 2 years ago

Small correction: It is indeed possible to use cluster_stops to fix stop names if you use cluster_colname = "stop_name" (thus overwriting stop_name), as you did in your example. This is better than turning stop_dist_check off, though I'd still recommend checking whether the clusters make sense before doing further analysis.
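For reference, the approach described in this correction could look roughly like the sketch below. recluster_stop_names is a hypothetical wrapper, not a tidytransit function, and max_dist = 300 (meters) is an assumed threshold that should be tuned for the feed. The usage lines are commented out because they need the feed file from this thread.

```r
# Hypothetical wrapper (not part of tidytransit): overwrite stop_name with
# the cluster names produced by cluster_stops, so that stops sharing a name
# but lying far apart end up with distinct names.
recluster_stop_names <- function(gtfs, max_dist = 300) {
  gtfs$stops <- tidytransit::cluster_stops(
    gtfs$stops,
    max_dist = max_dist,            # assumed threshold in meters
    group_col = "stop_name",        # group stops by their current name
    cluster_colname = "stop_name"   # write cluster names back into stop_name
  )
  gtfs
}

# Usage, assuming MexicoCity_TransitLand.zip is in the working directory:
# gtfs <- tidytransit::read_gtfs("MexicoCity_TransitLand.zip")
# tidytransit::stop_group_distances(gtfs$stops, by = "stop_name")  # before
# gtfs <- recluster_stop_names(gtfs)
# tidytransit::stop_group_distances(gtfs$stops, by = "stop_name")  # after
```

Comparing the stop_group_distances output before and after is one way to check whether the clusters make sense. The empty transfers table from the earlier workaround is still needed before calling travel_times.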