Closed mpadge closed 3 years ago
@AlexandraKapp The above commit changes isochrone calculation so that the value at each end point is calculated relative to the start time of the actual journey that led to that end point, rather than previous version in which all times were relative to a common time defined as first trip leaving from
after specified time.
The auto-zoom means they're not quite overlaid, but you can see that the new version extends further, because the end points are only stopped once they exceed the specified isochrone value after the start of those actual journeys.
looks good!
How would you know additionally the exact duration for each stop? Would it make sense to change the column earliest_arrival
to duration
in mid_points
and end_points
? Or add another column departure_time
?
The preceding 2 commits now give this:
library (gtfsrouter)
gtfs <- extract_gtfs("vbb.zip")
#> ▶ Unzipping GTFS archive
#> ✔ Unzipped GTFS archive
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
gtfs <- gtfs_timetable (gtfs, day = "Monday")
from <- "Berlin, Amalienstr"
start_time <- 11 * 3600
end_time <- start_time + 15 * 60
r <- gtfs_isochrone (gtfs, from = from, start_time = start_time, end_time = end_time)
#> Loading required namespace: geodist
#> Loading required namespace: lwgeom
#> Registered S3 method overwritten by 'spatstat':
#> method from
#> print.boxx cli
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.2
r$end_points
#> Simple feature collection with 26 features and 6 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 13.2782 ymin: 52.40379 xmax: 13.3721 ymax: 52.45409
#> geographic CRS: WGS 84
#> First 10 features:
#> stop_name stop_id departure
#> 1 Berlin, Karwendelstr. 070101002873 11:06:00
#> 2 Berlin, Breite Str. 070101004111 11:04:00
#> 3 Berlin, Greinerstr. 070101003027 11:04:00
#> 4 Berlin, Herwarthstr. 070101002665 11:06:00
#> 5 Teltow, Ev. Diakonissenhaus 070101007462 11:06:00
#> 6 Berlin, Feldstr. 070101001321 11:06:00
#> 7 Berlin, Siemensstr./Halskestr. 070101000144 11:04:00
#> 8 Berlin, Friedrichrodaer Str. 070101000710 11:04:00
#> 9 Berlin, Unter den Eichen/Drakestr. 070101002993 11:06:00
#> 10 S Lichterfelde Süd (Berlin) [Bus Réaumurstr.] 070101000076 11:06:00
#> arrival duration transfers geometry
#> 1 11:15:00 00:09:00 2 POINT (13.30679 52.43304)
#> 2 11:18:00 00:14:00 0 POINT (13.32904 52.45409)
#> 3 11:18:00 00:14:00 1 POINT (13.3721 52.43668)
#> 4 11:20:00 00:14:00 3 POINT (13.31808 52.43209)
#> 5 11:20:00 00:14:00 0 POINT (13.2782 52.40379)
#> 6 11:20:30 00:14:30 1 POINT (13.30354 52.41569)
#> 7 11:14:00 00:10:00 1 POINT (13.3428 52.4471)
#> 8 11:18:00 00:14:00 2 POINT (13.35879 52.42044)
#> 9 11:20:00 00:14:00 1 POINT (13.29339 52.44588)
#> 10 11:19:00 00:13:00 1 POINT (13.3114 52.41099)
Created on 2020-10-26 by the reprex package (v0.3.0)
Those commits convert the whole calculation of isochrones from tracing absolute times to tracing a vector of elapsed_time
values for each station, calculated relative to the first departure from the nominated from
point(s) to that station. The first arriving trip to each station is recorded, but these are also replaced by any subsequent ones with overall lower elapsed_time
values, ensuring that latest possible start time will always be chosen for any given end time. The final absolute times in the return object correspond to the first trip with minimal elapsed time, ensuring that, for a bunch of equivalent trips the first one will always be returned.
library (gtfsrouter)
gtfs <- extract_gtfs("vbb.zip")
#> ▶ Unzipping GTFS archive
#> ✔ Unzipped GTFS archive
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
gtfs <- gtfs_timetable (gtfs, day = "Monday")
from <- "Berlin, Amalienstr"
start_time <- 11 * 3600 + 0 * 60
end_time <- start_time + 10 * 60
r <- gtfs_isochrone (gtfs, from = from, start_time = start_time, end_time = end_time)
#> Loading required namespace: lwgeom
#> Registered S3 method overwritten by 'spatstat':
#> method from
#> print.boxx cli
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.2
r$end_points
#> Simple feature collection with 9 features and 6 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 13.30214 ymin: 52.41436 xmax: 13.35007 ymax: 52.44425
#> geographic CRS: WGS 84
#> stop_name stop_id departure arrival duration
#> 1 Berlin, Karwendelstr. 070101000397 11:06:00 11:15:00 00:09:00
#> 2 Berlin, Ostpreußendamm Süd 070101001322 11:06:00 11:15:00 00:09:00
#> 3 S Osdorfer Str. (Berlin) 070101002577 11:06:00 11:15:00 00:09:00
#> 4 S Osdorfer Str. (Berlin) 060064256621 11:06:00 11:15:42 00:09:42
#> 5 Berlin, Grabenstr. 070101000388 11:06:00 11:15:00 00:09:00
#> 6 Berlin, Stanzer Zeile 070101003072 11:06:00 11:15:00 00:09:00
#> 7 Berlin, Eiswaldtstr. 070101000881 11:04:00 11:13:00 00:09:00
#> 8 Berlin, Leonorenstr./Siemensstr. 070101004143 11:04:00 11:12:00 00:08:00
#> 9 Berlin, Paul-Schneider-Str. 070101000705 11:04:00 11:13:00 00:09:00
#> transfers geometry
#> 1 1 POINT (13.30679 52.43304)
#> 2 0 POINT (13.30213 52.41436)
#> 3 1 POINT (13.31386 52.41837)
#> 4 1 POINT (13.31386 52.41837)
#> 5 1 POINT (13.32649 52.42126)
#> 6 1 POINT (13.33818 52.41664)
#> 7 1 POINT (13.34704 52.42858)
#> 8 0 POINT (13.33899 52.44425)
#> 9 1 POINT (13.35007 52.43425)
start_time <- 11 * 3600 + 5 * 60
end_time <- start_time + 10 * 60
r <- gtfs_isochrone (gtfs, from = from, start_time = start_time, end_time = end_time)
r$end_points
#> Simple feature collection with 8 features and 6 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 13.30214 ymin: 52.41436 xmax: 13.36118 ymax: 52.4473
#> geographic CRS: WGS 84
#> stop_name stop_id departure arrival duration
#> 1 Berlin, Leonorenstr./Siemensstr. 070101004143 11:09:00 11:18:00 00:09:00
#> 2 S Attilastr. (Berlin) 070101003427 11:09:00 11:18:00 00:09:00
#> 3 Berlin, Karwendelstr. 070101000397 11:06:00 11:15:00 00:09:00
#> 4 Berlin, Ostpreußendamm Süd 070101001322 11:06:00 11:15:00 00:09:00
#> 5 S Osdorfer Str. (Berlin) 070101002577 11:06:00 11:15:00 00:09:00
#> 6 S Osdorfer Str. (Berlin) 060064256621 11:06:00 11:15:42 00:09:42
#> 7 Berlin, Grabenstr. 070101000388 11:06:00 11:15:00 00:09:00
#> 8 Berlin, Stanzer Zeile 070101003072 11:06:00 11:15:00 00:09:00
#> transfers geometry
#> 1 1 POINT (13.33899 52.44425)
#> 2 0 POINT (13.36118 52.4473)
#> 3 1 POINT (13.30679 52.43304)
#> 4 0 POINT (13.30213 52.41436)
#> 5 1 POINT (13.31386 52.41837)
#> 6 1 POINT (13.31386 52.41837)
#> 7 1 POINT (13.32649 52.42126)
#> 8 1 POINT (13.33818 52.41664)
Created on 2020-10-26 by the reprex package (v0.3.0)
The second of those shows that starting the isochrone 5 minutes later drops the couple that started at 11:04, and replaces with them subsequent services which start at 11:09, yet leaves all the other ones starting at 11:06 intact. The scan horizon is hard-coded here: https://github.com/ATFutures/gtfs-router/blob/7b6de065796930f3c6c36746cabf8500c9025969/src/csa-isochrone.cpp#L127-L129
The value is the actual start time from a given start (from
) station plus twice the isochrone duration. That determines when the scan stops, but as described values are only replaced if the overall elapsed_time
is shorter, otherwise the first service is recorded. That means that results are effectively independent of the precise value hard-coded in the above line. Shorter horizons can be used, and may offer some speed gains because less of the timetable has to be scanned, but at a risk of potentially missing some connections. I think the chosen value is sensible, and am happy to leave as is.
the adaption of the algorithms seems very reasonable. I was wondering if the parameters of the gtfs_isochrone function should be adapted, to better reflect the new calculation? As "end_time" does not really capture the functionality properly anymore.
#' @param start_time Desired departure time at from station
#' @param max_duration maximum travel time duration considered for isochrone
#' @param latest_start_time Latest possible departure at from station. Default: twice the isochrone duration after start_time
gtfs_isochrone <- function (gtfs, start_time, max_duration, latest_start_time = NA) {
if(is.na(latest_start_time))
latest_start_time = 2*max_duration + start_time
...
}
for backwards compatibility keeping end_time with a deprecation warning might make sense:
gtfs_isochrone <- function (gtfs, start_time, max_duration = NA, latest_start_time = NA, end_time = NA) {
if(is.na(max_duration)) {
if(!is.na(end_time)) {
message("end_time will be deprecated. In future use max_duration instead.")
max_duration <- end_time - start_time
}
}
if(is.na(latest_start_time))
latest_start_time = 2*max_duration + start_time
...
}
Yeah, good point. The documentation definitely needs to be updated, and i think these more sophisticated isochrone calculates probably mean that the whole thing should be its own separate vignette. So i'll re-open the issue as a prompt to do that, and only close when the new vignette is there.
This has all been effectively addressed in the new gtfs_traveltimes()
function and algorithms, which will also lead to deprecation of the former gtfs_isochrone()
function. New vignette issue is #55, so closing now.
That is currently not done, so isochrone values do not necessarily reflect the fastest possible trip to each end point