Open szaboildi opened 2 years ago
Hi @szaboildi, sorry i somehow completely missed this issue. I'm not sure what you mean with the bug here; could you please explain a bit more? You show journeys with transfers, and they both match. Can you please explain why this is different from what you expect? Thanks
Hi @mpadge , thanks for getting back to me! The issue is basically that this trip shows up as requiring 0 transfers (the max_transfers
parameter is set to 0 in the gtfs_route()
function at the top and thentransfers
column in ttimes
at the bottom is also 0) even though when details of the trip are accessed, it seems like it involves 3 routes (292, 23, 191-- in the route_name column) -- which actually matches the fastest route on google maps.
(The output of the gtfs_traveltimes()
function in this case is likely the same trip that the gtfs_route()
function finds, because the starting time and duration matches.)
But the documentation for the gtfs_route
function explains the max_transfers
parameter like this:
If not NA, specify a desired maximum number of transfers for the route (including but not exceeding this number). This parameter may be used to generate alternative routes with fewer transfers, although actual numbers of transfers may still exceed this number if a value is specified which exceeds the minimal feasible number of transfers.
Setting max_transfers = N
only has any effect if it is possible to travel along a route with that number of transfers, in which case it will ensure that trips with any greater number of transfers are not returned, even where these may be faster. In your case, 0 transfers are simply impossible, and you get by default the minimal number which is 2.
I probably should update that documentation to change
if a value is specified which exceeds the minimal feasible number of transfers.
to the correct version of
if the value specified is less than the minimal feasible number of transfers.
Thank you for the explanation, that explains the gtfs_route()
part of the problem perfectly! Why does the gtfs_traveltimes()
function return this trip with 0 transfers though? Shouldn't the query in the last lineline (below) result in an empty df?
ttimes <- gtfs_traveltimes(ham_gtfs, from_trip, start_time_limits = c(8 * 60 * 60, 9 * 60 * 60), day = day, max_traveltime = 60 * 60, max) ttimes[ttimes$stop_name == "Langobardenweg" & ttimes$ntransfers == 0,]
Yes, that gtfs_travetimes
result does possibly reflect a bug. I'll keep this issue open and investigate that further soon. Thanks
@szaboildi Sorry that it's taken so long to get back to this. The above commit fixes the bug you uncovered, and you should now see this:
library (gtfsrouter)
packageVersion ("gtfsrouter")
#> [1] '0.0.5.91'
url <- "http://daten.transparenz.hamburg.de/Dataport.HmbTG.ZS.Webservice.GetRessource100/GetRessource100.svc/90a29c65-d9c5-4281-a45a-e12bf19a18fb/Upload__HVV_Rohdaten_GTFS_Fpl_20210903.zip"
file <- "./feeds/HVV.zip"
if (!file.exists (file)) {
download.file(url, file, quiet = FALSE)
|
from_trip <- "S Hamburg Airport"
to_trip <- "Langobardenweg"
start_time_trip <- 8 * 3600
day_trip <- 3 # Tuesday
ham_gtfs <- extract_gtfs(file)
#> ▶ Unzipping GTFS archive✔ Unzipped GTFS archive
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
ham_gtfs <- gtfs_transfer_table (ham_gtfs)
route <- gtfs_route(ham_gtfs, from_trip, to_trip, start_time_trip, day_trip, max_transfers = 0)
# same as above: lists 2 transfers, and takes 39 minutes
ttimes <- gtfs_traveltimes(ham_gtfs, from_trip,
start_time_limits = c(8 * 60 * 60, 9 * 60 * 60),
day = day_trip, max_traveltime = 60 * 60)
ttimes[ttimes$stop_name == to_trip, ]
#> start_time duration ntransfers stop_id stop_name
#> 1163 08:03:00 00:39:00 2 de:02000:91028::910093 Langobardenweg
#> 1168 08:03:00 00:36:00 4 de:02000:91028::910098 Langobardenweg
#> stop_lon stop_lat
#> 1163 9.966597 53.63475
#> 1168 9.966927 53.63527
Created on 2022-08-16 by the reprex package (v2.0.1)
That now lists the correct number of transfers, along with an additional possibility to get there 3 minutes quicker with 4 transfers.
The traveltimes algorithm was not counting "implicit" transfers to different services (that is, "trip_id" values) from the same platform ("stop_id" values). The above commit identifies these changes of service, and appropriately increments the aggregated number of transfers.
I now have to re-open this issue, because the gtfs_route()
function then reveals another bug here. Setting max_transfers >= 4
should give the 4-transfer route, but instead returns this incorrect result:
library (gtfsrouter)
url <- "http://daten.transparenz.hamburg.de/Dataport.HmbTG.ZS.Webservice.GetRessource100/GetRessource100.svc/90a29c65-d9c5-4281-a45a-e12bf19a18fb/Upload__HVV_Rohdaten_GTFS_Fpl_20210903.zip"
file <- "./feeds/HVV.zip"
if (!file.exists (f)) {
download.file(url, file, quiet = FALSE)
}
from_trip <- "S Hamburg Airport"
to_trip <- "Langobardenweg"
start_time_trip <- 8 * 3600
day_trip <- 3 # Tuesday
ham_gtfs <- extract_gtfs(file)
#> ▶ Unzipping GTFS archive✔ Unzipped GTFS archive
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
ham_gtfs <- gtfs_transfer_table (ham_gtfs)
gtfs_route(ham_gtfs, from_trip, to_trip, start_time_trip, day_trip, max_transfers = 4)
#> route_name trip_name stop_name arrival_time
#> 1 S1 Wedel Hamburg Airport (Flughafen) 08:03:00
#> 2 S1 Wedel Ohlsdorf 08:06:00
#> 3 U1 Norderstedt Mitte Ohlsdorf 08:17:00
#> 4 U1 Norderstedt Mitte Klein Borstel 08:18:00
#> 5 U1 Norderstedt Mitte Fuhlsbüttel 08:19:00
#> 6 U1 Norderstedt Mitte Fuhlsbüttel Nord 08:21:00
#> 7 U1 Norderstedt Mitte Langenhorn Markt 08:23:00
#> 8 193 U Garstedt U Langenhorn Markt (Krohnstieg) 08:26:00
#> 9 193 U Garstedt Rittmerskamp 08:27:00
#> 10 193 U Garstedt Samlandweg 08:28:00
#> 11 193 U Garstedt Wrangelkoppel 08:29:00
#> 12 193 U Garstedt Ermlandweg 08:30:00
#> 13 191 Grothwisch Krohnstiegtunnel 08:34:00
#> 14 191 Grothwisch Sperlingsweg 08:36:00
#> 15 191 Grothwisch Moorrand 08:37:00
#> 16 191 Grothwisch Grotkoppelweg 08:38:00
#> 17 191 Grothwisch Langobardenweg 08:39:00
#> 18 <NA> (transfer) S Hamburg Airport 00:44:00
#> departure_time
#> 1 08:03:00
#> 2 08:09:00
#> 3 08:17:00
#> 4 08:18:00
#> 5 08:19:00
#> 6 08:21:00
#> 7 08:23:00
#> 8 08:26:00
#> 9 08:27:00
#> 10 08:28:00
#> 11 08:29:00
#> 12 08:30:00
#> 13 08:34:00
#> 14 08:36:00
#> 15 08:37:00
#> 16 08:38:00
#> 17 08:39:00
#> 18 <NA>
That seems to be just a final error with the returned order of the stops along the route.
That last commit removes any transfers from being traced from the set of start stations, and so fixes the bug here. Running gtfs_traveltimes()
still gives the same answer as above, with 2 transfers for 39 minutes, or 4 for 36, but the latter is no longer the incorrectly-timed version that started with a transfer. The gtfs_route()
function still gives that, but that will be updated via #71, to allow full tracing of each connection listed in gtfs_traveltimes()
.
Works as described, thank you!
@mpadge As I mentioned here, I think I've found a wrong behavior in the gtfs_traveltimes
function and it looks like it's part of this bug here. It's easily demonstrated with your example you posted before:
@szaboildi Sorry that it's taken so long to get back to this. The above commit fixes the bug you uncovered, and you should now see this:
library (gtfsrouter) packageVersion ("gtfsrouter") #> [1] '0.0.5.91' url <- "http://daten.transparenz.hamburg.de/Dataport.HmbTG.ZS.Webservice.GetRessource100/GetRessource100.svc/90a29c65-d9c5-4281-a45a-e12bf19a18fb/Upload__HVV_Rohdaten_GTFS_Fpl_20210903.zip" file <- "./feeds/HVV.zip" if (!file.exists (file)) { download.file(url, file, quiet = FALSE) | from_trip <- "S Hamburg Airport" to_trip <- "Langobardenweg" start_time_trip <- 8 * 3600 day_trip <- 3 # Tuesday ham_gtfs <- extract_gtfs(file) #> ▶ Unzipping GTFS archive✔ Unzipped GTFS archive #> ▶ Extracting GTFS feed✔ Extracted GTFS feed #> ▶ Converting stop times to seconds✔ Converted stop times to seconds #> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds ham_gtfs <- gtfs_transfer_table (ham_gtfs) route <- gtfs_route(ham_gtfs, from_trip, to_trip, start_time_trip, day_trip, max_transfers = 0) # same as above: lists 2 transfers, and takes 39 minutes ttimes <- gtfs_traveltimes(ham_gtfs, from_trip, start_time_limits = c(8 * 60 * 60, 9 * 60 * 60), day = day_trip, max_traveltime = 60 * 60) ttimes[ttimes$stop_name == to_trip, ] #> start_time duration ntransfers stop_id stop_name #> 1163 08:03:00 00:39:00 2 de:02000:91028::910093 Langobardenweg #> 1168 08:03:00 00:36:00 4 de:02000:91028::910098 Langobardenweg #> stop_lon stop_lat #> 1163 9.966597 53.63475 #> 1168 9.966927 53.63527
Created on 2022-08-16 by the reprex package (v2.0.1)
That now lists the correct number of transfers, along with an additional possibility to get there 3 minutes quicker with 4 transfers.
This result still looks correct when running it with the latest master
version (0.0.5.123
), however if you specify gtfs_traveltimes(..., minimise_transfers = TRUE)
, you can see the transfers are still being calculated wrong:
ttimes_mintransfer <- gtfs_traveltimes(ham_gtfs, from_trip,
start_time_limits = c(8 * 60 * 60, 9 * 60 * 60),
day = day_trip, max_traveltime = 60 * 60,
minimise_transfers = TRUE)
ttimes_mintransfer[ttimes_mintransfer$stop_name == to_trip, ]
#> start_time duration ntransfers stop_id stop_name stop_lon stop_lat
#> 1353 08:33:00 00:39:00 0 de:02000:91028::910093 Langobardenweg 9.966597 53.63475
#> 1358 08:33:00 00:45:00 1 de:02000:91028::910098 Langobardenweg 9.966927 53.63527
Let me know if you need any more info or anything. Cheers!
Shall do @jh0ker, thanks! I recently also noticed odd behaviour when calling with minimise_transfers = TRUE
.I'll re-open now and hopefully fix asap.
Hi! I found a bug with how transfers are counted. I am using the new Sept 2021 - Dec 2021 HVV GTFS found here. I am using gtfsrouter 0.0.5.042 with R version 4.0.5 When I read this file in
This issue persists even when
gtfs_route()
is run on a timetable and not the gtfs itself.This issue is there in the traveltime table as well
Is this a bug or could this be an issue with the particular GTFS somehow?