Closed cseveren closed 3 years ago
Here's a reproducible example:
library(gtfs2gps)
library(dplyr)
library(data.table)
# read in feed and translate
gtfs_list <- read_gtfs("./gtfs.zip")
new_stop_times <- gtfs2gps(gtfs_list, parallel = TRUE, spatial_resolution = 500)
# keep only observations with stop_id and replicate departure_time to arrival_time
new_stop_times <- subset(new_stop_times, !is.na(stop_id))
new_stop_times$arrival_time <- as.ITime(new_stop_times$departure_time)
# select columns
new_stop_times <- new_stop_times[, .(trip_id, departure_time, arrival_time, stop_id, stop_sequence)]
head(new_stop_times)
# update stop_times and drop frequencies.txt
gtfs_list$stop_times <- new_stop_times
gtfs_list$frequencies <- NULL
# selecting a particular non-missing trip_id at random
tt <- gtfs_list$stop_times %>%
  filter(trip_id == 28992)
The object tt created above has 4320 rows, corresponding to 180 instances of the trip, each 24 stops long, starting every 2:20 from 10am to about 5:35pm.
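A quick way to confirm the duplication is to count how often each (trip_id, stop_sequence) pair occurs. The sketch below uses a small made-up table (3 repetitions of a 4-stop trip) instead of the real feed, so the values are assumptions for illustration only; on the real feed the same check would be applied to gtfs_list$stop_times.

```r
library(data.table)

# Toy stand-in for gtfs_list$stop_times: one trip_id repeated 3 times over
# 4 stops (made-up values; the real feed has 180 repetitions of 24 stops).
toy_stop_times <- data.table(
  trip_id        = "28992",
  stop_sequence  = rep(1:4, times = 3),
  departure_secs = rep(c(36000L, 36140L, 36280L), each = 4) +
                   rep(c(0L, 133L, 228L, 318L), times = 3)
)

# Count occurrences of each (trip_id, stop_sequence) pair.
pair_counts <- toy_stop_times[, .N, by = .(trip_id, stop_sequence)]

# Pairs that occur more than once violate uniqueness within stop_times.
duplicated_pairs <- pair_counts[N > 1]
print(duplicated_pairs)

# Number of rows that repeat an earlier (trip_id, stop_sequence) pair.
n_dup_rows <- sum(duplicated(toy_stop_times[, .(trip_id, stop_sequence)]))
print(n_dup_rows)
```

An empty duplicated_pairs table means every pair is unique, which is what a validator expects.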
Prior discussions of this topic for OTP that may be useful: https://github.com/opentripplanner/OpenTripPlanner/issues/1347
Moreover, there may already be a useful tool here: https://atfutures.github.io/gtfs-router/reference/frequencies_to_stop_times.html
The function frequencies_to_stop_times{gtfsrouter} might be a useful reference here.
Thanks for the issue, @cseveren. In fact, the trip_id does not change during gtfs2gps::gtfs2gps processing. That happens because a frequency-based feed carries no prior information on individual trip_ids, unlike simple (non-frequency) GTFS feeds. We're fixing this by combining two columns (trip_id, trip_number) to create a unique trip_id. See the example of Mexico City:
> gps_data <- read_gtfs("gtfs.zip") %>%
+ filter_by_shape_id("14816") %>%
+ gtfs2gps()
> tmp_gps <- data.table::copy(gps_data)[!is.na(stop_id) & trip_id == "28992"]
> # number of 'trip_number'
> length(unique(tmp_gps$trip_number))
[1] 180
> # number of 'trip_id'
> length(unique(tmp_gps$trip_id))
[1] 1
> # adjustment
> tmp_gps[,trip_id := paste0(trip_id,"#",trip_number)]
> length(unique(tmp_gps$trip_id))
[1] 180
> # new stop_times
> tmp_gps[, arrival_time := departure_time]
> new_stop_times <- tmp_gps[, .(trip_id, departure_time, arrival_time, stop_id, stop_sequence)]
> head(new_stop_times)
   trip_id departure_time arrival_time stop_id stop_sequence
1: 28992#1       10:00:00     10:00:00   14090             1
2: 28992#1       10:02:13     10:02:13   14089             2
3: 28992#1       10:03:48     10:03:48   14086             3
4: 28992#1       10:05:18     10:05:18   14085             4
5: 28992#1       10:06:56     10:06:56   14093             5
6: 28992#1       10:08:26     10:08:26   14092             6
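When a trip_number column is not available (for instance, in a stop_times table that has already been exported), a repetition counter can be derived from the data itself. This is a minimal sketch, not the fix implemented in gtfs2gps: it assumes rows are in chronological order, so that the k-th occurrence of a (trip_id, stop_sequence) pair belongs to the k-th run of the trip. The toy tables, route_id and service_id values are made up. Because every trip_id in stop_times must also exist in trips, the sketch replicates the trips rows as well.

```r
library(data.table)

# Made-up stop_times with one trip_id run twice over 3 stops.
stop_times <- data.table(
  trip_id        = "28992",
  stop_sequence  = rep(1:3, times = 2),
  departure_secs = c(36000L, 36133L, 36228L, 36140L, 36273L, 36368L)
)
# Made-up trips table with the single original trip.
trips <- data.table(route_id = "R1", service_id = "S1", trip_id = "28992")

# k-th occurrence of each (trip_id, stop_sequence) pair -> k-th run of the trip
# (assumes chronological row order).
stop_times[, trip_number := rowid(trip_id, stop_sequence)]

# Keep the original id so that trips can be expanded consistently.
stop_times[, base_trip_id := trip_id]
stop_times[, trip_id := paste0(base_trip_id, "#", trip_number)]

# Map each original id to its new ids and replicate the trips rows.
id_map <- unique(stop_times[, .(trip_id = base_trip_id, new_trip_id = trip_id)])
new_trips <- merge(trips, id_map, by = "trip_id")
new_trips[, trip_id := new_trip_id][, new_trip_id := NULL]

print(stop_times[, .(trip_id, stop_sequence, departure_secs)])
print(new_trips)
```

If the rows are not sorted by time, the counter would mix stops from different runs, so sorting beforehand (or using the trip_number produced by gtfs2gps, as above) is the safer route.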
Hi @cseveren, we believe this issue has been solved with PR #214. We're closing this issue for now, but please don't hesitate to reopen it if you think the problem persists in your case.
I think that trip_ids should be unique within $trips, such that pairs of (trip_id, stop_sequence) are unique within $stop_times. However, when using gtfs2gps to convert a frequency-based GTFS feed to a non-frequency-based one (as in https://github.com/ipeaGIT/r5r/issues/181), gtfs2gps does expand trip_ids to match the appropriate number of frequency-delineated trips, and thus there are many repeated trips (repeated pairs of (trip_id, stop_sequence)) in $stop_times. This causes errors in feed validation using google/transitfeed, which reports "Timetravel detected!".
A loose guess on my end is that new trip_ids need to be created, one for each trip, before this is combined with the frequencies in gtfs2gps's conversion, but I'm not sure.
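The guess above can be illustrated with the frequencies.txt fields defined in the GTFS reference (start_time, end_time, headway_secs): each departure implied by a frequency window gets its own trip_id before any stop times are generated. The sketch below is not gtfs2gps's internal code. The toy values, the column names in seconds, the "#" suffix scheme, and the choice to treat end_time as exclusive are all assumptions.

```r
library(data.table)

# Made-up frequencies row; times are seconds after midnight.
frequencies <- data.table(
  trip_id      = "28992",
  start_secs   = 36000L,  # 10:00:00
  end_secs     = 36600L,  # 10:10:00
  headway_secs = 140L     # one departure every 2:20
)

# One row per departure implied by each window.
# Assumption: end_secs is exclusive, so no departure starts exactly at it.
expanded <- frequencies[
  , .(first_departure = seq(start_secs, end_secs - 1L, by = headway_secs)),
  by = .(trip_id, start_secs, end_secs, headway_secs)
]

# Number the departures per original trip_id (across all of its windows)
# and build a unique id for each one.
expanded[, trip_number := rowid(trip_id)]
expanded[, new_trip_id := paste0(trip_id, "#", trip_number)]

print(expanded[, .(new_trip_id, first_departure)])
```

Each new_trip_id could then be given its own copy of the template trip's stop times, shifted so that the first stop departs at first_departure.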