Open rafapereirabr opened 2 years ago
The gtfs2gps function creates new coordinates for the stop_sequences, and adds extra points in order to match the last coordinate of gtfs$shapes.
In the print below, you can see that the last coordinate of gtfs$shapes does not match the last coordinate of gtfs$stops. On the map, you can see that they are very close, but not the same.
However, even if tail(gps,1) == tail(gtfs$shapes,1), NAs will still be added, because gtfs2gps::gtfs2gps creates new points for the stops. In the reprex below, I changed the coordinates of the last stop_id to match the last row of gtfs$shapes. The results are similar, because the stop_sequences coordinates are no longer the same as those in gtfs$stop_id.
gtfs_file <- system.file("extdata/irl_dub/irl_dub_gtfs.zip", package = "gtfs2emis")
# read GTFS
gtfs <- gtfstools::read_gtfs(gtfs_file)
# keep weekday services only (drop services running on Saturday or Sunday)
gtfs <- gtfstools::filter_by_weekday(gtfs,
                                     weekday = c('saturday', 'sunday'),
                                     keep = FALSE)
# filter trip
id <- '6343.2.60-1-b12-1.1.O'
gtfs <- gtfstools::filter_by_trip_id(gtfs, trip_id = id)
# set the coordinates of the last row of stops equal to the last shape point (lat, lon)
gtfs$stops[.N, stop_lat := gtfs$shapes[.N, shape_pt_lat]]
gtfs$stops[.N, stop_lon := gtfs$shapes[.N, shape_pt_lon]]
# convert to gps
gps <- gtfs2gps::gtfs2gps(gtfs)
tail(gps)
# convert the GPS-like table to sf linestrings
gps_sf <- gtfs2gps::gps_as_sflinestring(gps)
I can think of a few strategies to solve this problem:
1) Do not replace the input stop_id coordinates: by doing this we will no longer have this last NA (if, and only if, tail(gps,1) == tail(gtfs$shapes,1)).
2) Use some sort of tolerance in gtfs2gps: for instance, if the snapped point is within a certain distance (say 5 meters) of the input coordinates, we will not change the input value.
However, I don't know exactly how difficult it would be to implement such solutions.
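To make the tolerance idea (strategy 2) more concrete, here is a rough sketch in base R. It is not how gtfs2gps works internally: haversine_m() and keep_input_within_tolerance() are hypothetical helpers, the coordinates are made up, and the 5 meter threshold is only the example value mentioned above.

```r
# Hypothetical helper (not part of gtfs2gps): great-circle distance in meters
# between two sets of lon/lat points, using the haversine formula.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: keep the input stop coordinates when the snapped point
# lies within tol_m meters of them; otherwise use the snapped coordinates.
keep_input_within_tolerance <- function(stop_lon, stop_lat,
                                        snap_lon, snap_lat, tol_m = 5) {
  d <- haversine_m(stop_lon, stop_lat, snap_lon, snap_lat)
  keep <- d <= tol_m
  data.frame(
    lon = ifelse(keep, stop_lon, snap_lon),
    lat = ifelse(keep, stop_lat, snap_lat),
    dist_m = d,
    kept_input = keep
  )
}

# Made-up coordinates: the first snapped point is about a meter away from the
# stop, the second one is tens of meters away.
res <- keep_input_within_tolerance(
  stop_lon = c(-6.26000, -6.26000),
  stop_lat = c(53.35000, 53.35000),
  snap_lon = c(-6.26001, -6.26100),
  snap_lat = c(53.35001, 53.35000)
)
print(res)
```

With something like this, the first stop would keep its input coordinates and only the second one would be moved to the snapped position. Where such a check would fit inside gtfs2gps is a separate question.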
Maybe we could just improve the message. Currently it says:
paste0(na_values, " 'speed' values are NA for shape_id '", shapeid, "'.")
We could also say whether such values are (i) at the beginning, (ii) at the end, or (iii) in other parts of the shape. We could also remove the message "Some 'speed' values are NA in the returned data.", as the previous messages are more informative.
Perhaps this message should only be printed if there are two or more NAs in the output of the shape.
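As a sketch of what a more informative message could look like, the base R snippet below counts the NA speeds at the beginning, in the middle and at the end of a shape, and builds the message from those counts. locate_na_speed() and na_speed_message() are hypothetical names and are not part of gtfs2gps; the speed vectors are made up.

```r
# Hypothetical helper (not part of gtfs2gps): count NA speeds in the leading
# run, in the trailing run, and anywhere in between.
locate_na_speed <- function(speed) {
  is_na <- is.na(speed)
  n <- length(speed)
  if (n == 0L || !any(is_na)) {
    return(c(beginning = 0L, middle = 0L, end = 0L))
  }
  if (all(is_na)) {
    # Assumption: when every value is NA, report them all as a leading run.
    return(c(beginning = n, middle = 0L, end = 0L))
  }
  valid <- which(!is_na)
  lead <- valid[1] - 1L
  trail <- n - max(valid)
  c(beginning = lead, middle = sum(is_na) - lead - trail, end = trail)
}

# Hypothetical helper: build a message that says where the NAs are.
# Returns NULL when there is nothing to report.
na_speed_message <- function(speed, shapeid) {
  pos <- locate_na_speed(speed)
  if (sum(pos) == 0L) return(NULL)
  labels <- c(beginning = "at the beginning",
              middle = "in the middle",
              end = "at the end")
  found <- pos > 0L
  where <- paste(pos[found], labels[names(pos)[found]], collapse = ", ")
  paste0(sum(pos), " 'speed' values are NA for shape_id '", shapeid,
         "' (", where, ").")
}

# Made-up speeds: a single NA in the last segment, then a mixed case.
msg_last <- na_speed_message(c(10, 12, NA), "60-1-b12-1.1.O")
msg_mixed <- na_speed_message(c(NA, 10, NA, 12, NA, NA), "60-1-b12-1.1.O")
message(msg_last)
message(msg_mixed)
```

The same counts could be used to skip the message when the only NA is the one in the last segment.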
I'm finding a strange behavior in gtfs2gps(). In the reprex below, I filter a single trip and convert it to a GPS-like table. The problem is that the gtfs2gps() function prints a message saying 'speed' values are NA for shape_id '60-1-b12-1.1.O'. This message seems to suggest that all speed values in this trip are NA, but they are not. This seems to be a problem in the code. The message should not be printed in this case, right?
The function is able to calculate the speed correctly, as seen in the output. There is only one NA in the last segment (as expected). So the function also prints the message "Some 'speed' values are NA in the returned data." As a rule, there will always be one NA in the last trip segment, right? So perhaps this message is unnecessary. What do you guys think?
reprex