ipeaGIT / r5r

https://ipeagit.github.io/r5r/

Setting departure times and dealing with time zones #188

Closed jamesdeweese closed 3 years ago

jamesdeweese commented 3 years ago

Wondering how time zones are dealt with when passing departure times to r5r. I'm running a travel-time matrix for Portland, Oregon, but I'm located in Montreal. My computer is set to Montreal time (3 hours ahead). I want to be certain that when I select a 7 a.m. (07:00) departure time when running travel_time_matrix() or detailed_itineraries(), that I'm getting 7 a.m. clock time in the area being studied--in this case, Portland.

Are POSIXct date-time objects being converted to clock time automatically based on the GTFS time zone? Is it based on the computer's system time? Or does the time zone need to be set manually when creating the departure_datetime object using, e.g., tz = "America/Los_Angeles", when the study area and the computer's settings are different?

I wasn't entirely sure from the documentation and two previous issues related to time zones that I saw on GitHub. When I tested by manually setting different time zones, I got precisely the same travel time matrix. I would have expected them to yield different results. I retested it with the built-in dataset and example from the Intro. Same unexpected result. I've pasted that reproducible code below.

options(java.parameters = "-Xmx10G")

library(r5r)
library(sf)
library(data.table)
library(ggplot2)
library(mapview)
mapviewOptions(platform = 'leafgl')

data_path <- system.file("extdata/poa", package = "r5r")

points <- fread(file.path(data_path, "poa_hexgrid.csv"))

set.seed(101)
points <- points[ c(sample(1:nrow(points), 100, replace=TRUE)), ]

r5r_core <- setup_r5(data_path = data_path, verbose = FALSE)

mode <- c("RAIL")
max_walk_dist <- 5000
max_trip_duration <- 120
departure_datetime_paris <- as.POSIXct("13-05-2019 14:00:00",
                                 format = "%d-%m-%Y %H:%M:%S", tz = "Europe/Paris")

ttm_paris_test <- travel_time_matrix(r5r_core = r5r_core,
                          origins = points,
                          destinations = points,
                          mode = mode,
                          departure_datetime = departure_datetime_paris,
                          max_walk_dist = max_walk_dist,
                          max_trip_duration = max_trip_duration,
                          verbose = FALSE)

departure_datetime_mtl <- as.POSIXct("13-05-2019 14:00:00", 
                                 format = "%d-%m-%Y %H:%M:%S", tz = "America/Montreal")

ttm_mtl_test <- travel_time_matrix(r5r_core = r5r_core,
                                   origins = points,
                                   destinations = points,
                                   mode = mode,
                                   departure_datetime = departure_datetime_mtl,
                                   max_walk_dist = max_walk_dist,
                                   max_trip_duration = max_trip_duration,
                                   verbose = FALSE)

identical(ttm_paris_test, ttm_mtl_test)
identical(departure_datetime_mtl, departure_datetime_paris)

Both travel-time matrices are identical, but the departure_datetimes are not.
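
To see why the second identical() call returns FALSE, here is a minimal base-R sketch (no r5r involved). It assumes the tz database name "America/Toronto" as a stand-in for Montreal's zone, since both follow the same rules, and shows that the two objects print the same clock time but store different instants.

```r
# Same wall-clock time in two zones. The date mirrors the example above;
# "America/Toronto" is an assumed stand-in for Montreal's zone.
paris <- as.POSIXct("2019-05-13 14:00:00", tz = "Europe/Paris")
mtl   <- as.POSIXct("2019-05-13 14:00:00", tz = "America/Toronto")

# POSIXct stores seconds since the Unix epoch (UTC), so the two objects
# hold different numbers even though they show the same clock time.
gap_hours <- as.numeric(difftime(mtl, paris, units = "hours"))
print(gap_hours)
#> [1] 6

# Different instants (and different tzone attributes), so not identical.
same_object <- identical(paris, mtl)
print(same_object)
#> [1] FALSE
```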

rafapereirabr commented 3 years ago

Hi James. Thanks for opening this issue. It seems that R5 uses the timezone of the first agency in the GTFS and overrides the timezone set by the user (code), but perhaps @mvpsaraiva could confirm this.

In any case, this needs to be clarified in the r5r documentation.

mvpsaraiva commented 3 years ago

Hi James. We have dealt with this issue in the past, and you can see an example here. @rafapereirabr is right about R5's approach, but there was a bug in the very early versions of r5r that caused time zones to be mixed up. Our fix was to always use the time zone of the study area, regardless of the time settings of the user's computer, which is in line with R5. In your case, 8 am will always mean 8 am in Portland.

If I remember well, @dhersz is the one who fixed this bug at the time.

dhersz commented 3 years ago

That's exactly it: r5r strips the date and time fields from a datetime and uses them as if they were set to the study area's timezone. posix_to_string() is responsible for this under the hood:

r5r:::posix_to_string(as.POSIXct("13-05-2019 14:00:00", format = "%d-%m-%Y %H:%M:%S"))
#> $date
#> [1] "2019-05-13"
#> 
#> $time
#> [1] "14:00:00"

r5r:::posix_to_string(as.POSIXct("13-05-2019 14:00:00", format = "%d-%m-%Y %H:%M:%S", tz = "America/Montreal"))
#> $date
#> [1] "2019-05-13"
#> 
#> $time
#> [1] "14:00:00"

Which means that in @jamesdeweese's case it's using Portland's timezone, not Montreal's.
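
As a rough illustration of that behaviour, the sketch below defines a hypothetical helper, clock_fields(), that reads the date and time as displayed in the object's own timezone. It is not r5r's actual implementation of posix_to_string(), only an assumption about what "stripping the fields" amounts to, and it uses "America/Toronto" as a stand-in for Montreal's zone.

```r
# Hypothetical helper, NOT r5r's implementation: it returns the date and
# time as displayed in the datetime's own timezone, dropping the zone itself.
clock_fields <- function(datetime) {
  stopifnot(inherits(datetime, "POSIXct"))
  list(
    date = format(datetime, "%Y-%m-%d"),
    time = format(datetime, "%H:%M:%S")
  )
}

paris <- as.POSIXct("13-05-2019 14:00:00",
                    format = "%d-%m-%Y %H:%M:%S", tz = "Europe/Paris")
mtl   <- as.POSIXct("13-05-2019 14:00:00",
                    format = "%d-%m-%Y %H:%M:%S", tz = "America/Toronto")

fields_paris <- clock_fields(paris)
fields_mtl   <- clock_fields(mtl)

# Once the zone is dropped, both inputs describe the same date and time,
# which is consistent with the two travel time matrices being identical.
print(identical(fields_paris, fields_mtl))
#> [1] TRUE
```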

Perhaps the easiest way of converting a time given in your own timezone to the study area's timezone is using lubridate::with_tz():

portland_dt <- lubridate::with_tz(
  as.POSIXct("13-05-2019 14:00:00", format = "%d-%m-%Y %H:%M:%S", tz = "America/Montreal"),
  tzone = "America/Los_Angeles"
)
r5r:::posix_to_string(portland_dt)
#> $date
#> [1] "2019-05-13"
#> 
#> $time
#> [1] "11:00:00"

I'll adjust the documentation to reflect this.
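
If lubridate is not available, the same conversion can be sketched in base R by changing the "tzone" attribute, which keeps the instant and only changes the clock time used for display. This is a minimal sketch under the same assumptions as the example above, with "America/Toronto" standing in for Montreal's zone.

```r
# 14:00 on the analyst's clock ("America/Toronto" is an assumed stand-in
# for Montreal's zone).
local_dt <- as.POSIXct("13-05-2019 14:00:00",
                       format = "%d-%m-%Y %H:%M:%S", tz = "America/Toronto")

# Changing the tzone attribute keeps the same instant but changes the
# clock time used when the object is printed or formatted.
portland_dt <- local_dt
attr(portland_dt, "tzone") <- "America/Los_Angeles"

portland_clock <- format(portland_dt, "%Y-%m-%d %H:%M:%S")
print(portland_clock)
#> [1] "2019-05-13 11:00:00"

# The underlying number of seconds since the epoch is unchanged.
same_instant <- as.numeric(portland_dt) == as.numeric(local_dt)
print(same_instant)
#> [1] TRUE
```

Passing portland_dt as departure_datetime would then mean 11:00 in the study area, i.e. the Portland clock time that corresponds to 14:00 in Montreal.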

dhersz commented 3 years ago

Done in https://github.com/ipeaGIT/r5r/commit/1ace26053cb6ac32b2bb4bae78e637174c41f361.

The documentation of travel_time_matrix(), detailed_itineraries() and accessibility() now reads:

...
#' @param departure_datetime POSIXct object. If working with public transport
#'                           networks, please check \code{calendar.txt} within
#'                           the GTFS file for valid dates. See details for
#'                           further information on how datetimes are parsed.
...
...
#' # Datetime parsing
#'
#' `r5r` ignores the timezone attribute of datetime objects when parsing dates
#' and times, using the study area's timezone instead. For example, let's say
#' you are running some calculations using Rio de Janeiro, Brazil, as your study
#' area. The datetime `as.POSIXct("13-05-2019 14:00:00",
#' format = "%d-%m-%Y %H:%M:%S")` will be parsed as May 13th, 2019, 14:00h in
#' Rio's local time, as expected. But `as.POSIXct("13-05-2019 14:00:00",
#' format = "%d-%m-%Y %H:%M:%S", tz = "Europe/Paris")` will also be parsed as
#' the exact same date and time in Rio's local time, perhaps surprisingly,
#' ignoring the timezone attribute.
...

jamesdeweese commented 3 years ago

Thanks @dhersz @mvpsaraiva @rafapereirabr! Really appreciate it. This answers my question exactly. And I think the explanation on datetime parsing in the docs is clear. Thanks again.

dhersz commented 3 years ago

Thank you for filing the issue and for the feedback!