munterfi / hereR

R package that provides an interface to the HERE REST APIs: Geocoder API, Routing API, Traffic API, Public Transit API and Destination Weather API. Locations and routes are returned as 'sf' objects.
https://munterfi.github.io/hereR/
GNU General Public License v3.0
89 stars 11 forks source link

Timezone conversion is useless...? Here converts the time from REST API to local time #85

Closed e-kotov closed 3 years ago

e-kotov commented 3 years ago

Hi, I have done some testing and some reading and it seems to me that the current approach with .encode_datetime is doing more harm than good. According to the Here API documentation (perhaps this is a new behaviour... and previously it worked in UTC???) :

departureTime specifies the time of departure as defined by either date-time or full-date T partial-time in RFC 3339, section 5.6 (for example, 2019-06-24T01:23:45). The requested time is converted to the local time at origin. When the optional timezone offset is not specified, time is assumed to be local. If neither departureTime or arrivalTime are specified, current time at departure location will be used. All Time values in the response are returned in the timezone of each location.

So when the iso and routing functions convert the time to UTC for sending to the Here REST API, they are simply passing something like 2020-09-17T15:00:00. Here sees no Z and no optional timezone offset. It then thinks that this is local time. So it returns the iso or the route for the incorrect time.

To check that, try your own API key and run:

library(sf)
library(hereR)
# hereR::set_key("YOURKEY")

point <- st_sf(st_sfc(st_point(c(37.52523, 55.64005)), crs = 4326))

area_18 <- isoline(poi = point,
        datetime = as.POSIXct("2020-09-17 18:00:00", tz = "Europe/Moscow"),
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = T,
        aggregate = F) %>% st_area()

area_21 <- isoline(poi = point,
        datetime = as.POSIXct("2020-09-17 21:00:00", tz = "Europe/Moscow"),
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = T,
        aggregate = F) %>% st_area()

area_no_traffic <- isoline(poi = point,
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = F,
        aggregate = F) %>% st_area()

area_18/area_21
area_21/area_no_traffic
area_18/area_no_traffic

You will get:

> area_18/area_21
1.258702 [1]
> area_21/area_no_traffic
0.3900247 [1]
> area_18/area_no_traffic
0.4909249 [1]

This basically means that an isochrone for 18:00 in Moscow during the heaviest traffic time is 25% larger than at 21:00, which is highly unlikely. I think that supports my argument converting the time to UTC is undesirable and currently leads to incorrect results for all functions (isochrones, routing...).

If you agree with me, I can find some time in the next week or so to try and update the code as necessary and submit a pull request. The two possible solutions, I guess, are:

munterfi commented 3 years ago

Hi, thanks for reporting! I am happy about contributions and also testing if you have time.

Currently the HERE Routing API v7 is used in the package, the link to the documentation is for v8.12.0. I am planning to update to the HERE Routing API v8 in the next weeks. The API documentation for version 7 is here, but the handling of timestamps seems to be similar:

Time when travel is expected to start. Traffic speed and incidents are taken into account when calculating the route, where applicable. You can use now to specify the current time. It can be used only if parameter start is also used. Type: xs:dateTime. departure=2013-07-04T17:00:00+02 When the optional timezone offset is not specified time is assumed to be local.

If I understand your issue correctly, then the returned datetime is in the correct timezone, but the traffic information is shifted in time. Which means a timestamp with a specified optional timezone offset should be passed in the requests to the API to avoid conversion to local time by the API. A rewrite of the .encode_datetime to add the timezone offset:

.encode_datetime <- function(datetime) {
  dt <- format(datetime, "%Y-%m-%dT%H:%M:%S%z")
  stringr::str_replace(
    paste0(
      stringr::str_sub(dt, 1, -3), ":",
      stringr::str_sub(dt, -2, nchar(dt))
    ),
    "\\+", "%2B"
  )
}

# Plus (encoded)
(t = as.POSIXct("2020-09-17 18:00:00", tz = "Europe/Moscow"))
#> [1] "2020-09-17 18:00:00 MSK"
.encode_datetime(t)
#> [1] "2020-09-17T18:00:00%2B03:00"

# Minus
(t = as.POSIXct("2020-09-17 18:00:00", tz = "America/Bogota"))
#> [1] "2020-09-17 18:00:00 -05"
.encode_datetime(t)
#> [1] "2020-09-17T18:00:00-05:00"

The example gives now more reliable results:

library(hereR)
set_verbose(TRUE)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.1, PROJ 6.3.1

point <- st_sf(st_sfc(st_point(c(37.52523, 55.64005)), crs = 4326))

area_18 <- isoline(poi = point,
        datetime = as.POSIXct("2020-09-17 18:00:00", tz = "Europe/Moscow"),
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = T,
        aggregate = F) %>% st_area()
#> Sending 1 request(s) to: 'https://isoline.route.ls.hereapi.com/routing/7.2/calculateisoline.json?...'
#> Received 1 response(s) with total size: 18.3 Kb

area_21 <- isoline(poi = point,
        datetime = as.POSIXct("2020-09-17 21:00:00", tz = "Europe/Moscow"),
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = T,
        aggregate = F) %>% st_area()
#> Sending 1 request(s) to: 'https://isoline.route.ls.hereapi.com/routing/7.2/calculateisoline.json?...'
#> Received 1 response(s) with total size: 27.6 Kb

area_no_traffic <- isoline(poi = point,
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = F,
        aggregate = F) %>% st_area()
#> Sending 1 request(s) to: 'https://isoline.route.ls.hereapi.com/routing/7.2/calculateisoline.json?...'
#> Received 1 response(s) with total size: 15.6 Kb

area_18/area_21
#> 0.592572 [1]
area_21/area_no_traffic
#> 0.6583742 [1]
area_18/area_no_traffic
#> 0.3901341 [1]

The timezone of datetime values returned by route(), route_matrix(), … should be parsed back to the timezone of the original input datetimes in the request:

datetime_MSK <- as.POSIXct("2020-09-17 18:00:00", tz = "Europe/Moscow")
datetime_CEST <- as.POSIXct("2020-09-17 18:00:00", tz = "Europe/Zurich")

area_18_MSK <- isoline(poi = point,
        datetime = datetime_MSK,
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = T,
        aggregate = F)
#> Sending 1 request(s) to: 'https://isoline.route.ls.hereapi.com/routing/7.2/calculateisoline.json?...'
#> Received 1 response(s) with total size: 18.3 Kb

area_18_CEST <- isoline(poi = point,
        datetime = datetime_CEST,
        arrival = F,
        range = 1800,
        range_type = "time",
        type = "fastest",
        mode = "car",
        traffic = T,
        aggregate = F)
#> Sending 1 request(s) to: 'https://isoline.route.ls.hereapi.com/routing/7.2/calculateisoline.json?...'
#> Received 1 response(s) with total size: 18.1 Kb

data.frame(
  request = c(1,2),
  requested_time = c(
    datetime_MSK %>% format("%Y-%m-%dT%H:%M:%S%z"),
    datetime_CEST %>% format("%Y-%m-%dT%H:%M:%S%z")),
  parsed_response = c(
    area_18_MSK$departure %>% format("%Y-%m-%dT%H:%M:%S%z"),
    area_18_CEST$departure %>% format("%Y-%m-%dT%H:%M:%S%z")),
  area = c(area_18_MSK %>% st_area(), area_18_CEST %>% st_area())
)
#>   request           requested_time          parsed_response            area
#> 1       1 2020-09-17T18:00:00+0300 2020-09-17T18:00:00+0300 377560536 [m^2]
#> 2       2 2020-09-17T18:00:00+0200 2020-09-17T18:00:00+0200 455100512 [m^2]

Can you test again with the development version?

Session info ``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.2 (2020-06-22) #> os macOS Catalina 10.15.7 #> system x86_64, darwin17.0 #> ui X11 #> language (EN) #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz Europe/Zurich #> date 2020-11-23 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0) #> backports 1.2.0 2020-11-02 [1] CRAN (R 4.0.2) #> callr 3.5.1 2020-10-13 [1] CRAN (R 4.0.2) #> class 7.3-17 2020-04-26 [1] CRAN (R 4.0.2) #> classInt 0.4-3 2020-04-07 [1] CRAN (R 4.0.0) #> cli 2.1.0 2020-10-12 [1] CRAN (R 4.0.2) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0) #> curl 4.3 2019-12-02 [1] CRAN (R 4.0.1) #> data.table 1.13.2 2020-10-19 [1] CRAN (R 4.0.2) #> DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.0) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0) #> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2) #> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.2) #> dplyr 1.0.2 2020-08-18 [1] CRAN (R 4.0.2) #> e1071 1.7-4 2020-10-14 [1] CRAN (R 4.0.2) #> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) #> generics 0.1.0 2020-10-31 [1] CRAN (R 4.0.2) #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) #> hereR * 0.5.1.9000 2020-11-23 [1] local #> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2) #> jsonlite 1.7.1 2020-09-07 [1] CRAN (R 4.0.2) #> KernSmooth 2.23-18 2020-10-29 [1] CRAN (R 4.0.2) #> knitr 1.30 2020-09-22 [1] CRAN (R 4.0.2) #> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0) #> lwgeom 0.2-5 2020-06-12 [1] CRAN (R 4.0.2) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0) #> pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2) #> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0) #> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0) #> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2) #> ps 1.4.0 2020-10-07 [1] CRAN (R 4.0.2) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0) #> R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.2) #> Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.2) #> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2) #> rlang 0.4.8 2020-10-08 [1] CRAN (R 4.0.2) #> rmarkdown 2.5 2020-10-21 [1] CRAN (R 4.0.2) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0) #> sf * 0.9-6 2020-09-13 [1] CRAN (R 4.0.2) #> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2) #> testthat 3.0.0 2020-10-31 [1] CRAN (R 4.0.2) #> tibble 3.0.4 2020-10-12 [1] CRAN (R 4.0.2) #> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0) #> units 0.6-7 2020-06-13 [1] CRAN (R 4.0.2) #> usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2) #> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2) #> withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2) #> xfun 0.19 2020-10-30 [1] CRAN (R 4.0.2) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library ```
e-kotov commented 3 years ago

@munterfinger hello again! Thank you for such a quick response! I have tested the develop branch - it seems to be working as expected now.

munterfi commented 3 years ago

Perfect - thanks for the detailed description and investigation of the timezone issue.

A new release of the package with this important bugfix will be submitted to CRAN by the end of this week.