Snapped distance not being calculated as crow flies "walking" distance (r5r)

Mponkane commented 8 months ago

Brief description of the problem:

Here is an issue I found while using r5r, perhaps an upstream R5 issue? Originally posted: https://github.com/ipeaGIT/r5r/issues/373#issue-2156986527

When origin is snapped to network, it does not account for the crow flies distance from the original location. See picture below.

issue

The travel time matrices result into 0 even if I increase the distance of the origin point up to 1.6 km from the network. Both origin and destination are in WGS 84

here is the ttm result: from_id to_id travel_time_p50 1: origin destination 0

here is find_snap for origin: point_id, lat, lon, snap_lat, snap_lon, distance, found 1: origin, 65.05129, 25.42877, 65.04899, 25.42855, 255.8379, TRUE

here is find_snap for destination: point_id, lat, lon, snap_lat, snap_lon, distance, found 1: destination, 65.04884, 25.42872, 65.04883, 25.42872, 1.176363, TRUE

Reproducible example here

options(java.parameters = "-Xmx20G")
library(r5r)
library(sf)
library(dplyr)
library(ggplot2)

r5r_core <- setup_r5("path")
origin <- st_read("test_origin.gpkg")
destination <- st_read("test_dest.gpkg")

det <- detailed_itineraries(
  r5r_core,
  origins = origin,
  destinations = destination,
  mode = "BICYCLE",
  max_trip_duration = 30,
  max_lts = 2,
  bike_speed = 15,
  walk_speed = 2,
  shortest_path = TRUE,
  all_to_all = TRUE
)

ttm <- travel_time_matrix(
  r5r_core,
  origins = origin,
  destinations = destination,
  mode = "BICYCLE",
  max_trip_duration = 30,
  max_lts = 2,
  bike_speed = 15,
  walk_speed = 2
)

snap_df <- find_snap(r5r_core, origin, mode = "BICYCLE")

Situation report

$r5r_package_version [1] ‘1.1.0’

$r5_jar_version [1] "6.9"

$java_version [1] "11.0.18"

$set_memory [1] "-Xmx20G"

$session_info R version 4.2.2 (2022-10-31 ucrt) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 22631)

Matrix products: default

locale: [1] LC_COLLATE=English_Finland.utf8 LC_CTYPE=English_Finland.utf8 LC_MONETARY=English_Finland.utf8 LC_NUMERIC=C
[5] LC_TIME=English_Finland.utf8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] readr_2.1.4 ggplot2_3.4.4 r5r_1.1.0 dplyr_1.1.3 sf_1.0-14

loaded via a namespace (and not attached): [1] Rcpp_1.0.11 pillar_1.9.0 compiler_4.2.2 class_7.3-20 tools_4.2.2 bit_4.0.5 lifecycle_1.0.4
[8] tibble_3.2.1 gtable_0.3.4 checkmate_2.3.1 pkgconfig_2.0.3 rlang_1.1.1 DBI_1.1.3 cli_3.6.0
[15] rstudioapi_0.15.0 parallel_4.2.2 rJava_1.0-10 e1071_1.7-14 withr_2.5.2 sfheaders_0.4.3 generics_0.1.3
[22] vctrs_0.6.4 hms_1.1.3 bit64_4.0.5 classInt_0.4-10 grid_4.2.2 tidyselect_1.2.0 glue_1.6.2
[29] data.table_1.14.6 R6_2.5.1 fansi_1.0.4 vroom_1.6.5 tzdb_0.4.0 magrittr_2.0.3 backports_1.4.1
[36] scales_1.3.0 units_0.8-5 colorspace_2.1-0 utf8_1.2.2 KernSmooth_2.23-20 proxy_0.4-27 munsell_0.5.0
[43] crayon_1.5.2

rafapereirabr commented 8 months ago

Hi all. This code snippet in R5 seems to indicate that R5 uses a fixed (hard coded) "off-road" speed (from the specified point to the snapped location) of 1.3 m/s. Is this correct?

abyrd commented 6 months ago

Hi. In Conveyal's use of R5, we call TravelTimeComputer#computeTravelTimes, which then calls StreetRouter#setOrigin for each requested pre-transit mode of travel. On or around StreetRouter.java line 340, we compute int offStreetTime = split.distanceToEdge_mm / OFF_STREET_SPEED_MILLIMETERS_PER_SECOND using the constant that @rafapereirabr linked to. A few lines later around L344 and L350, the initial state is set when we begin routing to include any travel time along the street or perpendicular to the street. See: https://github.com/conveyal/r5/blob/v7.2/src/main/java/com/conveyal/r5/streets/StreetRouter.java#L344

That's the access-to-network end of the process. For egress from the network, the corresponding logic is in LinkedPointSet. See: https://github.com/conveyal/r5/blob/v7.2/src/main/java/com/conveyal/r5/streets/LinkedPointSet.java#L616

In both cases, offStreetTime is added into the travel time result.

R5 is primarily intended for use in analyzing the represented transportation network (with a focus on public transit networks), so the overall routing process assumes separate steps for "reaching the network" and "exiting the network". Over extremely short distances this could lead to overestimating travel times because the traveler must at least go to a street and then from that street to any destination. But I would not expect this to underestimate travel times, only overestimate them when the destination is closer to the origin than to a street.

It is possible that R5R is not using these same methods to perform routing, and one of these steps is being bypassed. I'll take a quick look at the R5R source and see if I can spot the corresponding logic.

ansoncfit commented 6 months ago

It appears the origin and destination in this example are linked to the same edge. Travel times were not handled correctly for such cases until #586.

There is a comment suggesting the origin-to-split-point distance is not included in these cases: https://github.com/conveyal/r5/blob/a96f49a254fc827202ebb18d24cea684d497e453/src/main/java/com/conveyal/r5/streets/LinkedPointSet.java#L450-L454

but I would not expect 0 travel time (unless there is a rounding issue).

abyrd commented 6 months ago

Thanks for that reference @ansoncfit. I hadn't noticed that the r5_jar_version in the report was 6.9, and looked only at current code. But that R5 JAR version 6.9 is from only about a year ago in early 2023, the PR you cited #586 is from three years earlier in 2020, and the code comment you referenced about missing off-street time is still in the current source code.

So apparently when the update was made to handle the special case of travel times entirely along one edge (relatively recently in the long history of R5), something blocked or delayed addressing the detail of the initial off-street time.

Just to share what I've been looking at so far in case it helps anyone else explore how R5R is handling this: R5R appears to use org.ipea.r5r.Process.TravelTimeMatrixComputer#runProcess and org.ipea.r5r.Process.FastDetailedItineraryPlanner#runProcess which call their own buildRequest methods, which in turn call buildRequest on their shared superclass org.ipea.r5r.Process.R5Process. This appears to work very similarly to how we use R5, with a FreeFormPointSet, RegionalTask, and TravelTimeComputer producing a OneOriginResult. However, the travel time computer used is an R5R-specific class called R5TravelTimeComputer which extends our TravelTimeComputer, but includes a large amount of duplicated code in org.ipea.r5r.R5.R5TravelTimeComputer#computeTravelTimes. Due to the duplication of so much essential logic, it seems like there's strong potential here for the process to diverge from that of R5 itself. Looking at the history of this file via https://github.com/ipeaGIT/r5r/blob/master/java-r5rcore/src/org/ipea/r5r/R5/R5TravelTimeComputer.java it looks like the whole class may have been replicated just to change one particular behavior. The commit message is: "Fixing bug in TravelTimeComputer where R5 always considered max_walk_dist = 2000 when using a Fare Calculator." @rafapereirabr @mvpsaraiva is there a chance we could solve this problem without subclassing TravelTimeComputer? It could probably save us all some maintenance time in the future.

abyrd commented 6 months ago

We looked into this more yesterday. In the code handling the case where origin and destination are on the same road segment, there was an outstanding concern about consistency in how the initial walk term was calculated. For this reason it remained commented out pending review of some other code and a final decision. The new code would use the walk speed specified in the request, while the existing, more common case where the origin and destination are not on the same road segment uses a constant speed (as pointed out in a comment above). This constant speed has remained in use for a long time for reasons of backward compatibility and consistency of results. We wanted to switch to applying user-specified speed in all cases, but before doing so needed to verify that the user-specified speed was readily available and would not be baked into any pre-computed values, and also wait for a major release to introduce the change. But rather than temporarily using the constant walk speed, it looks like the term was just left out. In the next release we will aim to consistently use either the constant or user-specified walk speed for all initial access-to-road segments.

rafapereirabr commented 6 months ago

Hi @ansoncfit and @abyrd , thanks so much for the careful investigation. This is much appreciated. I'll open an issue in {r5r} related to Andrew's question above regarding the code redundancy in {r5r} TravelTimeComputer.

conveyal / r5