ipeaGIT / r5r

https://ipeagit.github.io/r5r/

Same travel time output for two different modes #200

Closed Onnene closed 2 years ago

Onnene commented 3 years ago

Hi, using the code snippet below I get the same travel time results for TRANSIT and WALK. Are you able to point me in the right direction as to why this occurs? I see similar behaviour when the modes are first defined as a vector and then passed to the mode argument of travel_time_matrix(). I am happy to provide additional information if you need it.

...

# calculate a travel time matrix

ttmw <- travel_time_matrix(r5r_core = r5r_core,
                          origins = points,
                          destinations = points,
                          mode = "WALK",
                          departure_datetime = departure_datetime,
                          max_walk_dist = md,
                          max_trip_duration = mtd,
                          verbose = FALSE)

ttmt <- travel_time_matrix(r5r_core = r5r_core,
                          origins = points,
                          destinations = points,
                          mode = "TRANSIT",
                          departure_datetime = departure_datetime,
                          max_walk_dist = md,
                          max_trip_duration = mtd,
                          verbose = FALSE)

...
mvpsaraiva commented 3 years ago

Hi @Onnene. When you input mode = "TRANSIT", r5r actually considers mode = c("WALK", "TRANSIT") because walking between origin/destination points and transit stops needs to be accounted for. Because of that, it's possible that a query of travel times by transit includes some direct walking trips if that's faster than waiting for a transit service.

Having said that, what I think is going on is that you're getting walking only trips from both travel_time_matrix() calls, which can happen if the value passed to departure_datetime is outside the date range of the input GTFS data. So, just check if your GTFS covers the period you're querying for and let us know if that's the problem. I hope this helps.
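
As a rough way to run that check, the sketch below compares a departure date against the service period listed in a feed's calendar.txt. It is only an illustration: gtfs_covers_date() is a hypothetical helper, not part of r5r, the calendar table is made-up data, and feeds that define service only through calendar_dates.txt would need a different check.

```r
# Hypothetical helper (not an r5r function): TRUE if the departure date falls
# inside at least one service period from a GTFS calendar.txt table.
gtfs_covers_date <- function(calendar, departure) {
  # GTFS stores start_date and end_date as YYYYMMDD
  start <- as.Date(as.character(calendar$start_date), format = "%Y%m%d")
  end   <- as.Date(as.character(calendar$end_date), format = "%Y%m%d")
  # format() respects the time zone attached to the POSIXct object
  day   <- as.Date(format(departure, "%Y-%m-%d"))
  any(start <= day & day <= end)
}

# Made-up calendar; a real one could be read with something like
# read.csv(unz("<your gtfs>.zip", "calendar.txt"))
calendar <- data.frame(service_id = c("weekday", "weekend"),
                       start_date = c(20190101, 20190101),
                       end_date   = c(20191231, 20191231))

in_range  <- as.POSIXct("13-05-2019 14:00:00", format = "%d-%m-%Y %H:%M:%S", tz = "UTC")
out_range <- as.POSIXct("17-12-2020 09:30:00", format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

gtfs_covers_date(calendar, in_range)   # TRUE
gtfs_covers_date(calendar, out_range)  # FALSE
```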

Onnene commented 3 years ago

Hello, @mvpsaraiva, thanks for your very quick response. Indeed my departure_datetime was outside the date range of the GTFS file. However, the results are still the same after I correct the date.

mvpsaraiva commented 3 years ago

That was my best guess, but I can't think of anything else without a reproducible example. Can you provide one?

Onnene commented 2 years ago

@mvpsaraiva, sorry for the late reply, I was away. Please see the example below

options(java.parameters = "-Xmx2G")

library(r5r)
library(sf)
#above line not required for now

library(data.table)

data_path <- system.file("...", package = "r5r")

#debugging line to check it's running correctly
list.files(data_path)

points <- fread(file.path(data_path, "Zonal_Centroid.csv"))

#debugging line to check it's running correctly
head(points)

# Indicate the path where OSM and GTFS data are stored
r5r_core <- setup_r5(data_path = data_path, verbose = FALSE)

# set inputs "23-11-2020 14:00:00"  "15-04-2019 14:00:00"   "Etc/GMT-2"   "Africa/Kigali"   "America/Sao_Paulo"
# departure_datetime <- as.POSIXct("01-01-2014 07:30:00",format = "%d-%m-%Y %H:%M:%S",tz="Africa/Nairobi")
departure_datetime <- as.POSIXct("17-12-2020 09:30:00",format = "%d-%m-%Y %H:%M:%S", tz="Africa/Kigali")

# sets the maximum walking distance in meters (to allow a cap to the walking distance)
md <- 3000

# sets the maximum trip duration in minutes (to allow a cap to the trip duration)
mtd <- 60

# mode <- c("WALK", "TRANSIT", "BICYCLE", "CAR", "MOTORBIKE")

# calculate a travel time matrix
ttmw <- travel_time_matrix(r5r_core = r5r_core,
                          origins = points,
                          destinations = points,
                          mode = "WALK",
                          departure_datetime = departure_datetime,
                          max_walk_dist = md,
                          max_trip_duration = mtd,
                          verbose = FALSE)

#debugging line which shows a few lines of the travel time output per mode
head(ttmw)

# calculate a travel time matrix
ttmt <- travel_time_matrix(r5r_core = r5r_core,
                          origins = points,
                          destinations = points,
                          mode = "TRANSIT",
                          departure_datetime = departure_datetime,
                          max_trip_duration = mtd,
                          verbose = FALSE)

head(ttmt)

# calculate a travel time matrix
ttmb <- travel_time_matrix(r5r_core = r5r_core,
                           origins = points,
                           destinations = points,
                           mode = "BICYCLE",
                           departure_datetime = departure_datetime,
                           max_trip_duration = mtd,
                           verbose = FALSE)

head(ttmb)

# calculate a travel time matrix
ttmc <- travel_time_matrix(r5r_core = r5r_core,
                           origins = points,
                           destinations = points,
                           mode = "CAR",
                           departure_datetime = departure_datetime,
                           max_trip_duration = mtd,
                           verbose = FALSE)

head(ttmc)

# calculate a travel time matrix
ttbs <- travel_time_matrix(r5r_core = r5r_core,
                           origins = points,
                           destinations = points,
                           mode = "BUS",
                           departure_datetime = departure_datetime,
                           max_trip_duration = mtd,
                           verbose = FALSE)

head(ttbs)

setwd(".../R_Data/output")
write.csv(ttmw,'TT_Walk.csv')
write.csv(ttmb,'TT_Bicycle.csv')
write.csv(ttmc,'TT_Car.csv')
write.csv(ttmt,'TT_Transit.csv')
write.csv(ttbs,'TT_Bus.csv')

#closes r5r and stops using memory
stop_r5(r5r_core)

#run garbage collection in R and in the JVM to free memory
rJava::.jgc(R.gc = TRUE)
rafapereirabr commented 2 years ago

Dear @Onnene , thank you for providing us with the GTFS data on issue #207 . Nonetheless, we need a fully reproducible example to look into this issue. Could you share with us the osm.pbf data and the Zonal_Centroid.csv file?

Onnene commented 2 years ago

@rafapereirabr thanks for replying. Yes, I am willing to share, however, my attempts to upload the files (or the zipped version of it) here fail with an error "is not included in the list". Do you mind if I sent the files to your email?

rafapereirabr commented 2 years ago

Hi @Onnene Obiora. I think I've found the source of the problem here.

The public transport network is quite small, while your origin/destination points are quite far from one another. This means that most of each trip will be done on foot. And because you capped the maximum trip duration at only 60 minutes, r5r only returns estimates for the 76 origin-destination pairs that are really close together, and for which it would be faster to walk. Hence the travel time estimates by WALK and TRANSIT are the same for these 76 pairs.

[Image: Rplot]

Now, if you simply increase the maximum trip duration, the result becomes different. In one example, I used a maximum of 360 minutes. This is the result I get:

# sets the maximum trip duration in minutes (to allow a cap to the trip duration)
mtd <- 360

[...]

nrow(ttmt)
> [1] 993

nrow(ttmw)
> [1] 993

# check difference
tt_all <- dplyr::left_join(ttmt, ttmw, by = c("fromId", "toId"))
plot(tt_all$travel_time.x, tt_all$travel_time.y)

[Image: Rplot01_diff]

In summary, the points are very far from each other, so you need to consider longer maximum trip times. Moreover, the public transport network is fairly small, so it's important to keep in mind that it will only make a difference for a few origin-destination pairs.
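
The effect of the cap can be reproduced with a toy model. This is a sketch under stated assumptions, not how r5r computes anything: the four OD pairs and their times are invented, and query() is a hypothetical stand-in that returns the faster of walking and riding transit for a TRANSIT query and drops pairs above the cap.

```r
# Toy door-to-door times in minutes for four made-up OD pairs
od <- data.frame(fromId  = c("a", "a", "b", "b"),
                 toId    = c("b", "c", "c", "d"),
                 walk    = c(20, 45, 150, 300),
                 transit = c(35, 50, 90, 120))  # riding, incl. access and waiting

# Hypothetical stand-in for a travel time query: TRANSIT may fall back to
# walking when that is faster; pairs slower than the cap are not returned
query <- function(od, mode, cap) {
  tt  <- if (mode == "WALK") od$walk else pmin(od$walk, od$transit)
  out <- data.frame(fromId = od$fromId, toId = od$toId, travel_time = tt)
  out[out$travel_time <= cap, ]
}

# Count the pairs whose WALK and TRANSIT results differ for a given cap
n_different <- function(cap) {
  both <- merge(query(od, "TRANSIT", cap), query(od, "WALK", cap),
                by = c("fromId", "toId"))
  sum(both$travel_time.x != both$travel_time.y)
}

n_different(60)   # 0: only the short pairs survive, and walking wins on those
n_different(360)  # 2: the longer pairs appear, and transit is faster there
```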

I hope this is helpful.

Onnene commented 2 years ago

Hello @rafapereirabr, many thanks for this, it makes a lot of sense.

rafapereirabr commented 2 years ago

I'm glad we were able to help. I'm closing this issue for now, but please let us know if you find any other problems.

franciscopasqual commented 2 years ago

Hi, I'm running into the same problem for walk and transit results...

The analysis is for Porto Alegre and I'm using the default .pbf and GTFS files provided in the package. The files with the OD pairs are made of hexagons that cover the entire city surface, and I suspect that maybe the provided .pbf is only for some neighborhoods of the city, could that be the problem?

The bit of my code:

...

access_TP_empregos_60min <- accessibility(r5r_core = r5r_core,
                                          origins = hex_todos,
                                          destinations = hex_empregos,
                                          opportunities_colname = "oport",
                                          decay_function = "step",
                                          cutoffs = 61,
                                          mode = c("TRANSIT"),
                                          verbose = FALSE)

access_walk_empregos_60min <- accessibility(r5r_core = r5r_core,
                                            origins = hex_todos,
                                            destinations = hex_empregos,
                                            opportunities_colname = "oport",
                                            decay_function = "step",
                                            cutoffs = 61,
                                            mode = c("WALK", "TRANSIT"),
                                            verbose = FALSE)

The .csvs I'm using: hex_todos.csv hex_empregos.csv

Thanks in advance!

mvpsaraiva commented 2 years ago

Hi @franciscopasqual.

> The analysis is for Porto Alegre and I'm using the default .pbf and GTFS files provided in the package. The files with the OD pairs are made of hexagons that cover the entire city surface, and I suspect that maybe the provided .pbf is only for some neighborhoods of the city, could that be the problem?

That's correct, the provided .pbf does not cover the entire city area. The sample GTFS provided is not complete either, because we had to remove many trips from it. We can't provide full datasets in the package because CRAN has a size limit of 5MB per package, and we are already very close to that limit. You can check the extent of the provided .pbf with the following code:

library(r5r)
library(ggplot2)

# build transport network
data_path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path = data_path, verbose = FALSE)

# extract road network
road_network <- street_network_to_sf(r5r_core)

# plot road network
ggplot(data = road_network$edges) + geom_sf()

You can follow the instructions here to prepare a .pbf for the entire study area, and you can download the full GTFS of Porto Alegre from here.
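
Besides plotting, a quick numeric check can list which points lie outside the network's bounding box. The sketch below is only illustrative: points_outside_bbox() is a hypothetical helper, the coordinates are made up, and it assumes the points table has id, lon and lat columns. With sf loaded, the bounding box could presumably come from sf::st_bbox(road_network$edges), which uses the same xmin/ymin/xmax/ymax names.

```r
# Hypothetical helper: ids of points that fall outside a bounding box given as
# a named vector with xmin, ymin, xmax and ymax (the names sf::st_bbox() uses)
points_outside_bbox <- function(points, bbox) {
  outside <- points$lon < bbox[["xmin"]] | points$lon > bbox[["xmax"]] |
             points$lat < bbox[["ymin"]] | points$lat > bbox[["ymax"]]
  as.character(points$id[outside])
}

# Made-up extent and points, for illustration only
bbox <- c(xmin = -51.27, ymin = -30.11, xmax = -51.14, ymax = -29.99)
pts  <- data.frame(id  = c("centre", "far"),
                   lon = c(-51.20, -51.05),
                   lat = c(-30.05, -30.20))

points_outside_bbox(pts, bbox)  # "far"
```

Points reported here cannot be connected to the street network, so they are worth inspecting before comparing modes.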

franciscopasqual commented 2 years ago

Hi @mvpsaraiva thanks for the quick reply!

Just some final questions about creating the .pbf for the entire study area: what is the use of the .mapdb and .mapdb.p files that come with the package? Do I need those as well, or should a simple .pbf work? And does the "network.dat" file account for the topography? Is that necessary, or only if I want to consider it?

Thanks in advance,

mvpsaraiva commented 2 years ago

You only need the .pbf file. The .mapdb, .mapdb.p and network.dat are created by R5 the first time you run setup_r5. If you delete those, R5 will create them again the next time. To account for the topography, you need a .tif raster file with elevation data. But this is optional, you can safely ignore it.
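
If you replace the .pbf or GTFS and want to be sure the network is rebuilt from the new inputs, one option is to delete those generated files first. A minimal sketch in base R, assuming the file names end in .mapdb and .mapdb.p and that the network file is called network.dat; clear_r5_cache() is a hypothetical helper, and the demo runs on empty placeholder files in a temporary folder:

```r
# Hypothetical helper: delete the files R5 generates, keep the input data
clear_r5_cache <- function(data_path) {
  cached <- list.files(data_path,
                       pattern = "\\.mapdb$|\\.mapdb\\.p$|^network\\.dat$",
                       full.names = TRUE)
  file.remove(cached)
  invisible(basename(cached))
}

# Demo on placeholder files (names are made up)
demo_dir <- tempfile("r5r_demo_")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("city.osm.pbf", "city.osm.pbf.mapdb",
                                  "city.osm.pbf.mapdb.p", "network.dat",
                                  "gtfs.zip")))

removed <- clear_r5_cache(demo_dir)
list.files(demo_dir)  # only the .pbf and the GTFS zip remain
```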

franciscopasqual commented 2 years ago

Hi @mvpsaraiva

I managed to download a .pbf that covers the entire city and the full GTFS too, but it still returns the same outputs for walk and transit, even with 60-minute cutoffs and with the departure_datetime inside the range of the GTFS.

...

departure_datetime <- as.POSIXct("13-05-2019 14:00:00", format = "%d-%m-%Y %H:%M:%S")
max_walk_dist = 2000
max_trip_duration = 60

access_TP_empregos_60min <- accessibility(r5r_core = r5r_core,
                                          origins = hex_todos,
                                          destinations = hex_empregos,
                                          opportunities_colname = "oport",
                                          decay_function = "step",
                                          cutoffs = 61,
                                          mode = c("TRANSIT"),
                                          verbose = FALSE)

access_walk_empregos_60min <- accessibility(r5r_core = r5r_core,
                                            origins = hex_todos,
                                            destinations = hex_empregos,
                                            opportunities_colname = "oport",
                                            decay_function = "step",
                                            cutoffs = 61,
                                            mode = c("WALK"),
                                            verbose = FALSE)

The .csvs, GTFS and .pbf can be found in this folder, in case you want to check it out: https://drive.google.com/drive/folders/1fbzvWZIExPGwnf-iNpEAlJmKSDLkDQet?usp=sharing

Now I really have no idea of what can be happening, but thanks a lot again!

mvpsaraiva commented 2 years ago

You are creating a variable called departure_datetime, but you are not passing its value to the accessibility function.

Just change the function call to the code below and it should work:

access_TP_empregos_60min <- accessibility(r5r_core = r5r_core,
                                          origins = hex_todos,
                                          destinations = hex_empregos,
                                          departure_datetime = departure_datetime,
                                          opportunities_colname = "oport",
                                          decay_function = "step",
                                          cutoffs = 61,
                                          mode = c("TRANSIT"),
                                          verbose = FALSE)
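
The reason the variable alone has no effect can be shown with a toy function. This is a sketch, not r5r code: toy_accessibility() is hypothetical, and it assumes the real function falls back to the current time when departure_datetime is omitted (check ?accessibility to confirm).

```r
# Hypothetical stand-in: the argument defaults to "now" when it is not supplied
toy_accessibility <- function(departure_datetime = Sys.time()) {
  departure_datetime
}

# A global variable with the same name as the argument
departure_datetime <- as.POSIXct("13-05-2019 14:00:00",
                                 format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

used_default  <- toy_accessibility()  # ignores the global, uses Sys.time()
used_variable <- toy_accessibility(departure_datetime = departure_datetime)

format(used_variable, "%Y-%m-%d")  # "2019-05-13"
```

If the current date is outside the GTFS calendar, the call without the argument would find no transit service, which matches the walking-only results described earlier in the thread.
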
franciscopasqual commented 2 years ago

Hi Marcus! Thanks a lot, now it worked :) Since I had used the travel_time_matrix function before, I thought that the departure_datetime was already specified for whatever came after. Thanks!