UrbanAnalyst / gtfsrouter

Routing and analysis engine for GTFS (General Transit Feed Specification) data
https://urbananalyst.github.io/gtfsrouter/

Isochrones #3

Closed. mpadge closed this issue 5 years ago

mpadge commented 5 years ago

Add an extra time_range argument that returns all end stations reachable from a nominated station within the specified time range.

mpadge commented 5 years ago

Re-opened to remove the need to specify start_time. Instead, this function should analyse all isochrones from a given station through the day, and have an option for the type of isochrone:

  1. "min" = smallest isochrone reachable throughout the day;
  2. "max" = largest isochrone reachable throughout the day;
  3. "mode" (default) = station reached on a given route more than any other during the day.

There should also be a day parameter implemented as for gtfs_route(). - done
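
One possible reading of the "mode" option can be sketched in base R. The input table, its column names and its values are hypothetical and are not taken from the package; the sketch assumes that for each departure time and route we already know the furthest station reached within the time range. The "min" and "max" options would additionally need a measure of isochrone size, which the thread does not define, so they are left out here.

```r
# Hypothetical input: for each departure time and route, the furthest
# station reached within the time range (illustrative values only).
ends <- data.frame(
    departure = c("08:00", "12:00", "18:00", "08:00", "12:00", "18:00"),
    route = c("A", "A", "A", "B", "B", "B"),
    end_station = c("Four", "Three", "Four", "Two", "Two", "One"),
    stringsAsFactors = FALSE
)

# Return the most frequent value of a character vector
# (ties go to the alphabetically first value).
most_common <- function(x) names(which.max(table(x)))

# "mode": per route, the end station reached more often than any other.
iso_mode <- aggregate(end_station ~ route, data = ends, FUN = most_common)
print(iso_mode)
```

Here route A ends most often at "Four" and route B at "Two", so those two stations would define the "mode" isochrone under this reading.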

tbuckl commented 5 years ago

@mpadge love this idea

polettif commented 5 years ago

This is indeed a great function, immensely useful!

I'd like to add some thoughts on this. Currently gtfs_isochrones returns a data frame like this:

   stop_name stop_lon stop_lat in_isochrone
1       Four  7.42447 46.96878         TRUE
2        Two  7.39899 46.96230         TRUE
3        One  7.39072 46.95961         TRUE
5      Three  7.40838 46.96347        FALSE
6      Three  7.40839 46.96348        FALSE

I guess the travel time to each stop is already calculated within this function? Could that be returned as well?

I'd suggest that gtfs_isochrones only returns a data frame with stop_ids and the corresponding travel_times (or two additional columns with departure and arrival time). That's the most basic result that you can work with. To see which stops are in the isochrone, a simple filter/join can be applied. Or, if you need the stop coordinates, run inner_join(isochrone_table, gtfs$stops, by="stop_id") or something like that.
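
A minimal sketch of that workflow, using base R's merge() in place of inner_join so it runs without extra packages. The isochrone_table, its travel_time column (assumed to be in seconds), the stop_id values and the toy stops table are all hypothetical; this is not what gtfs_isochrones currently returns.

```r
# Hypothetical minimal result: one row per reachable stop_id with a travel
# time in seconds. This is NOT the current return value of gtfs_isochrones.
isochrone_table <- data.frame(
    stop_id = c("s1", "s2", "s4"),
    travel_time = c(300, 540, 1500),
    stringsAsFactors = FALSE
)

# Toy stand-in for gtfs$stops; the stop_id values are made up.
stops <- data.frame(
    stop_id = c("s1", "s2", "s3", "s4"),
    stop_name = c("One", "Two", "Three", "Four"),
    stop_lon = c(7.39072, 7.39899, 7.40838, 7.42447),
    stop_lat = c(46.95961, 46.96230, 46.96347, 46.96878),
    stringsAsFactors = FALSE
)

# Inner join: keep only the stops that appear in the isochrone table.
reached <- merge(isochrone_table, stops, by = "stop_id")

# Filter to stops reachable within a chosen limit (here 10 minutes).
within_limit <- reached[reached$travel_time <= 600, ]

# A left join from the stops side recovers an in_isochrone flag for all
# stops, which keeps the context needed for plotting.
all_stops <- merge(stops, isochrone_table, by = "stop_id", all.x = TRUE)
all_stops$in_isochrone <- !is.na(all_stops$travel_time)
```

With the minimal table as the base result, the flagged all-stops view is one join away, whereas recovering travel times from a TRUE/FALSE column is not possible.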

Another note on stop_id vs. stop_name: I figure it's a deliberate decision to only use stop_names at the lowest API level? That's more user friendly since travel times can be aggregated meaningfully (nobody cares what the exact arrival time for every platform of a station is). But again, I think aggregating data afterwards (i.e. stop_ids to parent_stations or stop_names) is easier than splitting up combined data you got.
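
To illustrate aggregating afterwards, here is a small base R sketch that collapses per-platform travel times to one row per stop_name. The table and its values are hypothetical, and taking the minimum travel time per station is an assumption about what a useful aggregate would be.

```r
# Hypothetical per-stop_id travel times in seconds; two platforms share
# the stop_name "Three".
platform_times <- data.frame(
    stop_id = c("3a", "3b", "1", "2"),
    stop_name = c("Three", "Three", "One", "Two"),
    travel_time = c(900, 960, 300, 540),
    stringsAsFactors = FALSE
)

# Collapse platforms to stations, keeping the fastest travel time per name.
station_times <- aggregate(
    travel_time ~ stop_name,
    data = platform_times,
    FUN = min
)
print(station_times)
```

Going the other way, from a single value per stop_name back to individual platforms, would require information that the aggregated table no longer holds.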

mpadge commented 5 years ago

Great that you're already using it. I also implemented a plot method - just try plot with the result. (And yes, there are some errors, so the isochrone isn't always correct yet, but I'm working on that.) Oh, and the plot method is the reason why all stations are returned, because plotting those gives a useful context to the isochrone, and the only way to implement a generic plot method is to have all data in the returned object. (This might change with caching as per #7, but I'm unsure at this point.)

mpadge commented 5 years ago

The above commit adds the structure to calculate all isochrones throughout the day, but I'm not sure it'll be useful in practice - it's very slow on a real data set. The amount of calculation involved is enormous... Not sure what to do about that?