Unnecessary traverse modes slow down routing

optionsome commented 5 years ago

We observed with HSL GTFS data that having FERRY traverse mode in requests slowed down routing by over 10% and resulted in less itineraries returned on average compared to requests with no FERRY mode. In HSL data, only time FERRY is actually useful is when traveling to one set of islands that are inaccessible otherwise and making a transfer on those islands never makes sense. Therefore, we implemented something in our UI that checks if origin, destination or an intermediate place is inside one of those islands and then we have FERRY on the request sent to OTP. Otherwise FERRY mode is not present in the requests. Would it be worth it to implement a similar feature in OTP as most likely there are other GTFS datasets that have similar special cases with FERRY or some other traverse mode? There also might be other ways to improve the effiency of the routing in these situations.

Expected behavior

Traverse modes that are not relevant for the route search do not considerably slow down the routing.

Observed behavior

Traverse modes that are not relevant for the route search do slow down the routing.

Version of OTP used (exact commit hash or JAR name)

1.4

Data sets in use (links to GTFS and OSM PBF files)

GTFS data: http://dev.hsl.fi/gtfs/hsl.zip OSM data: http://dev.hsl.fi/osm.hsl/hsl.osm.pbf

Command line used to start OTP

java -jar -Xmx8G target/otp-1.4.0-SNAPSHOT-shaded.jar --server --port 8888 --securePort 8889 --basePath ./ --graphs ./graphs --router hsl

Router config and graph build config JSON

router-config.json

{
  "routingDefaults": {
      "walkSpeed": 1.3,
      "transferSlack": 120,
      "maxTransfers": 4,
      "waitReluctance": 0.95,
      "waitAtBeginningFactor": 0.7,
      "walkReluctance": 1.75,
      "stairsReluctance": 1.65,
      "walkBoardCost": 540,
      "walkOnStreetReluctance": 1.5,
      "carParkCarLegWeight": 2,
  },
  "updaters": [
  ]
}

build-config.json

{
  "areaVisibility": true,
  "staticParkAndRide": false,
  "parentStopLinking": true
}

Steps to reproduce the problem

Make requests with 'BUS,FERRY,RAIL,SUBWAY,TRAM,WALK' modes and don't include origin, destination or an intermediate place inside of Suomenlinna area vs same requests with 'BUS,RAIL,SUBWAY,TRAM,WALK' modes

t2gran commented 5 years ago

I am not sure, when we are done with #2626 (in progress) the transit routing will be much faster, also existing features like filtering on modes will be done differently. I am not sure what the cause of this performance issue is - but would not spend time on it now. In your case, I guess the percentage of ferry routes (compared to the total number of routs) is very small, so it won't effect the performance much.

We are trowing away most of the transit routing optimizations also - our goal is to make transit routing 10 times faster.

t2gran commented 5 years ago

I have plan to test out one optimization on the Raptor with prune the graph using a special version of RangeRaptor before running Multi-criteria Range Raptor, this will prune away routes witch are not on a feasible path - independently of modes.

dgoujard commented 5 years ago

@t2gran Hello, have you more details about the test or ETA for raptor algorithm ? We need to chose a router and OTP had strange results (and pagination). When we use Navitia we had better consistency

t2gran commented 5 years ago

@dgoujard I hope we will manage to release the 2.0 before April 2019 (before the OTP Summit in Oslo), also I hope we can have have a beta with limited functionality before Christmas. What do you mean by "pagination"?

dgoujard commented 5 years ago

@t2gran Hello, thank you for your reply and the ETA. By pagination i mean next or previous solutions links. Here you can see some examples : capture d ecran 2018-11-30 a 09 02 26 when i click on next i got this results : You can see that the firsts two solutions are the same in both pages, only the third is different. I tested multiple GTFS data and i still got this issue, i think it's relative to the pagination/query

Thank you

abyrd commented 5 years ago

I am not sure, when we are done with #2626 (in progress) the transit routing will be much faster [...] We are trowing away most of the transit routing optimizations also - our goal is to make transit routing 10 times faster.

I agree with @t2gran's perspective here. The current transit routing system is far from optimal, so we don't want to spend much time and future maintenance effort making piecewise improvements. Work is already underway to replace it entirely.

As for this particular fix: it does make sense that searches containing long distance modes leading to isolated areas of the network would be somewhat slower, because they branch more. This could be addressed with an ad-hoc method like this one built on practical observations of how specific modes behave. But it is related to a general strategy appearing in many more optimized routing algorithms: do a lot of pre-calculation to determine which edges or nodes have a characteristic of "betweenness", and confine the search to the region of the graph that can actually contain the solution. This kind of thing works very well for computation speed, but makes the code complex and tends to rely on having fixed edge weights (which we don't have if we allow real-time updates).

The question is whether building up a collection of ad-hoc optimizations, which will need to be understood separately and in combination by any future maintainers, is worth a 10% speedup. It might seem like a reasonable trade-off when using the current transit routing method where 10% could be a perceptible amount of time, but it probably won't be a good trade-off when using a faster Raptor based routing core.

I really like the idea of using a dead-simple routing method like Raptor with relatively few ad-hoc optimizations added (only optimizations proceeding from general principles that can be more easily maintained). This seems to be the route we're taking in the OpenTripPlanner project. So @optionsome we might close this ticket - it's a good observation but will be addressed through other means.

abyrd commented 5 years ago

@dgoujard I also hope we can achieve the schedule given by @t2gran - having a working Raptor based OTP 2.x branch working by the beginning of April. This will be a completely different approach to current OTP at least for the transit portion of the search, so the variety and quality of results returned could be completely different.

What you are calling "pagination" (the next/previous buttons) are not really "pages" - they actually retrigger a search that should give results immediately before or after the displayed results in time. Clearly your results do not exhibit this characteristic, so there is an error somewhere. You may want to check the console log messages to see if OTP is reporting some unusual situation while performing these secondary searches.

optionsome commented 5 years ago

I agree that making these kinds of optimizations most likely will just complicate the codebase if the routing becomes significantly faster overall.

Once the Raptor based OTP 2.x is done, we will end up testing if it is worth it to keep the existing optimization we made. We can report our findings here if this issue is still open by then.

abyrd commented 5 years ago

It will always be interesting to have systematic performance and resource consumption comparisons between the new code and old code. So I don't mind leaving this open until we can post the empirical comparisons for the historical record.

abyrd commented 3 years ago

Routing has been completely replaced in OTP2. We should create a new issue if this problem persists in OTP2.

opentripplanner / OpenTripPlanner