matsim-org / matsim-libs

Multi-Agent Transport Simulation
www.matsim.org
490 stars 450 forks source link

Odd behavior in initial plan routing #1232

Open neuma opened 4 years ago

neuma commented 4 years ago

Short story

Whenever the number of car-links in a network differs from the number of ride-links, the car routes feature all kind of detours.

Long story

In one of our transport studies, we start to observe a somewhat strange behavior of the car users. Agents start to use all kind of detours. The more reasonable direct route is (almost) empty. This behavior had been first observed during post-run analysis but can also be observed right at the start of a run. That is, feeding an initial plan without any route information forces PrepareForSim to reroute the whole plan. The resulting routes then already contain the detours - check the plans file of iteration 0 or the events of iteration 0.

After some extensive debugging, we nailed it down to the network modifications we made. Namely, we added a couple of new links. Those new links featured only car as transport mode - whereas all other links offered car and ride as mode.

Findings so far

Steps to reproduce

The behavior can be observed running the OpenBerlinScenario with MATSim. Namely,

And the following modifications

Running this setup shows the expected normal behavior without any detours.

The following additional modifications to the network trigger the detours

Illustrative example

The remote location along Altensteinstrasse, Berlin features a couple of minor detours. Each detour is longer in distance and features a lower freespeed compared to the more direct route. Nonetheless, all agents stick to the detour. According to Via, the direct route should be the least cost path based on freespeed. Agent 417671401 is one of the affected agents.

More complex detours occur throughout the Berlin area. Most of them seem unrelated to the altered links. However, the Dijkstra may "touch" the altered links during expansion.

Bottom line

Best guess so far is some thread-related interdependency. Any help with digging into this is much appreciated.

neuma commented 4 years ago

Update:

Using the OpenBerlinScenario with a singleThreaded configuration does not mitigate the problem. For smaller in-house scenarios it worked though.

24h counts for link id 41704 in iteration 0 are the same for the original network but vary in all other cases.

Comparison of iteration 0 plans of the two runs with the original network: Identical plans between both runs

Comparison of iteration 0 plans of the two runs with the altered network: The plans differ in the car and ride routes. For some agents, the same route of the same agent differs in its travel time although the links of the route are the same. There are also cases, were the same departure time results in a car leg with a different sequence of links - and a different travel time and travel distance.

sebhoerl commented 4 years ago

How is ride simulated? Can it be an effect of the TravelTimeCalculator? We once had a similar issue when we were using the SBB extension to "emulate" movements of PT vehicles. Because the TravelTimeCalculator was set up to consider all modes together we suddenly were dragging down the estimated travel times, because the busses on the road (same network as car) generated quite fast enter/leave events.

neuma commented 4 years ago

I am not that familiar with the OpenBerlinScenario. With that said, pt runs on a separate network. According to the controler, ride uses the car travel disutility. Ride is routed but teleported during mobsim. Thus, route type for ride legs is links but there are no link enter/leave events.

ikaddoura commented 4 years ago

This is very strange!

I was able to reproduce this behavior in this test: https://github.com/matsim-scenarios/matsim-berlin/blob/5.4.x-removeRideModeOnSomeLinksExperiments/src/test/java/org/matsim/run/RunBerlinScenarioDebugExperimentsIT.java (This class contains tests for a 1pct and 10pct population sample. The problem only appears in the 10pct case.)

neuma commented 4 years ago

Thanks for adding the confirmation.

It may be the case that the 1pct just reduces the probability in the same way as switching to singleThreaded mitigates the issue for some scenarios but not for the 10pct OpenBerlinScenario.

ikaddoura commented 4 years ago

The problem seems to disappear in the 5.5.x matsim-berlin branch which uses the 13.0-SNAPSHOT matsim release, see here: https://github.com/matsim-scenarios/matsim-berlin/blob/5.5.x-removeRideModeOnSomeLinksExperiments/src/test/java/org/matsim/run/RunBerlinScenarioDebugExperimentsIT.java ... in comparison to the link above. (I only had a quick look and couldn't find those minor detours...) Are you able to reproduce the problem in a more recent matsim version?

neuma commented 4 years ago

I could repeat the experiment with the OpenBerlinScenario 10pct v5.5. The issue seems to be still present albeit not that obvious.

Setup

Run 1

Run 2

Run 3

Run 4

Again, for all detours, Via shows the correct least cost path and I could not find any obvious network errors on the bypassed links.

michalmac commented 3 years ago

Is this error already occurring during the planning phase, or something magically happens at later stages of running the whole iteration?

Just asking because all bug reproduction steps are like run matsim for 1 iteration given some config.

kainagel commented 3 years ago

If this is an error that can be deterministically reproduced, could you then please start debugging around that bug? Maybe Ihab's test is a good starting point. Sometimes such problems go away with newer releases, but more often than not they just move somewhere else, and then the software accumulates such smallish errors until we cannot debug them any more.

A starting point, IMO, would be to write the contents of the travel time arrays for the incriminated links into the output.

An alternative would be to run the code in the debugger and set breakpoints accordingly.

Thank you so much...

neuma commented 3 years ago

Is this error already occurring during the planning phase, or something magically happens at later stages of running the whole iteration?

Just asking because all bug reproduction steps are like run matsim for 1 iteration given some config.

@michalmac According to my own debug diary https://github.com/matsim-org/matsim-libs/issues/1232#issuecomment-726081378, the iteration 0 input plans already differ. So, I would assume that the non-deterministic behavior is somewhere in the PrepareForSim (Routing).

vsp-gleich commented 3 years ago

I'm checking for another matsim-berlin 1% setup. Instead of closing links for ride, it deals with service area restricted drt. So, mode drt is added only to car links in the service area instead of being added to all car links as before. In both cases mode drt did not exist on the network before and is added only in the run class. Input plans have no routes.

The links not accessible to drt are different from the links closed to ride as described above. I cannot see odd routes where they occurred in Allensteinstraße Zehlendorf for the above described setup (see image below). However, I would like to exclude that the problem has moved to another place.

@neuma is there an efficient way to check for such odd routes? I visualized link volumes in Via and hoped to find heavy volume routes with detours, i.e. instead of following a direct road, at some point most cars take a detour and therefore link volumes are high on the detour and low on the direct path between two high volume corridors. So far that gave no results. image

neuma commented 3 years ago

Thanks for looking into this! In your snippet I cannot spot any detours. I wonder if this actually applies to drt or only to those modes that somehow link/bind to the car travel disutility but route on their own (sub)network.

Basically, I am doing the same. Fire up Via, pull in the events and the corresponding network. Then Network Settings > Link Attributes > Avg. Speeds and Volumes > Pan and skim through areas that allow for such detours, i.e. Southwest districts.

Scale Factor and/or color scale needs to be adjusted to increase the contrast and hence visibility. 24hour volumes could be checked in an easier way.

ikaddoura commented 3 years ago

some ideas to follow up on this:

mrieser commented 3 years ago

Not sure if it is related, but when working on routing algorithms I think I found a bug in the BinaryMinHeap used in FastDijkstra and FastAStarLandmarks.

Basically, adding the following test to BinaryMinHeapTest results in a test failure:

@Test
public void stresstest() {
    int cnt = 20;
    DummyHeapEntry[] entries = new DummyHeapEntry[cnt];
    double[] cost = new double[cnt];
    Random r = new Random(20190210L);
    for (int i = 0; i < cnt; i++) {
        cost[i] = (int) (r.nextDouble() * cnt * 100);
        entries[i] = new DummyHeapEntry(i);
    }

    BinaryMinHeap<DummyHeapEntry> pq = new BinaryMinHeap<>(cnt, 4, true);

    for (int i = 0; i < cnt; i++) {
        pq.add(entries[i], cost[i]);
    }
    double lastCost = -1;
    int step = 0;
    while (!pq.isEmpty()) {
        step++;
        DummyHeapEntry e = pq.poll();
        double nodeCost = cost[e.index];
        System.out.println(nodeCost);
//      if (lastCost > nodeCost) {
//          System.out.println("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!");
//      }
        Assert.assertTrue(step + ": " + lastCost + " <= " + nodeCost, lastCost <= nodeCost);
        lastCost = nodeCost;
    }
}

Running this code shows that in some cases, the BinaryMinHeap (basically the priority queue used for Dijkstra's algorithm) sometimes finds single entries that should have been returned a while ago. This might explain the strange small detours observed, as probably the node of the most direct way got "stuck/delayed" in the priority queue and returned too late, when an alternative route was already found.

I have not yet a solution, as I am still trying to understand the algorithm in detail, but it could be part of the explanation of the routing weirdness.

Trivia fact: This code is more than 7 years old...

UPDATE

The error only happens when classicalRemove = true , but the default is =false, so probably this is unrelated to the observed problems...

kainagel commented 3 years ago

Maybe we should replace the self-written data structure by a library version? When the router was written, this may not have been available for Java, but it is now 7 :-) years later ...

mrieser commented 3 years ago

Again, working on routing algorithms, I came across such a situation in the ReRoutingIT, where two algorithms differed only in such small detours:

example1 example2

I looked at the first case in more detail, the following links are of interest: example1detail

I then wrote out the travel times during routing:

INFO TravelTimeCalculator:381 travel time on link 5722 at time 30130.6145610335 is 20.0
INFO TravelTimeCalculator:381 travel time on link 5740 at time 30130.6145610335 is 10.0
INFO TravelTimeCalculator:381 travel time on link 4723 at time 30140.6145610335 is 10.0

INFO TravelTimeCalculator:381 travel time on link 5722 at time 86813.14814497733 is 10.00806405160993
INFO TravelTimeCalculator:381 travel time on link 5740 at time 86813.14814497733 is 6.700783142625447
INFO TravelTimeCalculator:381 travel time on link 4723 at time 86811.13213207484 is 7.992051149127355

As one can see, at the morning trip, the travel time on the detour is exactly the same as on the direct link.

AStarLandmarks (actually, all implementations based on Dijkstra) has the following condition for this case:

} else if (totalCost == nCost) {
    // Special case: a node can be reached from two links with exactly the same costs.
    // Decide based on the linkId which one to take... just have to common criteria to be deterministic.
    Link prevLink = data.getPrevLink();
    if (prevLink != null && prevLink.getId().compareTo(l.getId()) > 0) {
        revisitNode(n, data, pendingNodes, currTime + travelTime, totalCost, l);
        return true;
    }
}

So this explains why the detour is taken in this case, as can be seen on the link ids in the above screenshot.

As the travel times are always integer amounts of seconds as soon as a vehicle has passed a link in the same time bin the iteration before, the chance that travel times are exactly the same is actually not that small. So probably this is what is being observed in the cases reported in this issue.

The alternative algorithm which currently takes the "more direct" path has no such logic for handing the equality case. Instead, it only revisits the node if the new cost is actually lower. This can be re-interpreted like this: If a node can be reached from two links at the same time, the link with the lower cost on its from-node will win (as this node will have been expanded before the other, and thus sets the cost first). Considering that the "detour" has an additional node, it's easy to see why the more direct link will usually win in these cases. But it might fail in other cases where the detour is not only 2 links versus the direct 1 link, but is more complex.

I don't think there is an easy solution to this. Clearly, if there are similar alternatives, it would probably make most sense to take the shorter one, but this might be quite difficult and computationally expensive to implement in the routing algorithm. So one has to stay with one or the other of the existing ways to handle the issue of same-time/cost-arrivals at nodes.

neuma commented 3 years ago

Interesting. The two link detours look similar to what I have observed. Maybe this may also be able to trigger more complex patterns as in the examples of https://github.com/matsim-org/matsim-libs/issues/1232#issuecomment-732295011

I fail to see how this may explain the original observation though. That is, detours only show up if the ride mode is removed from some of the links, and detours occur in areas far away from those modified links. Any idea on that?

mrieser commented 3 years ago

How is ride simulated? Do ride travelers maybe influence the travel times? Because as soon as the travel times on the links change by 1 second, two alternatives might no longer be equal and then other routes may be found. Just an idea, though...