BrunoRosendo / master-thesis

Source code for my master's thesis, in the topic "Quantum algorithms for optimizing urban transportation"
MIT License
6 stars 0 forks source link

Process Porto data #92

Closed BrunoRosendo closed 4 months ago

BrunoRosendo commented 4 months ago

After #91, I should process the data retrieved from Porto's metro and bus in order to properly run my algorithms. This might include organizing the available tables, using Google Maps if the time/distance between stations is not available or looking for extra data.

I think the trickiest part will be RPP, since I don't have data on the passengers' trips. I may look for this information elsewhere or try to come up with an assumption

BrunoRosendo commented 4 months ago

After analysing the metro's data, it only makes sense to use RPP since metros always go back and forth instead of being in a circle.

I reached the following steps:

The most typical vehicle has space for 216 people (80 seated + 136 standing): https://pt.wikipedia.org/wiki/Metro_do_Porto#Flexity_Outlook/Eurotram_(MP001_a_MP072) There are other types of vehicle, but this will be my reference

BrunoRosendo commented 4 months ago

As for STCP buses:

BrunoRosendo commented 4 months ago

For some reason, RPP is not respecting the "no return to depot" penalty and is ignoring a few stops in the end... I need to figure this out. I could simply give the algorithm more time, I need to test this

I can also check if it's related to the normalization and the really big numbers in the distance matrix, but I don't think so...

Edit: I think the problem is just too big, I need to try with fewer vars

BrunoRosendo commented 4 months ago

The classic solver is also having some trouble. The stops do not correspond with the line I'm testing

Edit: SOLVED, it was a problem when building the distance matrix and location names in remove_unused_locations

BrunoRosendo commented 4 months ago

For some reason, RPP is not respecting the "no return to depot" penalty and is ignoring a few stops in the end... I need to figure this out. I could simply give the algorithm more time, I need to test this

I can also check if it's related to the normalization and the really big numbers in the distance matrix, but I don't think so...

Edit: I think the problem is just too big, I need to try with fewer vars

I tried running in main.py after copying the simplified input from the classic solver and I got the right result... I will check if there's something wrong with how the RPP algorithm simplifies the input

BrunoRosendo commented 4 months ago

For some reason, RPP is not respecting the "no return to depot" penalty and is ignoring a few stops in the end... I need to figure this out. I could simply give the algorithm more time, I need to test this I can also check if it's related to the normalization and the really big numbers in the distance matrix, but I don't think so... Edit: I think the problem is just too big, I need to try with fewer vars

I tried running in main.py after copying the simplified input from the classic solver and I got the right result... I will check if there's something wrong with how the RPP algorithm simplifies the input

Conclusion: Everything is fine with the algorithm, it was just the stop trindade being in a different place in the locations array, even though it doesn't make a difference for the problem definition. However, the optimizer seemed to randomly get to the right solution sooner when using one of the formulations (the simplified one)

BrunoRosendo commented 4 months ago

Here's the RPP algorithm running on the D line stops (not a surprising result, since the line is straightforward)

image

BrunoRosendo commented 4 months ago

This is line A, with 1000+ vars, running on dwave hybrid system. This is impressive, especially since the local CPLEX optimizer can't even run this on the free version

image

BrunoRosendo commented 4 months ago

I'm having some trouble stretching the y axis, plotly is not making things easy... A workaround is to set the height of the figure and then zoom out like this

image

BrunoRosendo commented 4 months ago

Ignoring previous comments, these are the tasks left:

BrunoRosendo commented 4 months ago

image

Circular line being handled by VRP

26 locations 675 variables, 652 constraints, 3750 biases, 5.3s and 32ms with hybrid cqm solver

BrunoRosendo commented 4 months ago

image

Regular bus line

BrunoRosendo commented 4 months ago

image Multiple bus lines are now working :partying_face: This was 5s on dwave hybrid solver

BrunoRosendo commented 4 months ago

image

This was 3838 vars/qubits on dwave hybrid solver, in 30 seconds. UHUUU lines 18+304

BrunoRosendo commented 4 months ago

I implemented a solution for multiple lines but my first tests didn't go very well. I should test more and fix the problems

BrunoRosendo commented 4 months ago

After some more testing, I have new conclusions.

RPP (straight lines):

This happens because each location is only visited once and, if we try duplicating them, it's impossible to satisfy all trip requests. I tried satisfying only some trips but that resulted in unexpected behaviour from the algorithms. OR tools doesn't even yield a solution.

CVRP (circular lines):

Adjoint lines aren't possible because we need a common depot. It's possible to support multiple lines if ALL of them have a common stop (the new depot). I need to duplicate other shared stops (w/ no connection between them) and make sure the dist. matrix is bi-directional, since we're changing the depot

BrunoRosendo commented 4 months ago

Things to do for CVRP (multiple lines):

BrunoRosendo commented 4 months ago

image This are lines 300 and 3002 together with 2 common stops and 1 vehicle. The result is not surprising

25 locations 624 variables, 602 constraints, 3456 biases

BrunoRosendo commented 4 months ago

There's still something wrong, sometimes not all connections are possible and I get infeasible results

BrunoRosendo commented 4 months ago

image This are lines 300 and 3002 together with 2 common stops and 1 vehicle. The result is not surprising

As we can see, doing the same example with the distance API yields a different result. I think I'm finally able to get something interesting out of this

image

BrunoRosendo commented 4 months ago

There's still something wrong, sometimes not all connections are possible and I get infeasible results

I think I solved this with the logic below, there were some shenanigans with when I should be duplicating the depot

# Delete duplicates of the new depot. Needed to avoid unfeasible solutions with multiple vehicles.
    to_delete = min(NUM_VEHICLES - 1, most_common_freq - 1)
    i = 1
    while i < len(location_ids) and to_delete > 0:
        if location_ids[i] == most_common_id:
            del location_ids[i]
            del locations[i]
            del location_names[i]
            to_delete -= 1
        else:
            i += 1

Now I can work with 2 lines + 2 vehicles, for example.

image

or even more vehicles, although the result is the same

image

BrunoRosendo commented 4 months ago

I believe I can do useful things with my current setup. For example, 3 lines together:

image

Or at least confirm if the current lines are optimized (order of stops, not client satisfaction).

1821 variables, 131 samples, 30s runtime, 32ms qpu

I can also try to add a few stops if I want. I will create better examples while I'm writing the thesis

THIS CLOSES THE DEVELOPMENT PHASE UHUUU :partying_face: