Describe VRP algorithm - Githubissues

nielsenko commented 5 years ago

Can you point to some published research on the heuristics you use for solving the rich VRPs supported by vroom?

jcoupey commented 5 years ago

There is no structured paper/page for this right now, except the code itself and my talk at SOTM 2018 outlining the solving approach (CVRP only at that time, but still valid).

Heuristics

We have a couple heuristics to find initial solutions:

an adjusted version of the "famous" Christofides heuristic for TSP
- custom clustering heuristics using spanning trees (CVRP only), no reference for those as we designed them especially for this use-case, more details in https://github.com/VROOM-Project/vroom/issues/74
- adjusted versions of the Solomon I1 heuristics (VRPTW and CVRP) ; see original paper or a very good two-part review paper (1 and 2)

Metaheuristics

We use a local search procedure, starting from initial solutions, to check for valid neighboring solutions and improve the current solution. This is a very common approach, we've chosen relevant modification operators that are usually easily found in the literature.

We put a lot of effort to have a fast neighbour evaluation procedure (validity and potential gain) because it's a bottleneck in the local search. Also we use a fast greedy procedure (similar to a gradient descent) to pick the next neighbouring solution, because it's usually "good enough" and keeps computing times very low.

We should probably have a wiki page with this kind of details/info... May I ask you if there is a specific reason why you're looking for references?

nielsenko commented 5 years ago

I have been challenged with solving a very large VRPTW problem, 40.000+ jobs. This is not possible in reasonable time today with vroom. Building the duration matrix alone, is prohibitive expensive.

So far I have resorted to partitioning the jobs externally to vroom (using KMean, HDbScan, SLink, etc. Would like to try DeLiClu og dGridSLINK but that is more work) and then calling vroom on each of these subproblem (each a single vehicle VRPTW). It works, it is fast, but the quality of the solution (to the trained eye) is lacking.

So, I have been scouring the net for tricks, and was wondering how they would each fit with vrooms design. Hence the question.

jcoupey commented 5 years ago

I have been challenged with solving a very large VRPTW problem, 40.000+ jobs

Wow, sounds like fun. ;-) Also puts https://github.com/VROOM-Project/vroom/issues/237 into perspective.

I'll have to check out the clustering algorithms you mention (some of them are totally unknown to me). Also happy to have some feedback if you have interesting findings during this project and VROOM is part of the workflow!

MedAlyH commented 5 years ago

Hello ,

Thank you for replying ;

Can you please provide me with a description of the resolution of the TSP problem ( algorithm, the choice of the algorithm why not using the greedy algorithm for example ) in the project . I really need it to better understand . Thank you in advance ! I will be very grateful if you can help me with that .

nielsenko commented 5 years ago

So reading up on #75 and given my vague understanding of the code, I gather that you are using a split-first, route-second approach.

I was wondering, if one could use an Euclidean distance matrix for the partitioning, and only use the true duration matrix when performing the routing for each vehicle. Maybe the local search step could make up for the occasionally dumb partitioning?

The advantage would be, that the clustering step (which is derived from Prim's algorithm), can complete much faster (or at all) when the full matrix doesn't need to be computed first, and the actual duration matrix needed during routing of each vehicle could be much smaller (based on the size of the sub-problem).

jcoupey commented 5 years ago

@MedAlyH the TSP heuristic is based on the Christofides algorithm. Partly because it provides a guarantee wrt the gap to the optimal solution, but mostly because at the time of implementation I was curious to see how it would perform on real-life (asymmetric) problems.

The outline is:

we first "symmetrize" the problem then apply the heuristic
we apply a local search phase on the symmetric version (see https://github.com/VROOM-Project/vroom/blob/master/src/problems/tsp/heuristics/local_search.h for TSP-specific operators used)
we go back to the asymmetric case and apply a new local search phase

The local search phase is basically: lookup all valid options and pick the best one for operators in turn, until no improvement is left. There are also a couple hacks with zero costs edges or prohibitive costs edges to handle open trips (with only start or end provided). I you need more details, I'm afraid you'll have to dive into the code. ;-)

Results on TSPLIB are on the benchmarks wiki page. From what I recall of the tests, applying the heuristic itself yields solutions that are approximately in a range of 10%-15% over the optimal, then the local search usually brings them down in the 0%-5% range. So all in all "good enough" and quite fast.

jcoupey commented 5 years ago

I gather that you are using a split-first, route-second approach.

Correct.

I was wondering, if one could use an Euclidean distance matrix for the partitioning

That would be a great use-case for https://github.com/VROOM-Project/vroom/issues/240. You could indeed then stop once the clustering is done, then split your whole problem into smaller ones and solve them independently, removing the need to compute the whole real matrix.

This would basically amount to comparing how the Prim-based spanning tree clustering compares to the ones you mentioned above for that purpose. Note that you'd still have the same biaises when it comes to solution quality.

Side note 1: I had in mind to extend this clustering approach with TW, see https://github.com/VROOM-Project/vroom/issues/139 Side note 2: the Prim-based clustering heuristic has two variants, sequential and parallel. The later makes sense with a distributed fleet (different start and end for vehicles) but as far as I recall, it's more likely to have scaling issues if you plan to use huge number of jobs.

nielsenko commented 5 years ago

This would basically amount to comparing how the Prim-based spanning tree clustering compares to the ones you mentioned above for that purpose. Note that you'd still have the same biaises when it comes to solution quality.

Except, I would still let vroom swap jobs between routes, during the local search phase. Or is that a bad idea?

jcoupey commented 5 years ago

I would still let vroom swap jobs between routes

Then you'd treat the problem as a whole and you'd still have to compute the full matrix?

Maybe we should open a dedicated ticket for this discussion?

nielsenko commented 5 years ago

Then you'd treat the problem as a whole and you'd still have to compute the full matrix?

That would defeat the purpose. I was hoping that the local search phase would only consider a subset of matrix entries, that could be filled lazily. My thought was that the move based approach, since it considers only a subset of moves (modification operators), would not (necessarily) need the full matrix.

Regarding a seperate ticket. I guess there is a feature request hidden here, if a feasible approach can be invented.

jcoupey commented 5 years ago

I was hoping that the local search phase would only consider a subset of matrix entries

If you don't restrict it yourself, the local search as it is implemented can potentially lookup any cost between any pair of input location (vehicle start/end or job).

I guess there is a feature request hidden here, if a feasible approach can be invented.

Having a discussion on pre-clustering strategies (and results) in a dedicated ticket could probably be of interest to several users.

MedAlyH commented 5 years ago

@jcoupey Hello Sir , I have a question for you

Christofides' approximation algorithm is guaranteed to give a solution with a cost no larger than 1.5 times the optimal tour. But this guarantee only works for symmetric graphs with distances that obey the triangle inequality. SO i'am wondering what we do if the graph do not obey the triangle inequality ? how we can apply the Christofides algorithm ? I have also another question , what's the utility of the local search ? what improvment did it import to the solution ? Thank you for replying

jcoupey commented 5 years ago

this guarantee only works for symmetric graphs with distances that obey the triangle inequality

You're right, we have no guarantee of that kind on the initial problem because we treat the asymmetric version and the triangle inequality does not hold with e.g. OSRM durations. Again the Christofides heuristic is used as a subroutine and the whole process yields solutions that are good starting points. If you check out the benchmark results for TSP, you'll see that we're (fortunately) way under the 1.5 factor.

For details on the local search phase improvements and operators, please refer to my talk at SOTM 2018, especially starting around 8:52.

MedAlyH commented 5 years ago

@jcoupey

Thank you for replying ;

But I still not get it ! if the problem is not symmetric and do not obeyl the triangle inequality what the Vroom do in that case .

Sorry for asking many questions, because i really apretiate your solution and i want to understand it from A to Z . Thank you Dear

jcoupey commented 5 years ago

what the Vroom do in that case .

The TSP solving (including going through the symmetrized version) happens here, so if you want to understand all the details you'll have to dig into the code. The outline of the solving approach is in TSP::raw_solve from the same file.

jcoupey commented 5 years ago

Closing as stale. We should probably open new tickets for specific topics if required in the future.

VROOM-Project / vroom

Describe VRP algorithm #245

Heuristics

Metaheuristics