VROOM-Project / vroom

Vehicle Routing Open-source Optimization Machine
http://vroom-project.org/
BSD 2-Clause "Simplified" License

Solving progress indicator #1122

Open logikaljay opened 1 month ago

logikaljay commented 1 month ago

Hey guys,

Firstly - Love the work you are doing. Really impressive 🤯.

I have a semi-large dataset that I have to run daily (~ 2000x2000).

I currently have a POC setup that uses vroom-express and it works well; however, I was hoping to improve it by showing the user the "time remaining", or at least some sort of progress indicator.

I have had a quick look through the other issues and cannot find any that match this requirement.

Would anyone be able to provide guidance on the best way to achieve this?

jcoupey commented 1 month ago

Hum, the completion time is highly impacted by multiple factors: problem size, hardware and parallelization level of course, but also the model constraints, since they can affect the search convergence speed. And we don't have a "run this operation 10000 times then stop" approach with a clear stopping criterion: we keep on computing until we reach a local minimum.

We run several searches in parallel, and not all those searches take the same time or "number of steps" for a given instance. For example, if one search starts from a very good heuristic solution, then the local search will tend to converge faster. On the other hand, it may sometimes take a large number of small improvements to end the search, while you could sometimes achieve it in fewer steps with bigger improvements.

Also, each search consists of several iterative local search runs. But knowing the number of remaining runs is pretty useless in terms of progress, since the total time for the iterations can vary wildly.

So all in all I'd say this is highly non-trivial. Maybe an easier approach, if you have a fixed setup (machine, load) and homogeneous instance types, would be to come up with statistical estimates that are a good proxy for the expected computing time on the next run.
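The statistical-estimate idea above could be sketched roughly as follows. This is a hypothetical client-side helper, not anything provided by VROOM or vroom-express: it assumes you log the wall-clock duration of each daily solve and use the median of past runs as the expected duration for the next one.

```python
# Hypothetical sketch: derive a rough progress value for the next solve
# from the durations of previous, similar runs. Nothing here is part of
# VROOM; the class name and API are made up for illustration.
import statistics


class RuntimeEstimator:
    """Keep recent solve durations and report a rough ETA-based progress."""

    def __init__(self, history=None):
        # Durations in seconds from past runs on similar instances.
        self.history = list(history or [])

    def record(self, seconds):
        self.history.append(seconds)

    def expected_seconds(self):
        # Median is robust to the occasional outlier run.
        return statistics.median(self.history) if self.history else None

    def progress(self, elapsed):
        expected = self.expected_seconds()
        if expected is None:
            return None
        # Cap below 100%: the solver may well run past the estimate.
        return min(elapsed / expected, 0.99)


# Example: three previous daily runs took 110 s, 120 s and 130 s.
est = RuntimeEstimator([110, 120, 130])
# 60 seconds into today's run, report roughly 50% done.
print(round(est.progress(60), 2))  # → 0.5
```

Since the daily instance here is homogeneous (~2000x2000 every day), even this crude proxy should give users a far better indication than nothing, and it degrades gracefully: if a run overshoots the estimate, the bar simply sits near the end instead of lying about completion.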