Open MysteriousShadow opened 1 month ago
Hey @MysteriousShadow,
So, in this specific example, these two optimal solutions you wrote have the same total distance.
The heuristic methods implemented here have a random component that makes them explore the possible routes in different ways every time you run them. This is intrinsic to these methods, and in fact this is one of the factors that allow them to (typically) get good solutions in reasonable time.
Think of it this way: imagine a mountain range with two far-away valleys of the same depth, and you wish to find its lowest point. Deterministic methods start at the same point and follow the same path, so they will always end up in the same valley. The heuristics here, due to their random nature, may start from different points and follow different paths each time, and thus may end up in different valleys.
If you want to get deterministic results, you can do one of the following.
The first option is to fix the random seed, which is equivalent to fixing the starting point and the path:
```python
import random

# Set this before all optimizations
random.seed(1)  # or any other number you prefer
```
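To illustrate why this works, here is a self-contained toy sketch (pure Python, not the library's own implementation) of a randomized 2-opt-style search on the 4x4 distance matrix from the example. Once the seed is fixed, repeated runs follow the exact same path and return the exact same result:

```python
import random

# Distance matrix from the README example
D = [
    [0, 5, 4, 10],
    [5, 0, 8, 5],
    [4, 8, 0, 3],
    [10, 5, 3, 0],
]

def tour_length(perm):
    # Total length of the closed tour perm[0] -> perm[1] -> ... -> perm[0]
    return sum(D[perm[i]][perm[(i + 1) % len(perm)]] for i in range(len(perm)))

def random_two_opt(n, iterations=200):
    # Start from a random permutation and apply random segment reversals,
    # keeping improvements (a toy stand-in for a randomized local search).
    perm = list(range(n))
    random.shuffle(perm)
    best = tour_length(perm)
    for _ in range(iterations):
        i, j = sorted(random.sample(range(n), 2))
        cand = perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]
        d = tour_length(cand)
        if d < best:
            perm, best = cand, d
    return perm, best

random.seed(1)  # fix the seed: every run now follows the same path
print(random_two_opt(4))
```

Without the `random.seed(1)` call, each run may end in a different (equally good or worse) permutation; with it, the sequence of random choices is frozen.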
The following figure shows optimizations for the same example you used, without setting a seed. We may get different routes, as expected, but in this case they always have the same total distance, so both qualify as optimal solutions.
When a specific seed is set before each optimization, the results are always the same.
The second option is to use a deterministic heuristic, which may help if your problem instances are too large for the exact methods. Constructive heuristics, such as nearest neighbor or Christofides, typically fall into this class.
Unfortunately this library does not have such methods implemented, so you may need to resort to other libraries such as OR-Tools.
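As an illustration, nearest neighbor is easy to sketch yourself. This is a minimal pure-Python version (not the python-tsp or OR-Tools API): it always moves to the closest unvisited city, so the same input always yields the same tour.

```python
def nearest_neighbor_tour(D, start=0):
    """Deterministic nearest-neighbor construction: from `start`, always
    move to the closest unvisited city. Same input -> same tour."""
    n = len(D)
    unvisited = set(range(n)) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        # Closest unvisited city; ties broken by lowest index (deterministic)
        nxt = min(unvisited, key=lambda c: (D[last][c], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# 4x4 distance matrix from the README example
D = [
    [0, 5, 4, 10],
    [5, 0, 8, 5],
    [4, 8, 0, 3],
    [10, 5, 3, 0],
]

print(nearest_neighbor_tour(D))  # -> [0, 2, 3, 1]
```

Note that nearest neighbor gives no optimality guarantee in general; on this particular instance it happens to return one of the two optimal tours.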
With that said, I highly recommend against forcing specific seeds. The random nature is very important for making the heuristics good, and you may pick a seed that generates a very bad path, in which case you would always get that same bad solution.
If you still prefer this route, then I suggest a multi-seed view: run the optimization for multiple seeds (say 5 or 10) and pick the one with the best distance. This way you get a mix of the benefits of both approaches.
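The multi-seed pattern might look like the sketch below. It uses a toy stand-in solver so the example is self-contained; in practice you would call the library's `solve_tsp_local_search` inside the loop instead.

```python
import random

# 4x4 distance matrix from the README example
D = [
    [0, 5, 4, 10],
    [5, 0, 8, 5],
    [4, 8, 0, 3],
    [10, 5, 3, 0],
]

def tour_length(perm):
    # Length of the closed tour perm[0] -> ... -> perm[-1] -> perm[0]
    return sum(D[perm[i]][perm[(i + 1) % len(perm)]] for i in range(len(perm)))

def randomized_solver():
    # Toy stand-in for a randomized heuristic such as solve_tsp_local_search:
    # it samples a few random permutations and keeps the best one found.
    best_perm, best_dist = None, float("inf")
    for _ in range(20):
        perm = list(range(len(D)))
        random.shuffle(perm)
        d = tour_length(perm)
        if d < best_dist:
            best_perm, best_dist = perm, d
    return best_perm, best_dist

# Multi-seed view: run the heuristic once per fixed seed, keep the best.
# The outcome is reproducible, yet no single unlucky seed dominates.
results = []
for seed in range(5):
    random.seed(seed)
    results.append(randomized_solver())

best_perm, best_dist = min(results, key=lambda r: r[1])
print(best_perm, best_dist)
```

Because each seed is fixed, re-running this whole loop always reproduces the same set of candidate solutions and therefore the same final answer.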
Just using the examples in the README here, the exact version `solve_tsp_dynamic_programming` returns [0, 1, 3, 2] as the permutation. However, the heuristic version `solve_tsp_local_search` would sometimes return [0, 2, 3, 1]. In fact, `solve_tsp_local_search` can return either [0, 1, 3, 2] or [0, 2, 3, 1]; it seems to choose between them at random. Is this intended? I'd rather the output be deterministic. I have large data sets, and the exact method takes too much time and memory.