N-Wouda / Euro-NeurIPS-2022

OptiML's contribution to the EURO meets NeurIPS 2022 vehicle routing competition.

Algorithm tuning #33

Closed N-Wouda closed 2 years ago

N-Wouda commented 2 years ago

There are lots of parameters to the HGS algorithm, not many of which seem to be tuned particularly well. At some point, we should run a tuning tool (e.g., smac) to determine a good set of parameters.

Parameters, in this sense, also include "which operators to use" (see also e.g. #32).

N-Wouda commented 2 years ago

I'm running 200 scenarios with Leon's suggested parameters (see #152 for details), and five solver seeds each. So 1,000 experiments, each lasting up to 8h. The first 200 of those are underway, and I hope the rest completes by tomorrow.

jmhvandoorn commented 2 years ago

Perfect. Let me know if there would be anything left to be done (or run).

leonlan commented 2 years ago

I ran main on final time limits for nbIter 2K-10K (1K step size), and 12.5K-20K (2.5K step size), each with 10 seeds. This is the result:

Seems like 5K, 8K and 12.5K improve the baseline (10K) the most. I'm now running these values on 30 seeds and will pick the best one.
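For reference, the sweep described in this comment can be enumerated as below. This is only a sketch of the grid: the names `NB_ITER_GRID`, `SEEDS` and `runs` are hypothetical, and the actual seed values used in the experiment are not stated in the thread (1 to 10 is an assumption).

```python
from itertools import product

# nbIter values from the comment: 2K-10K in 1K steps, 12.5K-20K in 2.5K steps.
NB_ITER_GRID = list(range(2_000, 10_001, 1_000)) + list(range(12_500, 20_001, 2_500))

# Ten seeds per value; the concrete seed values are an assumption.
SEEDS = list(range(1, 11))

# One (nbIter, seed) pair per solver run.
runs = list(product(NB_ITER_GRID, SEEDS))

print(f"{len(NB_ITER_GRID)} nbIter values, {len(runs)} runs in total")
```

The baseline value (10K) is part of the grid, so every other value can be compared against it on the same seeds.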

N-Wouda commented 2 years ago

The first batch of 500 experiments has nearly finished. I just started the second batch, which should hopefully complete overnight or early tomorrow morning. Then we should have a final dynamic config later tomorrow.

@leonlan so as long as we pick something in (5K, 10K), we're more or less set, with perhaps good values being either 5K or 8K? I had expected a slightly smoother figure, and am now unsure what to make of this exactly.

leonlan commented 2 years ago

I was surprised by that as well. Turns out my experiments were not using 10 different seeds but just a single fixed value. 🤦 I'll rerun the experiment for a subset of the values (5k-10k) but with correct seeds.
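The seed mix-up is easy to reproduce in a toy setting. The sketch below is not the project's benchmark code: `run_solver` is a hypothetical stand-in that returns a fake, seed-dependent cost, and the seed values are assumptions. It only illustrates why ten runs on one fixed seed carry no more information than a single run.

```python
import random


def run_solver(nb_iter: int, seed: int) -> float:
    """Hypothetical stand-in for one solver run; returns a fake cost.

    The fake cost depends only on the seed, which is enough to show the bug.
    """
    rng = random.Random(seed)
    return rng.uniform(0.0, 1.0)


def sweep(nb_iter_values, seeds):
    """Run every nbIter value once per seed and collect the costs."""
    return {n: [run_solver(n, s) for s in seeds] for n in nb_iter_values}


# Buggy setup: the same seed repeated ten times.
buggy = sweep([5_000, 8_000], seeds=[1] * 10)

# Intended setup: ten different seeds.
fixed = sweep([5_000, 8_000], seeds=range(1, 11))

print(len(set(buggy[5_000])), "distinct result(s) with a fixed seed")
print(len(set(fixed[5_000])), "distinct results with different seeds")
```

A cheap guard is to check the number of distinct seeds before launching a long sweep.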

(My 30 seed experiment failed anyhow because I ran out of budget on my GPU account)

N-Wouda commented 2 years ago

#152 now contains the new dynamic parameters. I'm running a few more evaluations (including the baseline) to make sure they're, in fact, better than what we had. Expect those results in 2-4 hours.

leonlan commented 2 years ago

These are the correct results for nbIter:

The differences are minimal when comparing 8K to 10K (-5 pts). Similar to the last figure, I would have expected the figure to be a bit smoother around 9K. I don't think it's worth changing the nbIter value.

N-Wouda commented 2 years ago

> Expect those results in 2-4 hours.

| Time limit | Baseline | Tuned |
|---|---|---|
| Quali | 370991.3 | 369459.2 |
| Final | 369536.8 | 368541.7 |

So an improvement of 1.5K with the qualification time limit, and around 1K using the final time limit.
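The improvements quoted above follow directly from the table. A minimal check, using only the four numbers from this comment (the names `results` and `improvement` are made up for illustration):

```python
# (baseline, tuned) average costs per time limit, copied from the table above.
results = {
    "Quali": (370991.3, 369459.2),
    "Final": (369536.8, 368541.7),
}

# Lower cost is better, so the improvement is baseline minus tuned.
improvement = {
    name: round(baseline - tuned, 1)
    for name, (baseline, tuned) in results.items()
}

for name, delta in improvement.items():
    print(f"{name}: {delta}")
```

This gives roughly 1532 for the qualification time limit and 995 for the final time limit, matching the "1.5K" and "around 1K" figures.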