Closed Innixma closed 1 year ago
Things have changed since October. At the moment there is a considerable difference in spot instance pricing (m5.2xlarge at $0.1047/hr vs m6i.2xlarge at $0.1664/hr), and this holds true for most regions (a gap of around 50%). While this can be more or less counteracted by using reduced train times, if wall-clock time is the only benefit I would prefer to keep the same runtime and instance types as previous experiments, at least for our revision.
Hey @PGijsbers, that is totally fair! I would say it would be good to keep an eye on this, as the value proposition could shift over time depending on outside demand. A basic expectation is that m6i.2xlarge is 43% faster than m5.2xlarge, so for the two to break even when m5.2x is at $0.10/hr, m6i.2x would need to be at $0.143/hr. As you mention, this means m5.2x is the way to go for spot pricing at present (although not true for all regions, as I note below).
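The break-even arithmetic above is simple to check; here is a minimal sketch, assuming the 43% speedup and the $0.10/hr example price from the comment (the function name is just for illustration):

```python
# Break-even m6i spot price: the hourly price at which m6i delivers the
# same compute per dollar as m5, given m6i is ~43% faster.
SPEEDUP = 1.43  # m6i throughput relative to m5 (from the Geekbench comparison)

def break_even_m6i_price(m5_price_per_hr: float, speedup: float = SPEEDUP) -> float:
    """m6i hourly price at which both instances cost the same per unit of compute."""
    return m5_price_per_hr * speedup

print(round(break_even_m6i_price(0.10), 3))    # 0.143 -> m6i breaks even at ~$0.143/hr
print(round(break_even_m6i_price(0.1047), 4))  # ~$0.1497/hr, using the quoted m5.2x spot price
```

Since the quoted m6i.2xlarge spot price ($0.1664/hr) is above both break-even figures, m5.2x wins on spot pricing here.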
The other benefit of keeping m5.2x is that we can compare directly with prior benchmark runs (such as the 2022 paper results), without having to worry about effective compute time differences.
Note: Europe (Stockholm) currently has m6i.2x at $0.1204/hr, which is a pretty good price. This is something you should probably decide right before running the benchmark, since prices seem to fluctuate significantly between regions.
I will close this PR. As it stands, which EC2 instance is right should be evaluated before running experiments, and I don't see a good reason to change the default at this moment. The small benefit of slightly cheaper (or slightly more) compute in Stockholm doesn't outweigh the other m5.2xlarge benefits right now, in my opinion (I had also noticed that cheap region). It is something we would still consider in the future, and for new experiments I would recommend users explore it (if they don't want to compare directly to previously obtained results), but I don't see that as a reason to keep the PR open.
I propose that future benchmarks should swap from m5 to m6i AWS instances. m6i instances are the next generation of CPU instances and offer superior performance for a very similar price.

m6i.2xlarge Geekbench result: https://browser.geekbench.com/v5/cpu/18064421
m5.2xlarge Geekbench result: https://browser.geekbench.com/v5/cpu/17423989

These results show that m6i single-core performance is 43% faster than m5, and overall multi-core performance is 36% faster.
My experiments show that AutoGluon trains ~40% faster on m6i than on m5. Notably, this also improves inference speed by a similar amount. I expect that these speedups will be similar for all frameworks, since it is a generic CPU speedup.

To match the compute of m5 for 1 hour, we only need to train for ~43 minutes on m6i.
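That equivalent-budget conversion is just a division; a minimal sketch, assuming the ~40% AutoGluon training speedup reported above (the function name is illustrative):

```python
# Convert an m5 time budget into the m6i budget that buys the same compute,
# assuming AutoGluon's ~40% training speedup on m6i.
SPEEDUP = 1.40  # m6i training throughput relative to m5

def equivalent_m6i_minutes(m5_minutes: float, speedup: float = SPEEDUP) -> float:
    """Minutes on m6i that match the given number of minutes of m5 compute."""
    return m5_minutes / speedup

print(round(equivalent_m6i_minutes(60)))  # 43 -> ~43 minutes on m6i matches 1 hour on m5
```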
At time of writing, the on-demand prices in the US East 2 (Ohio) region for m6i and m5 are identical, and spot pricing in US East 2 (Ohio) is nearly identical as well.

I think this points to m6i as being the cost-efficient instance for future benchmarks.
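As a rough sanity check of that conclusion: at identical hourly prices, cost per unit of compute is just price divided by throughput. A sketch with the m5 price normalized to 1.0 (the actual Ohio figures are not reproduced in this extract):

```python
# Relative cost per unit of compute at identical hourly prices,
# normalizing the m5 price to 1.0 and taking m6i as ~43% faster.
m5_cost_per_compute = 1.0 / 1.00
m6i_cost_per_compute = 1.0 / 1.43

print(round(m6i_cost_per_compute, 2))  # 0.7 -> ~30% cheaper per unit of compute
```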