openml / automlbenchmark

OpenML AutoML Benchmarking Framework
https://openml.github.io/automlbenchmark
MIT License

Optionally limit inference time measurements by dataset size #538

Closed PGijsbers closed 1 year ago

PGijsbers commented 1 year ago

Measuring inference time on very wide datasets is problematic if the batch size is sufficiently large. This stems from two issues:

This PR adds an option to measure inference time only on batches that do not exceed the size of the initial dataset. With this option enabled, the first issue should be addressed in almost all cases, since the batch size can never exceed the dataset size. The second issue may remain, but becomes less likely because the maximum batch size is now proportional to the training data on which the AutoML framework already evaluated its models.

This seems like a fair compromise: it is reasonable to assume that inference batches in practice will not be (significantly) larger than the training dataset.
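A minimal sketch of the filtering described above, assuming a list of candidate batch sizes and a flag for the new option (the function and parameter names here are hypothetical, not the framework's actual API):

```python
# Hypothetical sketch: only measure inference time for batches that do
# not exceed the training dataset size, when the option is enabled.
def batch_sizes_to_measure(candidate_sizes, n_train_rows, limit_by_dataset_size=True):
    """Return the batch sizes whose inference time should be measured."""
    if not limit_by_dataset_size:
        return list(candidate_sizes)
    # Drop any batch size larger than the dataset itself.
    return [size for size in candidate_sizes if size <= n_train_rows]

# With a 500-row training set, the 1000- and 10000-row batches are skipped.
print(batch_sizes_to_measure([1, 100, 1000, 10000], n_train_rows=500))
# → [1, 100]
```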

It also lowers the default number of repeats from 100 to 10. From my own (admittedly small-scale) experiments, the variance and cold-start effects aren't big enough to require more than 10 measurements to filter out.
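The repeated-measurement idea can be sketched as follows; this is an illustration of why a small number of repeats with a robust summary (here the median) suffices to damp cold-start outliers, not the framework's actual timing code:

```python
import statistics
import time

# Hypothetical sketch: time an inference call a few times and summarize
# with the median, which is insensitive to a slow cold-start first call.
def measure_inference_seconds(predict, batch, repeats=10):
    """Return the median wall-clock time of `predict(batch)` over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

Taking the median rather than the mean means a single slow first measurement (the cold start) cannot skew the result, so 10 repeats is typically enough.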