Question Results AutoML Benchmark Time

cynthiamaia commented 2 years ago

Hi guys! I have a question about the results table generated by the AutoML benchmark. It has three timing columns: duration, training time, and prediction time. What is the difference between duration and training time? Does training time cover only the time spent searching the space of candidate pipelines?

sebhrusen commented 2 years ago

Hi @cynthiamaia, you're correct: training_time is the search time the framework uses to produce the best pipeline under the given constraints, and we measure it because we impose a constraint on it. The prediction_time is then measured independently (when possible). By contrast, duration is a purely technical metric: it may also include data preparation done before training starts, as well as information extraction once training and prediction are complete. We use it mainly to check for unusual behaviour, and also because there is a hard technical limit on this duration, after which the tool tries to kill the framework's process(es).
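To make the relationship concrete, here is a minimal sketch of how the three values could be compared. It assumes duration covers training and prediction plus the extra setup and extraction work described above. The `TimingRow` class, the `overhead` helper, and the numbers are all hypothetical, for illustration only; they are not taken from a real results file or from the benchmark's code.

```python
from dataclasses import dataclass


@dataclass
class TimingRow:
    """One hypothetical row of timing values, all in seconds."""
    task: str
    duration: float         # overall technical duration of the run
    training_time: float    # search time spent producing the best pipeline
    prediction_time: float  # time spent predicting, measured separately


def overhead(row: TimingRow) -> float:
    """Time in `duration` not explained by training or prediction.

    Assumption: duration includes both training and prediction, so the
    remainder is data preparation, information extraction, and so on.
    """
    return row.duration - (row.training_time + row.prediction_time)


# Made-up values, for illustration only.
rows = [
    TimingRow("task_a", duration=3720.0, training_time=3590.0, prediction_time=12.5),
    TimingRow("task_b", duration=4100.0, training_time=3600.0, prediction_time=20.0),
]

for row in rows:
    print(f"{row.task}: overhead = {overhead(row):.1f}s")
```

An unusually large overhead on one task is the kind of behaviour that duration is meant to help spot.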

sebhrusen commented 2 years ago

Please don't hesitate to use GitHub Discussions rather than issues for this kind of question, thank you.

cynthiamaia commented 2 years ago

Thanks! One more question: where can I find the results of the best pipelines found by the frameworks? Are they in the log?

openml/automlbenchmark#482