openml / automlbenchmark

OpenML AutoML Benchmarking Framework
https://openml.github.io/automlbenchmark
MIT License

[Feature Request] autoretry #484

Open · Innixma opened this issue 2 years ago

Innixma commented 2 years ago

I'd like to have an auto-retry option in case an error occurs that is not related to the exec.py call of a framework (for example, an OpenML server error, a failure to get an instance, etc.). This is very important to avoid penalizing AutoML systems in comparisons due to failures outside of their control.

Example API:

python3 runbenchmark.py h2oautoml validation 1h4c -m aws -autoretry 2

This would retry up to 2 times (3 attempts total). It would be very helpful in cases where I run 1040 datasets and 3 of them fail due to problems unrelated to the exec.py logic. Often those failures won't even appear in results.csv, so I have no idea what happened; if I rerun, they will likely succeed, and then another random 3 will fail.
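In the meantime, an external wrapper could approximate this behavior. The following is a minimal sketch, not part of the framework: it assumes runbenchmark.py exits non-zero when a run fails, and it relies on the --resume flag (mentioned in the reply below) to skip experiments already recorded in results.csv.

import subprocess
import sys

# Hypothetical retry wrapper, pending a built-in autoretry option.
MAX_RETRIES = 2  # mirrors "-autoretry 2": up to 3 attempts in total

# Arguments taken from the example command above.
base_cmd = ["python3", "runbenchmark.py", "h2oautoml", "validation", "1h4c", "-m", "aws"]

returncode = 1
for attempt in range(MAX_RETRIES + 1):
    # After the first attempt, resume so completed experiments are
    # skipped and only the missing ones are rerun (assumption: --resume
    # behaves as described in the reply below).
    cmd = base_cmd + (["--resume"] if attempt > 0 else [])
    returncode = subprocess.run(cmd).returncode
    if returncode == 0:
        break

sys.exit(returncode)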

PGijsbers commented 2 years ago

I agree that this would be useful. In the meantime, regarding:

> if I run again they will likely succeed and then another random 3 will fail

I suggest "continuing" the run with --resume. This will automatically skip any experiments already present in results.csv, which makes it easy to execute only the missing ones.
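For instance, reusing the command from the feature request above (all arguments other than --resume are taken from that example):

python3 runbenchmark.py h2oautoml validation 1h4c -m aws --resume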