de-randomize benchmarks

automl / HPOlib1.5

GNU General Public License v3.0


de-randomize benchmarks #2

Open KEggensperger opened 7 years ago

KEggensperger commented 7 years ago

Every benchmark should accept an rng, whether or not it uses it. Currently, no benchmark accepts a seed or rng, so each one shuffles data and creates models in a way that cannot be reproduced:

Examples:
https://github.com/automl/HPOlib2/blob/master/hpolib/benchmarks/ml/svm_benchmark.py#L41
https://github.com/automl/HPOlib2/blob/master/hpolib/benchmarks/ml/fully_connected_network.py#L103

The default should be None, which means that the rng will be instantiated with a random seed.
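A minimal sketch of what such a constructor argument could look like. The names `get_rng`, `ToyBenchmark`, and `shuffled_indices` are hypothetical and not taken from the HPOlib code base, and the standard library's `random.Random` stands in for whatever generator type the benchmarks actually use:

```python
import random


def get_rng(rng=None):
    """Turn None, an int seed, or an existing generator into a random.Random.

    Hypothetical helper for illustration only.
    """
    if rng is None:
        # No seed given: the generator is seeded from system entropy.
        return random.Random()
    if isinstance(rng, random.Random):
        # Reuse the caller's generator as-is.
        return rng
    # Anything else is treated as a seed.
    return random.Random(rng)


class ToyBenchmark:
    """Stand-in benchmark that only shuffles data indices."""

    def __init__(self, rng=None):
        self.rng = get_rng(rng)

    def shuffled_indices(self, n):
        indices = list(range(n))
        self.rng.shuffle(indices)
        return indices
```

With this pattern, two instances created with the same seed shuffle identically, while the default of None keeps the behaviour random.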

KEggensperger commented 7 years ago

FIXED with a0cce4fcd15038c116285bc41844cd5fc0278218

KEggensperger commented 7 years ago

I am reopening this issue. Right now, evaluating the same configuration twice in a row returns different performances, because the RNG is instantiated once in __init__() and its state advances with every evaluation.

I propose also adding an rng argument to the objective function, so that a benchmark can be made deterministic when necessary. It would default to None, in which case the class-level RNG is used.
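A rough sketch of this idea, assuming a toy benchmark whose only source of randomness is additive noise. The class `NoisyBenchmark` and its objective are made up for illustration, and `random.Random` again stands in for the real generator type:

```python
import random


class NoisyBenchmark:
    """Toy benchmark: x**2 plus Gaussian noise (illustrative only)."""

    def __init__(self, rng=None):
        # Class-level generator, created once per instance.
        self.rng = random.Random(rng)

    def objective_function(self, x, rng=None):
        # rng=None falls back to the class-level generator, whose state
        # advances from call to call. An explicit seed gives a fresh
        # generator, so the result depends only on (x, rng).
        local_rng = self.rng if rng is None else random.Random(rng)
        return x ** 2 + local_rng.gauss(0.0, 0.1)


bench = NoisyBenchmark(rng=0)
first = bench.objective_function(2.0)          # uses the class-level RNG
second = bench.objective_function(2.0)         # same RNG, advanced state
seeded_a = bench.objective_function(2.0, rng=5)
seeded_b = bench.objective_function(2.0, rng=5)
```

Here `first` and `second` differ even though the configuration is the same, whereas `seeded_a` and `seeded_b` are identical.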

@aaronkl: Any thoughts on this, since you fixed the previous issue?

KEggensperger commented 7 years ago

When building cross-validation benchmarks, where we want to evaluate one fold at a time, the current workflow would evaluate a different subset of data points each time the objective function is called (see the explanation above).

I propose moving the seed from __init__() to objective_function()/objective_function_test(), so that every function evaluation gets its own seed.
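To illustrate why a per-call seed matters for fold-wise evaluation, here is a small sketch. The function `fold_indices` is hypothetical and not part of HPOlib; it only shows that reusing the same seed across calls yields one consistent partition of the data:

```python
import random


def fold_indices(n_points, n_folds, fold, seed):
    """Return (train, valid) index lists for one fold of a seeded shuffle.

    Hypothetical helper: the shuffle depends only on `seed`, so calls for
    different folds with the same seed split the same permutation.
    """
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    # Every n_folds-th element of the permutation, starting at `fold`.
    valid = indices[fold::n_folds]
    valid_set = set(valid)
    train = [i for i in indices if i not in valid_set]
    return train, valid


# Evaluate one fold per call; the shared seed keeps the folds disjoint.
folds = [fold_indices(10, 3, fold, seed=42) for fold in range(3)]
```

If each call drew a fresh shuffle from an RNG whose state keeps advancing, the validation sets of different folds could overlap and some data points might never be held out.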

@aaronkl @mfeurer Would you mind if we change that? Or do you have a simpler solution?

KEggensperger commented 7 years ago

@aaronkl would be okay with this as long as it is still possible to not specify a seed/rng. In that case, the rng created in __init__() will be used.