automl / ParameterImportance

Parameter Importance Analysis Tool
http://www.ml4aad.org/
BSD 3-Clause "New" or "Revised" License

Broken fANOVA #16

Open mlindauer opened 7 years ago

mlindauer commented 7 years ago

Running on the spear-qcp example (from SMAC3)

$ python ~/git/ParameterImportance/scripts/evaluate.py --scenario_file smac3-output_2017-05-12_12\:53\:31_\(750603\)_run1/scenario.txt --history smac3-output_2017-05-12_12\:53\:31_\(750603\)_run1/runhistory.json --trajectory smac3-output_2017-05-12_12\:53\:31_\(750603\)_run1/traj_aclib2.json  --num_params 10 --modus all
[...]
INFO:fANOVA:PREPROCESSING PREPROCESSING PREPROCESSING PREPROCESSING PREPROCESSING PREPROCESSING
INFO:fANOVA:Finished Preprocessing
Traceback (most recent call last):
  File "/home/lindauer/git/ParameterImportance/scripts/evaluate.py", line 41, in <module>
    result = importance.evaluate_scenario(args.modus)
  File "/home/lindauer/git/ParameterImportance/pimp/importance/importance.py", line 240, in evaluate_scenario
    self.evaluator = method
  File "/home/lindauer/git/ParameterImportance/pimp/importance/importance.py", line 160, in evaluator
    to_evaluate=self._parameters_to_evaluate)
  File "/home/lindauer/git/ParameterImportance/pimp/evaluator/fanova.py", line 31, in __init__
    self.evaluator = fanova_pyrfr(X=self.X, Y=self.y.flatten(), config_space=cs, config_on_hypercube=True)
TypeError: __init__() got an unexpected keyword argument 'config_on_hypercube'
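The `TypeError` means the installed `fanova` package's constructor doesn't accept the `config_on_hypercube` keyword, i.e. a different version/branch than the one PIMP was written against. A quick, hedged way to diagnose this kind of mismatch (the helper name is hypothetical, not part of either library) is to inspect the constructor signature before calling it:

```python
# Hypothetical compatibility check: does the installed class's __init__
# accept a given keyword argument (e.g. 'config_on_hypercube')?
import inspect


def supports_kwarg(cls, kwarg):
    """Return True if cls.__init__ accepts `kwarg`, either explicitly
    or via a **kwargs catch-all."""
    params = inspect.signature(cls.__init__).parameters
    return kwarg in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
```

Running `supports_kwarg(fanova_pyrfr, "config_on_hypercube")` against the installed package would have flagged the wrong branch before the call failed.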
AndreBiedenkapp commented 7 years ago

@mlindauer: Did you install fANOVA as listed in the requirements file, i.e. git+http://github.com/automl/fanova@952c9bd46b47cde87036c00f974629c9e5819565? (I'm still working on getting a tag for fanova so we don't have to reference commit hashes.) If so, this shouldn't happen. If you installed it manually, you installed the wrong branch; the pyrfr_reimplementation branch is the one we need here.

mlindauer commented 7 years ago

Yes, indeed, I used the wrong branch. After reinstalling fANOVA, I'm now getting a new error:

INFO:Importance:Running evaluation method fANOVA
Traceback (most recent call last):
  File "/home/lindauer/git/ParameterImportance/scripts/evaluate.py", line 41, in <module>
    result = importance.evaluate_scenario(args.modus)
  File "/home/lindauer/git/ParameterImportance/pimp/importance/importance.py", line 247, in evaluate_scenario
    return {evaluation_method: self.evaluator.run()}
  File "/home/lindauer/git/ParameterImportance/pimp/evaluator/fanova.py", line 76, in run
    idx, param.name, self.evaluator.quantify_importance([idx])[(idx, )]['total importance']))
  File "/home/lindauer/anaconda3/lib/python3.6/site-packages/fanova/fanova.py", line 280, in quantify_importance
    [self.V_U_total[sub_dims][t] / self.trees_total_variance[t] for t in range(self.n_trees)])
  File "/home/lindauer/anaconda3/lib/python3.6/site-packages/fanova/fanova.py", line 280, in <listcomp>
    [self.V_U_total[sub_dims][t] / self.trees_total_variance[t] for t in range(self.n_trees)])
ZeroDivisionError: float division by zero
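The division that fails is `V_U_total[sub_dims][t] / trees_total_variance[t]`, so at least one tree in the forest has zero total variance. A minimal sketch of a defensive fix (a hypothetical helper, not the actual fANOVA API) would skip such degenerate trees when averaging the per-tree importance fractions:

```python
# Sketch of guarding the per-tree importance average against trees with
# zero total variance (the condition behind the ZeroDivisionError above).
# This is an illustrative helper, not code from the fanova package.
import numpy as np


def fraction_of_variance(v_u_total, trees_total_variance):
    """Average V_U / total_variance over trees, skipping trees whose
    total variance is zero (e.g. trees fit on constant responses)."""
    fractions = [
        v / t for v, t in zip(v_u_total, trees_total_variance) if t > 0
    ]
    if not fractions:
        return 0.0  # every tree was constant; nothing to attribute
    return float(np.mean(fractions))
```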
sfalkner commented 7 years ago

This means that one of the trees doesn't have any variance, which can happen in the following scenarios:

Nonetheless, the error shouldn't happen, and this issue actually belongs in the fANOVA repository. Could you guys take a snapshot of the data fed into fANOVA so I can reproduce and fix it on my side? Thanks!
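Capturing such a snapshot is straightforward: dump the `X` and `y` arrays exactly as they are passed into the fANOVA constructor (in `pimp/evaluator/fanova.py` above, that is `self.X` and `self.y.flatten()`). A minimal sketch, with illustrative function and file names:

```python
# Sketch: persist the arrays handed to fANOVA so a failure can be
# reproduced standalone. Function and file names are illustrative.
import numpy as np


def snapshot_fanova_input(X, y, path_prefix="fanova_input"):
    """Save the design matrix and response vector to a single .npz file."""
    np.savez(path_prefix + ".npz", X=np.asarray(X), y=np.asarray(y))


def load_snapshot(path_prefix="fanova_input"):
    """Reload a snapshot for a standalone reproduction script."""
    data = np.load(path_prefix + ".npz")
    return data["X"], data["y"]
```

Attaching the resulting `.npz` file to the issue would let the fANOVA side replay the exact input that triggers the `ZeroDivisionError`.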