ai-se / hyperall

Compares and contrasts hyperparameter tuning techniques
https://arxiv.org/abs/1807.11112

Initial Results #15

HuyTu7 opened this issue 6 years ago

HuyTu7 commented 6 years ago

Concern: the IQR of the ranks is large.

Attached charts: 25_f1, 25_precision, 50_f1, 50_precision, 100_f1, 100_precision.
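
For reference, a minimal sketch of how the spread of the ranks could be quantified; the rank values below are illustrative placeholders, not the actual results:

```python
# Minimal sketch: median and IQR of one optimizer's ranks across data sets.
# The rank values are illustrative placeholders, not the study's results.
import numpy as np

# hypothetical Scott-Knott ranks of one optimizer on ten data sets
ranks = np.array([1, 1, 2, 4, 1, 3, 1, 2, 4, 1])

q1, q3 = np.percentile(ranks, [25, 75])
print("median rank:", np.median(ranks))
print("IQR:", q3 - q1)  # a large IQR means the ranking is unstable across data sets
```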


timm commented 6 years ago
vivekaxl commented 6 years ago
  • why an eval budget of 50, 100? what do the smac people say about that?

This is to show how the effectiveness of the optimizer depends on the number of evaluations (a minimal sketch of this follows after these questions and answers).

  • what is DTC?

Decision tree classifier.

Roger that.

  • results should not be presented only in aggregate across N data sets. need specifics per data set.

ok

  • when do we get to see FLASH results?

Friday.

  • need to see the runtimes (or #evals) of each method... so we can assess the performance gains versus the computational effort

Will do. Ken is on top of it.
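
For illustration, a minimal sketch of how the best-found score grows with the evaluation budget, using random search over a toy objective as a stand-in for the real optimizers and learners (both are assumptions, not the study's setup):

```python
# Minimal sketch: the best score an optimizer finds depends on its evaluation budget.
# Random search over a toy objective stands in for the real optimizers and learners.
import random

def objective(x):
    # toy objective with its maximum (1.0) at x = 0.7
    return 1.0 - (x - 0.7) ** 2

def run_with_budget(n_evals, seed=0):
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(n_evals):
        best = max(best, objective(rng.uniform(0, 1)))
    return best

for budget in (25, 50, 100):
    print(budget, "evaluations ->", round(run_with_budget(budget), 4))
```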

timm commented 6 years ago

the above sounds great

but you dodged one question about your eval budget:

what do the smac people say about that?

vivekaxl commented 6 years ago

what do the smac people say about that?

The SMAC people say that the number of evaluations is not a good stopping criterion in a real-world setting; instead, the stopping rule should be defined in terms of wall-clock time. That is why an evaluation-based stopping criterion is not even documented.
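
To make that distinction concrete, a minimal sketch of the two stopping rules; this is a generic search loop, not SMAC's actual interface, and all names are illustrative:

```python
# Minimal sketch: evaluation-count stopping vs. wall-clock stopping.
# Generic random-search loop, not SMAC's API.
import random
import time

def objective(x):
    time.sleep(0.01)  # pretend each evaluation costs real time
    return 1.0 - (x - 0.7) ** 2

def search(max_evals=None, max_seconds=None, seed=0):
    rng = random.Random(seed)
    start, evals, best = time.time(), 0, float("-inf")
    while True:
        if max_evals is not None and evals >= max_evals:
            break  # budget expressed as a number of evaluations
        if max_seconds is not None and time.time() - start >= max_seconds:
            break  # budget expressed as wall-clock time
        best = max(best, objective(rng.uniform(0, 1)))
        evals += 1
    return best, evals

print(search(max_evals=50))        # evaluation-based budget (what the experiments use)
print(search(max_seconds=1.0))     # time-based budget (what the SMAC authors recommend)
```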

HuyTu7 commented 6 years ago

Each image file in the folder for an n_evaluations result is the Scott-Knott test chart for f1, precision, and time, respectively: example

25 evaluations results
50 evaluations results

Summary of the results: consolidated results.
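
For anyone unfamiliar with the ranking step, here is a minimal sketch of a Scott-Knott-style procedure: treatments are sorted by median, recursively split at the point that maximizes the between-group difference, and a split is kept only if the two halves are statistically distinguishable. The Mann-Whitney U check and the scores below are stand-ins, not necessarily the exact test or data used in these results:

```python
# Minimal sketch of a Scott-Knott-style ranking. Groups are sorted by median
# ascending, so rank 0 goes to the lowest-scoring group. Simplified: a
# Mann-Whitney U test stands in for whatever significance/effect-size test
# the actual scripts use.
from statistics import mean, median
from scipy.stats import mannwhitneyu

def best_split(groups):
    """Index that maximizes the between-group difference of means."""
    all_vals = [v for _, vals in groups for v in vals]
    mu, best_gain, best_i = mean(all_vals), -1.0, None
    for i in range(1, len(groups)):
        left = [v for _, vals in groups[:i] for v in vals]
        right = [v for _, vals in groups[i:] for v in vals]
        gain = (len(left) * (mean(left) - mu) ** 2 +
                len(right) * (mean(right) - mu) ** 2) / len(all_vals)
        if gain > best_gain:
            best_gain, best_i = gain, i
    return best_i

def scott_knott(groups, rank=0, out=None):
    """groups: list of (name, scores); returns {name: rank}."""
    out = {} if out is None else out
    groups = sorted(groups, key=lambda g: median(g[1]))
    i = best_split(groups) if len(groups) > 1 else None
    if i is not None:
        left = [v for _, vals in groups[:i] for v in vals]
        right = [v for _, vals in groups[i:] for v in vals]
        if mannwhitneyu(left, right, alternative="two-sided").pvalue < 0.05:
            scott_knott(groups[:i], rank, out)
            scott_knott(groups[i:], max(out.values()) + 1, out)
            return out
    for name, _ in groups:
        out[name] = rank
    return out

# illustrative f1 scores of three optimizers over repeated runs
print(scott_knott([("SMAC", [0.62, 0.65, 0.63]),
                   ("DE",   [0.60, 0.61, 0.59]),
                   ("RAND", [0.50, 0.52, 0.51])]))
```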

Notes:

  1. RF is the clear learner winner and SMAC is the clear optimizer winner.
  2. The default configuration of learners is also very effective. Also, note that we have not considered cases where there is no clear winner.
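
On point 2, a minimal sketch of a default-vs-tuned comparison with scikit-learn; the synthetic data, parameter ranges, and tuning budget are illustrative, not the study's setup:

```python
# Minimal sketch: default RandomForest vs. one tuned by random search.
# Synthetic data and a tiny tuning budget; purely illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

default = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

tuned = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [50, 100, 200],
     "max_depth": [None, 5, 10, 20],
     "min_samples_leaf": [1, 2, 4, 8]},
    n_iter=25, scoring="f1", random_state=0,
).fit(X_tr, y_tr)

print("default f1:", round(f1_score(y_te, default.predict(X_te)), 3))
print("tuned   f1:", round(f1_score(y_te, tuned.predict(X_te)), 3))
```
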
timm commented 6 years ago

FFT? FLASH?