jansel / opentuner

An extensible framework for program autotuning
http://opentuner.org/
MIT License
382 stars 112 forks source link

Including 'default' as a choice for boolean flags. #119

Open tdeneau opened 5 years ago

tdeneau commented 5 years ago

In the gcc example writeup in Section 3.1 of the paper , I see

it decides between turning the flag on (with -fFLAG), off (with -fnoFLAG), or omitting the flag to allow the default value to take precedence. Including the default value as a choice is not necessary for completeness, but speeds up convergence

Why would including the default value speed up convergence? I would naively think that it would enlarge the search space and increase time to convergence.

jansel commented 5 years ago

Given the type of techniques used that will make stochastic changes and follow gradients in the data, convergence speed is primarily a function of the difficulty of the search space, not its absolute size. As an example, a 2-dimensional flat plateau which provides no gradient and has a single outlier optimal point is far more difficult to search than a 100-dimensional bowl-shaped search space where you can easily follow the slope from any point to the optimal one. This is not true for all search techniques. For example, an exhaustive search would likely not work on the 100-dimensional space given its size.

Duplicating the default choice in GCC distorts the search space such that 2/3rds of the space are the choice the compiler writer thought should be the default and 1/3rd is the other option. This makes OpenTuner more likely to pick the default value and, assuming the writers of GCC chose good defaults, produces an average point in the search space that is more fit.

tdeneau commented 5 years ago

OK that makes a lot of sense. Some questions:

jansel commented 5 years ago

Implementing a WeightedIntegerParameter or WeightedEnum/Boolean would be easy to do. You would implement it with a projection from the unweighted space to a weighted space inside the parameter. You could look at the log scaled group of parameters as example of that.