automl / SMAC3

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization
https://automl.github.io/SMAC3/v2.2.0/

[Question] Algorithm configuration for a small config space - too few evaluations #1160

Open graidl opened 1 week ago

graidl commented 1 week ago

We intend to use SMAC with the AlgorithmConfigurationFacade to optimize only a few discrete parameters with small domains for a stochastic metaheuristic. Our issue is that SMAC stops far too early.

To be a bit more specific:

Let the config space be

config_space = ConfigurationSpace({
        "y": (1, 3), 
        "z":["opt1", "opt2"],
    })

Thus, we have just 3*2=6 possible configurations. Moreover, let us use just one instance.

SMAC already stops after about 26 algorithm evaluations with the message Could not return a new configuration after 16 retries, even though we set n_trials to 2000 and deterministic to False, and even though the metaheuristic returned quite different values for the same configurations.

Even though the search space for SMAC is very small, clearly a larger number of evaluations would be needed to conclude with some statistical significance that one configuration is best.

What am I missing?

AlgorithmConfigurationFacade sets max_config_calls in the intensifier to 2000, so this cannot be the issue.
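
For completeness, this is roughly how I would expect to raise the relevant limits explicitly (a sketch based on my reading of the SMAC3 v2 source; the max_config_calls and retries arguments are assumptions on my part, not documented knobs):

from ConfigSpace import ConfigurationSpace
from smac import AlgorithmConfigurationFacade, Scenario
from smac.main.config_selector import ConfigSelector

config_space = ConfigurationSpace({"y": (1, 3), "z": ["opt1", "opt2"]})

def target_function(config, instance, seed) -> float:
    ...  # placeholder for the metaheuristic wrapper

scenario = Scenario(config_space, deterministic=False, n_trials=2000)

smac = AlgorithmConfigurationFacade(
    scenario,
    target_function,
    # Cap on how often a single configuration may be evaluated.
    intensifier=AlgorithmConfigurationFacade.get_intensifier(
        scenario, max_config_calls=2000
    ),
    # How often the selector retries to propose an unseen configuration;
    # the "after 16 retries" message suggests a default of 16.
    config_selector=ConfigSelector(scenario, retries=64),
    overwrite=True,
)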

Note also that the table in https://automl.github.io/SMAC3/main/3_getting_started.html says that the AlgorithmConfigurationFacade uses the Hyperband Intensifier, which, however, I don't think is true.

timruhkopf commented 1 week ago

Hi, thank you for your issue!

Could you maybe give us a minimal working example to reproduce the problem that you are facing? From what I gather, this should be the relevant line from the log.

Regarding your note: you are absolutely right, and I will open a documentation issue (#1165) to fix our previous oversight.

timruhkopf commented 1 week ago

Looking into the issue a bit more, there are two reasons why the config can end up in ConfigSelector._processed_configs and thus no longer count as new:

  1. You added the config to the runhistory before starting the optimization.
  2. The initial design is already exhaustive with respect to your tiny search space, in which case there is no new configuration left to propose.

Both cases increase the retry counter whenever a pre-existing configuration is drawn again.

If you only have so few design decisions in your search space, why don't you just enumerate them? Are you investigating seeds or instances?
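
For illustration, exhaustive enumeration with a fixed seed budget could look like this (a minimal sketch; run_metaheuristic and the budget of 30 seeds are hypothetical stand-ins, not SMAC API):

import itertools
import random
import statistics

# Hypothetical stand-in for the stochastic metaheuristic being tuned.
def run_metaheuristic(y: int, z: str, seed: int) -> float:
    rng = random.Random(seed)
    return y + rng.random()  # placeholder cost

results = {}
for y, z in itertools.product([1, 2, 3], ["opt1", "opt2"]):
    # Evaluate every configuration on the same fixed set of seeds.
    costs = [run_metaheuristic(y, z, seed) for seed in range(30)]
    results[(y, z)] = statistics.mean(costs)

best = min(results, key=results.get)
print("Best configuration:", best, "with mean cost", results[best])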

graidl commented 1 week ago

Thanks for your answers. Here is a small demo that stops after about 20 evaluations instead of the indicated 200 trials:

from ConfigSpace import Configuration, ConfigurationSpace
from smac import AlgorithmConfigurationFacade, Scenario
import random

config_space = ConfigurationSpace({
        "x": (1, 3), 
        "y": (1, 2),
    })

instances = ["myinst.txt"]
features = {fn: [i] for (i, fn) in enumerate(instances)}

scenario = Scenario(config_space, deterministic=False, 
                    instances=instances, instance_features=features, 
                    n_trials=200)

# Noisy target function: returns a different value for the same
# configuration on every call.
def f(config: Configuration, instance: str, seed: int) -> float:
    x = int(config["x"])
    y = int(config["y"])
    print(f'f({instance}, {seed}, {x}, {y})', end=" -> ")
    res = x - y + random.random()
    print(res)
    return res

smac = AlgorithmConfigurationFacade(scenario, f, overwrite=True)

incumbent = smac.optimize()
print("Optimized configuration: ", incumbent)

The point is that the function `f` for which the parameters should be optimized is not deterministic, which we also indicate in the scenario.

Yes, enumerating all possible configurations is feasible; in this simple example there are just 3*2=6. However, each of them needs to be evaluated a considerable number of times before one can decide that it is (statistically) better than another. SMAC seems to consider a configuration "fully processed" after a very low number of evaluations with different seeds, maybe 3? This is clearly far too few for any statistical significance. I would have expected a parameter for requesting a higher number of seeds per configuration, but did not find one.

Finally, let me explain that in our real application we indeed also want to optimize just two discrete parameters, although with larger domains. Doing a grid search by enumerating all configurations and running e.g. 30 evaluations each is what we wanted to avoid, since each run takes quite some time.

timruhkopf commented 4 hours ago

Thanks for providing us with the minimal working example. We will have a look into it. From what you describe, this likely originates from the way the intensification works.

@mwever raised the point that for classic algorithm configuration we usually assume weak homogeneity, which makes it easier to estimate the performance and act on that estimate more confidently. What you describe, on the other hand, implies quite overlapping performance distributions, so there is no clear winner unless we have a very good (expensively acquired over many seeds) performance estimate. SMAC is intended to reject aggressively, but to be gracious should we re-encounter a configuration.

I have another conceptual issue, but first I want to verify that the seeding is not implicitly creating a new instance. Then we can move forward and assume that the seeds are independent draws from the performance distribution of that algorithm. If you then race algorithms against each other with respect to the number of seeds, and use, say, the mean of the performance distribution to make such a decision, then you are comparing estimators that carry varying amounts of confidence. I am not sure right now whether that is already implemented/intended that way in SMAC.
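
To illustrate that last point: if one configuration has been evaluated on far fewer seeds than another, the two mean estimates come with very different standard errors (a minimal sketch with hypothetical cost samples):

import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)

# Hypothetical cost samples: config A evaluated on 5 seeds, config B on 25.
costs_a = [random.gauss(1.00, 0.5) for _ in range(5)]
costs_b = [random.gauss(0.95, 0.5) for _ in range(25)]

# The standard error of the mean shrinks with the number of seeds, so the
# two estimates carry very different confidence.
se_a = stdev(costs_a) / sqrt(len(costs_a))
se_b = stdev(costs_b) / sqrt(len(costs_b))
print(f"A: {mean(costs_a):.3f} +/- {se_a:.3f} (n={len(costs_a)})")
print(f"B: {mean(costs_b):.3f} +/- {se_b:.3f} (n={len(costs_b)})")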