Closed becktepe closed 3 weeks ago
I think this is probably related to #1131 - currently I'm testing what happens without the hypersweeper too. Performance seems pretty whack, so this could explain why.
Yes, it might be. For my experiments, I set job_array_size_limit=1 in the hypersweeper to avoid any interactions related to parallelization.
Unfortunately this probably doesn't help. I just posted the updated results and even though anytime performance has a logging bug, it looks like SMAC is just doing random search here :/
Maybe this could also have been caused by passing seed=None to the tell() method where SMAC was expecting something else. If this is the case, it might not actually be a SMAC problem
I don't think so, I ran the "vanilla" SMAC version with ask-tell without seed=None and curves were exactly the same. This is the code, produced the same curve as hypersweeper with max_parallel 0. Idk if there's something else that could be wrong here?
import hydra
from carps.utils.running import make_problem
from smac.facade.multi_fidelity_facade import MultiFidelityFacade
from smac.runhistory.dataclasses import TrialValue
from carps.utils.trials import TrialInfo
from smac import Scenario
from pathlib import Path
import json


@hydra.main(config_path=".", config_name="config_vanilla_smac.yaml")
def run_carps(cfg):
    problem = make_problem(cfg=cfg.problem)

    # Scenario object
    scenario = Scenario(
        problem.configspace,
        deterministic=False,
        n_trials=126,
        min_budget=1,
        max_budget=52,
        seed=cfg.seed,
    )
    intensifier = MultiFidelityFacade.get_intensifier(scenario, eta=3)

    # Placeholder target function; trials are evaluated manually via ask/tell
    def dummy(config, seed, budget, **kwargs):
        return 0.0

    # Now we use SMAC to find the best hyperparameters
    smac = MultiFidelityFacade(
        scenario,
        dummy,
        intensifier=intensifier,
        overwrite=True,
    )

    incumbent_config = {}
    incumbent_score = 100000
    budget_used = 0

    # We can ask SMAC which trials should be evaluated next
    for _ in range(126):
        info = smac.ask()
        trial_info = TrialInfo(info.config, seed=cfg.seed, budget=info.budget)
        cost = problem.evaluate(trial_info)
        value = TrialValue(cost=cost.cost, time=0.5)
        budget_used += info.budget
        smac.tell(info, value)

        # Lower cost is better: update the incumbent on improvement
        if -cost.cost > -incumbent_score:
            incumbent_score = cost.cost
            incumbent_config = info.config.get_dictionary()

        # Append the current incumbent to the log after every trial
        log_dict = {}
        log_dict["config"] = incumbent_config
        log_dict["score"] = incumbent_score
        log_dict["budget_used"] = budget_used
        with Path("incumbent.jsonl").open("a") as f:
            json.dump(log_dict, f)
            f.write("\n")


if __name__ == "__main__":
    run_carps()
Actually my results probably have nothing to do with ask-tell after all - the same happens for TF execution mode. So either this issue is unrelated to #1131 or this bug also shows up for the standard TF execution.
Regarding #1131:
When using the hypersweeper v0.2.0 and SMAC v2.1.0 with the default max_parallelization=0.1, n_trials=301, min_budget=100_000, max_budget=10_000_000, and eta=2, we would expect these brackets:
--------------------------------------------------------------------------------
Stage 0
#Configs: [ 64, 32, 16, 8, 4, 2, 1]
Budgets: [ 156250, 312500, 625000, 1250000, 2500000, 5000000, 10000000]
--------------------------------------------------------------------------------
Stage 1
#Configs: [ 38, 19, 9, 4, 2, 1]
Budgets: [ 312500, 625000, 1250000, 2500000, 5000000, 10000000]
--------------------------------------------------------------------------------
But we got this, which is really strange: (I put this into brackets and stages but this doesn't mean it's the brackets and stages SMAC used internally)
--------------------------------------------------------------------------------
Stage 0
#Configs: [ 42, 42, 30, 21, 8]
Budgets: [ 625000, 1250000, 2500000, 5000000, 10000000]
--------------------------------------------------------------------------------
Stage 1
#Configs: [ 64, 38, 8]
Budgets: [ 156250, 312500, 10000000]
--------------------------------------------------------------------------------
Stage 2
#Configs: [ 32, 16]
Budgets: [ 312500, 625000]
--------------------------------------------------------------------------------
Not sure if this is related though
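For reference, the expected schedule quoted above can be reproduced with the standard Hyperband formulas. The following is a minimal sketch, not SMAC's actual implementation: the function name hyperband_brackets is made up for illustration, and eta is assumed to be an integer.

```python
def hyperband_brackets(min_budget, max_budget, eta):
    """Return one (n_configs, budgets) pair per Hyperband bracket.

    Hypothetical helper based on the textbook Hyperband schedule;
    SMAC may compute its brackets differently in detail.
    """
    # Largest s with min_budget * eta**s <= max_budget (avoids float log issues).
    s_max = 0
    while min_budget * eta ** (s_max + 1) <= max_budget:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) / (s + 1) * eta**s),
        # computed with integer arithmetic.
        n_initial = -(-(s_max + 1) * eta**s // (s + 1))
        # Each stage keeps a fraction 1/eta of the previous stage (rounded down).
        n_configs = [n_initial // eta**i for i in range(s + 1)]
        # Budgets grow by a factor eta per stage and end at max_budget.
        budgets = [max_budget / eta ** (s - i) for i in range(s + 1)]
        brackets.append((n_configs, budgets))
    return brackets


if __name__ == "__main__":
    first_two = hyperband_brackets(100_000, 10_000_000, 2)[:2]
    for idx, (n_configs, budgets) in enumerate(first_two):
        print(f"Bracket {idx}: #configs={n_configs}, budgets={budgets}")
```

With min_budget=100_000, max_budget=10_000_000 and eta=2, the first two brackets come out as 64, 32, ..., 1 and 38, 19, 9, 4, 2, 1 configurations, which is the expected table above. Comparing this against the per-budget counts in the run history is a quick way to spot a skipped bracket.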
That looks like a hypersweeper problem, seems like the scenario(?) isn't instantiated correctly? I took a quick look into my hypersweeper vs non-hypersweeper ask-tell brackets locally and those look to be the same with the SMAC default settings (as does the version with TF execution). So I think that's a matter of correctly creating the SMAC components. Could you make an issue for that over there?
I'll run a few more things soon anyway and try to look into checking what happens with changing eta, but I'd be surprised if this is a SMAC bug.
Okay, apparently the issue was caused by passing seed=None to the tell() method, which caused SMAC not to be able to match the configurations with the ones provided through ask(). So it was a bug from my side :D
This is still somewhat confusing to me since I set deterministic=True, so I was not expecting SMAC to use seeds at all. It would be great to have some sort of check for this: the mismatch silently leads to a different Hyperband behaviour, which makes the issue very hard to trace back.
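To illustrate why a replaced seed breaks the matching, and what such a check could look like, here is a toy sketch. TrialKey and check_told_trial are hypothetical names invented for this example and are not part of SMAC's API; the seed value is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrialKey:
    """Hypothetical stand-in for a run-history key; not SMAC's actual class."""

    config_id: int
    seed: Optional[int]
    budget: float


def check_told_trial(asked: set, told: TrialKey) -> None:
    """Raise if a told trial does not match any trial handed out earlier."""
    if told not in asked:
        raise ValueError(f"{told} was never asked for; check seed and budget.")


# The optimizer handed out this trial with a concrete seed.
asked = {TrialKey(config_id=1, seed=209652396, budget=15.625)}

# Same config and budget, but the seed was replaced by None before tell().
told_wrong = TrialKey(config_id=1, seed=None, budget=15.625)
told_right = TrialKey(config_id=1, seed=209652396, budget=15.625)

check_told_trial(asked, told_right)  # matches, no error
try:
    check_told_trial(asked, told_wrong)
except ValueError as err:
    print(err)
```

Because the key compares all three fields, the result reported with seed=None is stored as a different trial and the asked one still looks unevaluated. An explicit check like this would turn the silent mismatch into an immediate error.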
Glad that it works! I have created a new issue to improve documentation and add a check. Will close this. Feel free to open again if issues arise.
Description
I want to use the ask/tell interface for multi-fidelity HPO. In particular, I am using the hypersweeper package to run distributed HPO. However, I noticed that the number of configurations per budget did not match the Hyperband/Successive Halving schedule.
These are the Hyperband brackets:
However, the sweeper executed 64 configurations on budget 15.625, followed by 22 configurations on budget 62.5, instead of 16 configurations on budget 62.5.
After some debugging, I assume the bug is caused by this snippet: https://github.com/automl/SMAC3/blob/9d194754a5fed3ec48be06987cfc24ee99b76af5/smac/intensifier/successive_halving.py#L542-L545
Apparently, the seed field of the isb_keys is always None, whereas the seed fields of the from_keys are filled with random seeds. Therefore, the bracket is skipped and the configurations from the next brackets are used.
With this workaround, I achieve the expected behaviour:
Steps/Code to Reproduce
examples/configs/mlp_smac.yaml
setpython examples/mlp.py --config-name=mlp_smac -m
tmp/mlp_smac/runhistory.csv
Expected Results
64 configs on budget 15.625, 16 configs on budget 62.5, 4 configurations on budget 250, 1 configuration on budget 1000
Actual Results
64 configs on budget 15.625, 22 configs on budget 62.5, 8 configurations on budget 250, 4 configurations on budget 1000
This indicates that only the first stage of each bracket is evaluated.
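The per-budget counts can be tallied from a run history with a few lines of Python. This is only a sketch: the column name budget is an assumption, and the real layout of tmp/mlp_smac/runhistory.csv may differ, so the example runs on a small in-memory CSV instead.

```python
import csv
import io
from collections import Counter


def configs_per_budget(rows, budget_column="budget"):
    """Count evaluated configurations per budget.

    `rows` is an iterable of dicts (e.g. from csv.DictReader);
    `budget_column` is an assumed column name, adjust it to the real file.
    """
    counts = Counter(float(row[budget_column]) for row in rows)
    return dict(sorted(counts.items()))


# Tiny in-memory stand-in for a run history file.
sample = io.StringIO("config_id,budget\n1,15.625\n2,15.625\n1,62.5\n")
print(configs_per_budget(csv.DictReader(sample)))
```

Running the same function on the actual run history and comparing the result with the expected schedule shows directly whether later stages of a bracket were skipped.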
Versions
Hypersweeper from https://github.com/becktepe/hypersweeper
SMAC==2.2.0