Avoiding Cold Start Problem in Bayesian Optimization

HIPS / Spearmint

Spearmint Bayesian optimization codebase


Avoiding Cold Start Problem in Bayesian Optimization #118

Closed zaouk closed 5 years ago

zaouk commented 5 years ago

Hi there! Does anyone know if there's a way to fit the GP to some manually collected training points (hyperparams -> objective) before it starts making its suggestions?

I need Spearmint to recommend parameters by taking into account the objective values already observed in some offline (hyperparameter, objective value) data, as well as the newly seen points from evaluating the objective.

I tried to look at the records of the collected points stored in MongoDB, but since objects with specific ids are inserted in MongoDB, I thought it wouldn't be that straightforward to put the offline collected data in MongoDB before calling Spearmint...

mgelbart commented 5 years ago

A general problem with Spearmint is that it calls your function instead of you asking it for suggestions. This turned out to be a poor design decision for many reasons. You could try another tool, like scikit-optimize, which allows for either paradigm. If you're just asking the BayesOpt package for suggestions, you can feed it whatever data you want before you start -- it won't know the difference between data you collected by evaluating your function or data you're just feeding in because you already had it lying around.
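To make the ask/tell paradigm described above concrete, here is a minimal sketch of an optimizer that is seeded with offline data before it makes its first suggestion. It is a toy, standard-library-only illustration and not the API of scikit-optimize or Spearmint: the class `AskTellOptimizer`, its parameters (`noise`, `kappa`, `grid_size`), the fixed RBF length scale, the lower-confidence-bound acquisition, and the quadratic `objective` are all assumptions chosen for the example.

```python
import math

def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

class AskTellOptimizer:
    """Toy 1-D ask/tell optimizer (minimization) over [low, high]."""

    def __init__(self, low, high, noise=1e-4, kappa=2.0, grid_size=101):
        step = (high - low) / (grid_size - 1)
        self.grid = [low + step * i for i in range(grid_size)]
        self.noise, self.kappa = noise, kappa
        self.xs, self.ys = [], []

    def tell(self, x, y):
        """Record an observation; its origin (offline or fresh) is irrelevant."""
        self.xs.append(x)
        self.ys.append(y)

    def _posterior(self, x):
        """GP posterior mean and standard deviation at x, using centered targets."""
        mean_y = sum(self.ys) / len(self.ys)
        K = [[rbf(a, b) + (self.noise if i == j else 0.0)
              for j, b in enumerate(self.xs)]
             for i, a in enumerate(self.xs)]
        k = [rbf(x, a) for a in self.xs]
        alpha = solve(K, [y - mean_y for y in self.ys])
        v = solve(K, k)
        mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
        var = max(rbf(x, x) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
        return mu, math.sqrt(var)

    def ask(self):
        """Suggest the grid point with the lowest lower confidence bound."""
        if not self.xs:  # cold start: no data, so fall back to the midpoint
            return self.grid[len(self.grid) // 2]
        def lcb(x):
            mu, sd = self._posterior(x)
            return mu - self.kappa * sd
        return min(self.grid, key=lcb)

def objective(x):
    """Hypothetical objective with its minimum at x = 0.3."""
    return (x - 0.3) ** 2

# Offline (hyperparameter, objective value) pairs collected beforehand.
offline = [(0.0, objective(0.0)), (0.5, objective(0.5)), (1.0, objective(1.0))]

opt = AskTellOptimizer(0.0, 1.0)
for x, y in offline:          # warm start: feed the existing data first
    opt.tell(x, y)

for _ in range(5):            # then alternate ask / evaluate / tell
    x = opt.ask()
    opt.tell(x, objective(x))

best_x, best_y = min(zip(opt.xs, opt.ys), key=lambda p: p[1])
```

The point of the design is that `tell` is the only way data enters the model, so pre-existing measurements and fresh evaluations are handled identically. A library that exposes a comparable ask/tell interface can be warm-started in the same way.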

zaouk commented 5 years ago

Thanks @mgelbart for your reply! Actually, I've coded a workaround that supports training on offline data (instead of starting with no training points for the GP) by modifying the main.py file. My version is at https://github.com/zaouk/Spearmint/blob/master/spearmint/mymain.py. The trick is to append the offline data to task_group.input and task_group.values every time the function load_task_group is called. (https://github.com/zaouk/Spearmint/blob/master/spearmint/mymain.py#L487-L517)
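The idea behind this workaround can be sketched roughly as follows. This is not the code from the linked file and does not use Spearmint's real classes: `TaskGroupStub`, `load_task_group_stub`, and `load_with_offline` are hypothetical stand-ins, the attribute names on the stub are illustrative, and the sketch assumes the loader rebuilds the task group from the database on every call.

```python
class TaskGroupStub:
    """Hypothetical stand-in for the object returned by load_task_group."""

    def __init__(self, inputs, values):
        self.inputs = list(inputs)   # evaluated hyperparameter settings
        self.values = list(values)   # matching objective values

def load_task_group_stub(db_rows):
    """Hypothetical loader: rebuilds the group from stored (input, value) rows."""
    return TaskGroupStub([row[0] for row in db_rows],
                         [row[1] for row in db_rows])

def load_with_offline(db_rows, offline_rows):
    """Load the group, then append the offline observations to it."""
    group = load_task_group_stub(db_rows)
    for x, y in offline_rows:
        group.inputs.append(x)
        group.values.append(y)
    return group

# Offline data collected before the optimizer was ever run.
offline_rows = [([0.1, 10], 0.82), ([0.5, 50], 0.47)]

# First call: the database is still empty, so the model sees only offline data.
first = load_with_offline([], offline_rows)

# Later call: one new evaluation is stored; offline data is appended once more
# to a freshly rebuilt group, so it is not duplicated across calls.
second = load_with_offline([([0.3, 30], 0.55)], offline_rows)
```

Under the stated assumption that the group is rebuilt from scratch on each load, appending on every call keeps the offline points present without accumulating copies. If the loader instead cached and mutated a single object, the append would need a guard against duplicates.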