Is it possible to set initial values to evaluate using hyperopt's TPE? The idea is to feed the algorithm with the baseline's parameters and see if it can improve them.
I tend to try hp.normal() when I already have a set of hyperparameters that work well - what sort of improvement are you thinking of?
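For concreteness, a minimal sketch of that approach (the label, center, and width below are made up for illustration):

from hyperopt import hp

# center the search on an already-known good value by putting a normal prior around it
known_good_x = 3.2                         # hypothetical hyperparameter value that already works well
space = hp.normal('x', known_good_x, 1.0)  # mu = known-good value, sigma = 1.0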
@dma092 Yes it is. I think you mean this here:
points_to_evaluate : list, default None
Only works if trials=None. If points_to_evaluate equals None then the
trials are evaluated normally. If list of dicts is passed then
given points are evaluated before optimisation starts, so the overall
number of optimisation steps is len(points_to_evaluate) + max_evals.
Elements of this list must be in a form of a dictionary with variable
names as keys and variable values as dict values. Example
points_to_evaluate value is [{'x': 0.0, 'y': 0.0}, {'x': 1.0, 'y': 2.0}]
From here: https://github.com/hyperopt/hyperopt/blob/master/hyperopt/fmin.py#L276
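Based on that docstring, a minimal usage sketch (the objective and search space here are made up):

from hyperopt import fmin, tpe, hp

def objective(args):
    # toy objective; replace with the real function to minimize
    return args['x'] ** 2 + args['y'] ** 2

best = fmin(
    fn=objective,
    space={'x': hp.uniform('x', -5, 5), 'y': hp.uniform('y', -5, 5)},
    algo=tpe.suggest,
    max_evals=50,   # per the docstring, total steps = len(points_to_evaluate) + max_evals = 52
    points_to_evaluate=[{'x': 0.0, 'y': 0.0}, {'x': 1.0, 'y': 2.0}],
)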
@jarednielsen The problem with using hp.normal() is that the search space is now unconstrained. I want to be able to add some prior information for any sampling scheme (I am not sure if this is what the original commenter intended, but this seems like a useful use-case to me).
Something like hp.uniform('x', 0, 10, bootstrap=[(2, 45), (8, 56)]), where a tuple of the form (p, q) in the bootstrap parameter denotes f(p) = q. Here f is the function to be minimized.
I see two uses for this; one is when f is expensive to compute.
NB: Looking at the link @PhilipMay shared, it looks like part of what I suggest above should be possible, albeit using a different parameter - points_to_evaluate. You can ask the optimizer to start by evaluating f on the specific set of points provided in points_to_evaluate. I am not sure why trials=None is needed for this, though.
Looking through the source code, you can pass in a Trials object with past runs of the code. Extending @abhishek-ghose's example, it would be a bit more work than the simple tuple (you'd have to set up the JSON config object), but that would allow setting an arbitrary prior for tpe.suggest. I think we've discovered that this feature is already built-in :)
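One way to reuse past runs without hand-building the trial documents is simply to pass the same Trials object to successive fmin calls; a minimal warm-start sketch (the objective and bounds here are made up):

from hyperopt import fmin, tpe, hp, Trials

def objective(x):
    return x ** 2   # toy objective

space = hp.uniform('x', -10, 10)
trials = Trials()

# first run: 20 evaluations populate the Trials object
fmin(objective, space=space, algo=tpe.suggest, max_evals=20, trials=trials)

# second run: the same Trials object is passed back in, so tpe.suggest's prior
# now includes the 20 earlier results; fmin continues up to max_evals=40 in total
best = fmin(objective, space=space, algo=tpe.suggest, max_evals=40, trials=trials)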
I looked at this a bit more and I think the function generate_trials_to_calculate() helps to make initialization work with a Trials object. It is available from v0.1.1 onwards. I really don't want to do away with Trials, which seemingly I would need to do if I used points_to_evaluate (see my previous comment); it's very handy for understanding what the optimizer does.
Here's some sample code that shows how to use generate_trials_to_calculate(). My objective function objective() just squares a scalar input, which we want to minimize within some range of the input parameter [LOW, HIGH]. The function bootstrap() takes as input init_size - the number of points you want the optimizer to evaluate before it begins to call suggest(). An optional boolean parameter plot_bootstrap_points decides whether these bootstrap points are plotted on the final curve; the plot gets cluttered, so I prefer setting this flag to False.
The bootstrap points are uniformly sampled in the range [LOW, HIGH].
import numpy as np
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from hyperopt.fmin import generate_trials_to_calculate
from matplotlib import pyplot as plt
import seaborn as sns; sns.set()

LOW, HIGH = -10, 10


def objective(x):
    return {
        'loss': x ** 2,
        'status': STATUS_OK,
    }


def bootstrap(init_size, plot_bootstrap_points=False):
    trials = Trials()
    if init_size > 0:
        # as initial values pick points uniformly on the x-axis; we need these as dicts
        init_vals = [{'x': i} for i in np.linspace(LOW, HIGH, init_size)]
        # this generates a trials object that can be used to bootstrap computation for the optimizer
        trials = generate_trials_to_calculate(init_vals)

    # since we have a bootstrapped trials object, the number of function evals would be init_size + max_evals
    best = fmin(objective,
                space=hp.uniform('x', LOW, HIGH),
                algo=tpe.suggest,
                max_evals=100,
                trials=trials)
    print("Best x: %0.04f" % (best["x"],))

    # get the bootstrap data for plotting
    bootstrap_points = [t['misc']['vals']['x'][0] for t in trials][:init_size]
    bootstrap_losses = trials.losses()[:init_size]
    bootstrap_plot_data = np.asarray(sorted(zip(bootstrap_points, bootstrap_losses)))

    # get data for points not belonging to the bootstrap
    tpe_suggest_points = [t['misc']['vals']['x'][0] for t in trials][init_size:]
    tpe_suggest_losses = trials.losses()[init_size:]
    tpe_suggest_plot_data = np.asarray(sorted(zip(tpe_suggest_points, tpe_suggest_losses)))

    fig = plt.figure()
    ax = fig.add_subplot(111)

    # plot the objective fn
    temp = np.linspace(LOW, HIGH, 1000)
    ax.plot(temp, [objective(x)['loss'] for x in temp])

    # plot the bootstrap values
    if plot_bootstrap_points and init_size > 0:
        ax.plot(bootstrap_plot_data[:, 0], bootstrap_plot_data[:, 1], marker='o', ls='', color='gray')

    # plot the new trial values
    ax.plot(tpe_suggest_plot_data[:, 0], tpe_suggest_plot_data[:, 1], 'ro')

    # extend the y axis so that nothing gets cut off
    ax.set_ylim(bottom=-3)
    ax.set_title('bootstrap size=%d' % (init_size,))
    plt.show()


if __name__ == "__main__":
    bootstrap(10, False)
A sample call to bootstrap() is shown at the end of the script.
One way to see that this works is to try out different values for init_size: if the bootstrap sample is small, we would expect the optimizer to behave almost normally, exploring most of the space; if the sample is large, we would expect the optimizer to know which regions are most promising and limit its search to those regions.
The output plot shows the points the optimizer tests in red. I tried init_size values of 10, 100 and 1000, with max_evals held fixed at 100. We see that larger values of init_size do indeed focus the optimizer's search near the minimum.
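A small driver along those lines, which could replace the __main__ block above (my addition, not from the original comment):

if __name__ == "__main__":
    # compare the three bootstrap sizes mentioned above;
    # max_evals stays fixed at 100 inside bootstrap()
    for n in (10, 100, 1000):
        bootstrap(n, plot_bootstrap_points=False)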
Many thanks to Ethan Brown for introducing me to generate_trials_to_calculate() in this thread.
Thanks for this detailed solution @abhishek-ghose, it looks like you've set out to solve a problem I've been thinking about for a while now!
That said, I'm unable to get it working myself; I wonder if you might have a thought as to why. I suspect I'm not understanding the type of input that generate_trials_to_calculate should be receiving. I'm hoping to pass a list of dicts of good "starting point" parameters for tpe.suggest to work from.
Here are the relevant bits of my code:
params = {
    'param_1': hp.quniform('param_1', 20, 160, 20),
    'param_2': hp.quniform('param_2', 5, 100, 5),
    'param_3': hp.quniform('param_3', 1, 3, 1),
    'param_4': hp.quniform('param_4', 0.2, 3.0, 0.2),
    'param_5': hp.choice('param_5', [0, 1]),
    'param_6': hp.choice('param_6', [0, 1]),
    'param_7': hp.choice('param_7', [0, 1]),
    'param_8': hp.choice('param_8', [0, 1]),
    'param_9': hp.choice('param_9', [0.6, 0.7, 0.8])
}

print(init_vals[0])  # init_vals is a list of dicts of hyperparameters I like, see output below

trials = generate_trials_to_calculate(init_vals)
print(trials)

best = fmin(
    fn=run_the_strategy,
    space=params,
    algo=tpe.suggest,
    max_evals=100,
    trials=trials,
    points_to_evaluate=init_vals
)
And my output/error:
{'param_1': 140.0, 'param_9': 0.6, 'param_8': 0, 'param_5': 1, 'param_6': 1, 'param_7': 0, 'param_2': 90.0, 'param_3': 1.0, 'param_4': 2.0}
<hyperopt.base.Trials object at 0x113175828>
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/pyll/base.py", line 868, in rec_eval
    int(switch_i)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'type'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "optimizing-the-strategy.py", line 609, in <module>
    points_to_evaluate=init_vals
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 367, in fmin
    return_argmin=return_argmin,
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/base.py", line 635, in fmin
    return_argmin=return_argmin)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 385, in fmin
    rval.exhaust()
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 244, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 218, in run
    self.serial_evaluate()
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 137, in serial_evaluate
    result = self.domain.evaluate(spec, ctrl)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/base.py", line 839, in evaluate
    print_node_on_error=self.rec_eval_print_node_on_error)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/pyll/base.py", line 870, in rec_eval
    raise TypeError('switch argument was', switch_i)
TypeError: ('switch argument was', <class 'hyperopt.pyll.base.GarbageCollected'>)
@trevorwelch You're welcome!
I think the errors you see are probably a result of:
1. the space - in the modified version below I pass params as a list instead of a dict.
2. initializing a choice space works only with the index and not with the choice values!
Here's a modified version of your code:
from hyperopt import fmin, tpe, hp, STATUS_OK
from hyperopt.fmin import generate_trials_to_calculate


def run_the_strategy(args):
    print("Current arguments:", args)
    return {
        'loss': sum([i * i for i in args]),
        'status': STATUS_OK
    }


params = [
    hp.quniform('param_1', 20, 160, 20),
    hp.quniform('param_2', 5, 100, 5),
    hp.quniform('param_3', 1, 3, 1),
    hp.quniform('param_4', 0.2, 3.0, 0.2),
    hp.choice('param_5', [0, 1]),
    hp.choice('param_6', [0, 1]),
    hp.choice('param_7', [0, 1]),
    hp.choice('param_8', [0, 1]),
    hp.choice('param_9', [0.6, 0.7, 0.8])
]

init_vals = [{'param_1': 140.0, 'param_2': 90.0, 'param_3': 1.0,
              'param_4': 2.0, 'param_5': 1, 'param_6': 1,
              'param_7': 0, 'param_8': 0, 'param_9': 0}]

trials = generate_trials_to_calculate(init_vals)

best = fmin(
    fn=run_the_strategy,
    space=params,
    algo=tpe.suggest,
    max_evals=10,
    trials=trials
)
This runs, and gives me the following output (from the objective function):
Current arguments: (140.0, 90.0, 1.0, 2.0, 1, 1, 0, 0, 0.6)
Current arguments: (80.0, 70.0, 3.0, 3.0, 0, 1, 1, 0, 0.7)
Current arguments: (120.0, 90.0, 2.0, 0.6000000000000001, 1, 1, 1, 1, 0.8)
Current arguments: (140.0, 20.0, 2.0, 2.4000000000000004, 0, 0, 1, 0, 0.8)
Current arguments: (120.0, 65.0, 1.0, 0.2, 1, 0, 0, 0, 0.7)
Current arguments: (20.0, 45.0, 1.0, 0.6000000000000001, 1, 1, 1, 1, 0.8)
Current arguments: (40.0, 10.0, 2.0, 2.4000000000000004, 0, 1, 1, 1, 0.8)
Current arguments: (40.0, 25.0, 2.0, 0.6000000000000001, 1, 1, 1, 1, 0.8)
Current arguments: (60.0, 10.0, 2.0, 3.0, 1, 0, 0, 0, 0.6)
Current arguments: (120.0, 55.0, 2.0, 2.8000000000000003, 1, 1, 1, 1, 0.7)
Current arguments: (160.0, 10.0, 3.0, 1.8, 1, 1, 0, 0, 0.8)
As you would note:
- Although max_evals=10, there are 11 "Current arguments" lines. The first line shows the values we initialized with (except param_9 - see the next point), so we know the initial value in trials is indeed being used.
- param_9 is initialized with 0, which is not a legal value; however, the first printed line shows param_9 as 0.6! It seems initialization works only with the index. If you try initializing with 'param_9': 0.6, the code throws an error.
Btw, you don't need points_to_evaluate since you're initializing trials. I accidentally left that in my code - I'll remove it [done].
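To make the param_9 point concrete, a minimal sketch (my addition) of an index-based initialization: param_9 was defined as hp.choice('param_9', [0.6, 0.7, 0.8]), so an index of 1 would presumably select 0.7.

# init values for hp.choice parameters are indices into the options list,
# not the option values themselves (param_9 options: [0.6, 0.7, 0.8])
init_vals = [{'param_1': 140.0, 'param_2': 90.0, 'param_3': 1.0,
              'param_4': 2.0, 'param_5': 1, 'param_6': 1,
              'param_7': 0, 'param_8': 0,
              'param_9': 1}]   # index 1 -> 0.7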
Thanks again for your very helpful response. With some tweaks I got it running on my actual code 🎉
Another interesting thing to note: when you bootstrap a parameter space like this, the parameters you pass to the function you're hyperopt'ing will now be a tuple instead of a dict (with my use of hyperopt it's always been a dict, although perhaps this isn't always the case?). For example, where previously I could use params['param_1'] inside my function to access the parameter value 160, now I need to keep track of indices and param names and use params[0] to access the value of param_1.
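One way to keep name-based access with a list-style space - a small sketch on my part, assuming the parameter order shown above - is to zip the tuple back into a dict inside the objective:

from hyperopt import STATUS_OK

PARAM_NAMES = ['param_1', 'param_2', 'param_3', 'param_4', 'param_5',
               'param_6', 'param_7', 'param_8', 'param_9']

def run_the_strategy(args):
    # args arrives as a tuple (one value per entry in the list-style space);
    # zip it back into a dict so p['param_1'] etc. works again
    p = dict(zip(PARAM_NAMES, args))
    return {'loss': sum(v * v for v in p.values()), 'status': STATUS_OK}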
This bit is weird - I think initializing a choice space works only with the index and not with the choice values!
Yes indeed; in fact, another error occurs related to the use of choice when hyperopt "switches over" from the initialized params to the search space in your evals. My script was throwing this error after the bootstrap evals had all run:
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 390, in fmin
    show_progressbar=show_progressbar,
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/base.py", line 639, in fmin
    show_progressbar=show_progressbar)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 409, in fmin
    rval.exhaust()
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 262, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/fmin.py", line 211, in run
    self.rstate.randint(2 ** 31 - 1))
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/tpe.py", line 900, in suggest
    print_node_on_error=False)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/pyll/base.py", line 913, in rec_eval
    rval = scope._impls[node.name](*args, **kwargs)
  File "/anaconda3/lib/python3.6/site-packages/hyperopt/pyll/base.py", line 1076, in bincount
    return np.bincount(x, weights, minlength)
TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'
I solved this by swapping all of my choice for quniform. That fixes the error, and you end up with the same results. It seems like the best thing is to avoid the use of choice if you're going to initialize a parameter space via generate_trials_to_calculate.
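A minimal sketch of what that choice-to-quniform swap might look like for the space above (the q values are my assumption; note quniform returns floats, so the objective may need to cast them):

from hyperopt import hp

# quniform('param_5', 0, 1, 1) samples from {0.0, 1.0}; unlike hp.choice,
# the sampled/initialized value is the value itself, not an index
params = [
    hp.quniform('param_1', 20, 160, 20),
    hp.quniform('param_2', 5, 100, 5),
    hp.quniform('param_3', 1, 3, 1),
    hp.quniform('param_4', 0.2, 3.0, 0.2),
    hp.quniform('param_5', 0, 1, 1),        # was hp.choice('param_5', [0, 1])
    hp.quniform('param_6', 0, 1, 1),
    hp.quniform('param_7', 0, 1, 1),
    hp.quniform('param_8', 0, 1, 1),
    hp.quniform('param_9', 0.6, 0.8, 0.1),  # was hp.choice('param_9', [0.6, 0.7, 0.8])
]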
@trevorwelch Great that it works for you now! I have been using Python 2.7 - I forgot to mention it before - but I am not sure if that changes the argument-passing semantics.
On a different note, choice and quniform satisfy quite different distributional needs IMO. quniform is ideal when the input comes from a discrete space where smoothness assumptions still hold, e.g. you could assume that in the space [0, 1, 2, ..., 98, 99, 100], the function doing well at {98, 99} implies it is likely to do well at 97 too. With choice you learn the utility of each value in your space independently over multiple evals - in theory both could give you the same solution, but for a somewhat smooth function choice would take longer to get to optimality.
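A small illustration of that distinction, using a made-up integer parameter:

from hyperopt import hp

# two ways to express the discrete space [0, 1, ..., 100]:
# quniform lets TPE exploit smoothness (doing well at {98, 99} hints that 97
# is also good), while choice treats every value as an unrelated category
smooth_space = hp.quniform('n_smooth', 0, 100, 1)
categorical_space = hp.choice('n_categorical', list(range(101)))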
For anyone who met the same issue as @trevorwelch and I did (TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'): this error probably occurs when you have the kind of initialization shown below, where you give a float value such as 0.5 to a parameter that is defined with choice.
The way a choice parameter works is that the given init value is an index into the list of possible values (here [2, 5]), so you can only give an int (an index is, of course, an integer). The proper way to set an initial value for a choice parameter is to give the desired index, for example init_vals = [{ .... 'param_8': 1}]. This way, the initial value of param_8 will be set to [2, 5][1], which is 5.
Another minor issue is that you should ensure the names match: for example, if in params you define 'num_estimators': hp.randint('num_estimators', 1000) + 500, you should use exactly the same name in init_vals - you cannot use, say, n_estimators when initializing it.
Hope this helps.
Example:

params = [
    hp.quniform('param_1', 20, 160, 20), ...
    hp.choice('param_8', [2, 5])  # notice that param_8 is defined with `choice`
]

init_vals = [{'param_1': 140.0, ...,
              'param_8': 0.5,  # ----> error thrown here; should be an int (an index)
              }]

trials = generate_trials_to_calculate(init_vals)
This issue has been marked as stale because it has been open 120 days with no activity. Remove the stale label or comment or this will be closed in 30 days.