claesenm / optunity

optimization routines for hyperparameter tuning
http://www.optunity.net

fails with "a float is required" #75

Open simonm3 opened 8 years ago

simonm3 commented 8 years ago

I am optimising an xgboost classifier and it works fine for about 20 iterations. Then it fails with "a float is required" on line 300 of `search_spaces.py`.

I printed `k` and `v` in your source, and on each iteration it shows `k="booster"` with `v` between 0 and 2. It looks like it chooses `gblinear` if v < 1 and `gbtree` otherwise. However, on the final iteration `v` is the string `"gbtree"` rather than a number, and the call to `math.floor` on it fails.

I simplified the search space to `{'gamma': [0.0, 1], 'booster': {'gblinear': None, 'gbtree': None}}`.
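For illustration, here is a minimal sketch (not optunity's actual code; option names taken from the search space above) of how a numeric sample could be floored into a categorical choice, which is what line 303 of `search_spaces.py` appears to do:

```python
import math

# Illustrative sketch only: a float v in [0, 2) selects one of two
# categorical options via math.floor. Passing an already-decoded string
# like "gbtree" instead of a number raises the reported TypeError
# ("a float is required" under Python 2).
OPTIONS = ["gblinear", "gbtree"]

def decode_choice(v):
    """Map a numeric sample v in [0, len(OPTIONS)) to an option name."""
    return OPTIONS[int(math.floor(v))]

print(decode_choice(0.4))  # gblinear
print(decode_choice(1.7))  # gbtree
```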

simonm3 commented 8 years ago

Here is the trace:

```
C:\Users\s\Anaconda3\lib\site-packages\optunity\api.py in maximize_structured(f, search_space, num_evals, pmap)
    368     solver = make_solver(**suggestion)
    369     solution, details = optimize(solver, f, maximize=True, max_evals=num_evals,
--> 370                                  pmap=pmap, decoder=tree.decode)
    371     return solution, details, suggestion
    372

C:\Users\s\Anaconda3\lib\site-packages\optunity\api.py in optimize(solver, func, maximize, max_evals, pmap, decoder)
    256
    257     # TODO why is this necessary?
--> 258     if decoder: solution = decoder(solution)
    259
    260     optimum = f.call_log.get(**solution)

C:\Users\s\Anaconda3\lib\site-packages\optunity\search_spaces.py in decode(self, vd)
    301             print(k)
    302             print(v)
--> 303             option_idx = int(math.floor(v))
    304             option = content[option_idx]
    305             result[DELIM.join(keylist[len(currently_decoding_nested):])] = option

TypeError: a float is required
```

simonm3 commented 8 years ago

OK, found the problem. Two optimising code paths are reached from `api.py`'s `optimize`: the one at line 244 returns a solution containing a number, while the other returns a solution containing the text of the categorical option. So when `decode` is called, it sometimes works and sometimes fails. Also, when it fails, not only are the parameters wrong but the optimum is reported as `None`.
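One defensive workaround for the `decode` side (a sketch of my own, not a fix from the maintainer; `content` stands in for the option list in `search_spaces.py`) would be to pass through values that are already strings:

```python
import math

# Hypothetical guard: if the solver already returned the decoded option
# name (a string) rather than a numeric index, return it unchanged
# instead of calling math.floor on it.
content = ["gblinear", "gbtree"]

def decode_value(v):
    if isinstance(v, str):  # already decoded by the other code path
        return v
    return content[int(math.floor(v))]

print(decode_value(0.2))       # gblinear
print(decode_value("gbtree"))  # gbtree
```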

Here is a solution which I think works. I will run it on my data and post back if I find any problems:

```python
# replacement body for optunity/api.py optimize(), after the solver is built
time = timeit.default_timer()
try:
    solution, report = solver.optimize(f, maximize, pmap=pmap)
except fun.MaximumEvaluationsException:
    # early stopping because the maximum number of evaluations was reached;
    # the solution is recovered from the call log below
    report = None
time = timeit.default_timer() - time

# always take the optimum from the call log, so the returned solution and
# optimum are consistent regardless of which solver code path ran
call_dict = f.call_log.to_dict()
if maximize:
    optimum = max(call_dict["values"])
else:
    optimum = min(call_dict["values"])
index = call_dict["values"].index(optimum)
solution = {k: v[index] for k, v in call_dict["args"].items()}

num_evals += len(f.call_log)

# use namedtuple to enforce uniformity in case of changes
stats = optimize_stats(num_evals, time)

return solution, optimize_results(optimum, stats._asdict(),
                                  call_dict, report)
```
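A standalone illustration (with made-up call-log data; `call_dict` mimics the shape of `f.call_log.to_dict()`) of the lookup pattern the patch uses to rebuild the best hyperparameters from the call log:

```python
import operator

# Toy stand-in for f.call_log.to_dict(): parallel lists of scores and args.
call_dict = {
    "values": [0.71, 0.83, 0.79],
    "args": {"gamma": [0.1, 0.5, 0.9],
             "booster": ["gblinear", "gbtree", "gbtree"]},
}

maximize = True
pick = max if maximize else min
index, optimum = pick(enumerate(call_dict["values"]),
                      key=operator.itemgetter(1))
solution = {k: v[index] for k, v in call_dict["args"].items()}

print(optimum)   # 0.83
print(solution)  # {'gamma': 0.5, 'booster': 'gbtree'}
```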
claesenm commented 8 years ago

Thanks, I will check it out over the next few days!