automl / RoBO

RoBO: a Robust Bayesian Optimization framework
BSD 3-Clause "New" or "Revised" License

Error: lmb should not be infinite #45

Closed liamcli closed 7 years ago

liamcli commented 7 years ago

I am using Fabolas to optimize the hyperparameters of a 3-layer CNN on CIFAR-10. In one of my runs, I encountered "lmb should not be infinite" in the routine that optimizes the acquisition function. Unfortunately, this error occurs only occasionally and may be hard to reproduce.

I'm hoping to have the code I used to run the experiments publicly available in the next few days. This will provide information on the search space and model I'm optimizing.

Traceback (most recent call last):
  File "fabolas_wrapper.py", line 163, in <module>
    main(sys.argv[1:])
  File "fabolas_wrapper.py", line 127, in main
    searcher.run()
  File "fabolas_wrapper.py", line 89, in run
    fabolas_fmin(self.objective_function,X_lower,X_upper,5030000/100)
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/fmin.py", line 273, in fabolas_fmin
    x_best = bo.run(num_iterations)
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/solver/fabolas.py", line 218, in run
    new_x = self.choose_next(self.X, self.Y, self.C, do_optimize)
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/solver/fabolas.py", line 332, in choose_next
    x = self.maximize_func.maximize()
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/maximizers/cmaes.py", line 82, in maximize
    "maxfevals": self.n_func_evals})
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/cma.py", line 5511, in fmin
    aggregation=np.median) # treats NaN with resampling
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/cma.py", line 3480, in ask_and_eval
    f = func(x, args) if kappa == 1 else \
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/maximizers/cmaes.py", line 52, in _l
    return -acq_f(x, *args, **kwargs)[0]
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/acquisition/base_acquisition.py", line 79, in __call__
    acq = [self.compute(x[np.newaxis, :], derivative) for x in X]
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/acquisition/integrated_acquisition.py", line 126, in compute
    derivative=derivative)
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/acquisition/base_acquisition.py", line 79, in __call__
    acq = [self.compute(x[np.newaxis, :], derivative) for x in X]
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/acquisition/information_gain_per_unit_cost.py", line 112, in compute
    derivative=derivative)
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/acquisition/information_gain.py", line 125, in compute
    acq = self.dh_fun(X, derivative=False)
  File "/home/lisha/school/Projects/fabolas/fabo/lib/python2.7/site-packages/robo-0.1-py2.7.egg/robo/acquisition/information_gain.py", line 208, in dh_fun
    "lmb should not be infinite.")

aaronkl commented 7 years ago

Hmm, it's a bit hard to say what happened here just by looking at the error message. Could you turn on logging and upload the logging output of Fabolas? That would help to see what goes wrong here. You can simply turn on logging by adding the following two lines at the top of your Python script:

import logging
logging.basicConfig(level=logging.DEBUG)
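If the debug output needs to be uploaded afterwards, it can also be written to a file. The following is only a minimal sketch using the standard library: the helper name `enable_debug_logging` and the default file name `fabolas_debug.log` are hypothetical and not part of RoBO.

```python
import logging


def enable_debug_logging(log_path="fabolas_debug.log"):
    """Send DEBUG-level records from all loggers to the console and to log_path.

    Both the function name and the default path are illustrative assumptions.
    """
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)

    formatter = logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s: %(message)s"
    )

    # File handler: overwrites any previous log at log_path on each run.
    file_handler = logging.FileHandler(log_path, mode="w")
    file_handler.setFormatter(formatter)
    root.addHandler(file_handler)

    # Console handler: keeps the messages visible while the run is in progress.
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    root.addHandler(console_handler)

    return file_handler


# Usage (call once at the top of the script, before starting the optimizer):
# enable_debug_logging("trial_debug.log")
```

Because the handlers are attached to the root logger, messages from any library that uses the standard `logging` module should end up in the same file.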

cheers Aaron

liamcli commented 7 years ago

Here's the log file; I think it was run with the lowest level of logging: trial9000_log.txt

liamcli commented 7 years ago

I've made the repos I used to run this experiment public. I ran this using the Fabolas wrapper in https://github.com/lishal/hyperband_benchmarks. If you follow the directions and set up the environment with the indicated versions of CUDA, cuDNN, and Caffe, you can replicate this error by running python fabolas_wrapper.py -m cifar10 -i (data dir) -o (output dir) -s 9000 -d (gpu device number).

aaronkl commented 7 years ago

Hi, I am travelling right now, but I will have a closer look as soon as I am back. In the meantime, could you check out the current master branch and try again with the new version of Fabolas? Maybe that already solves the issue.

aaronkl commented 7 years ago

It seems that something was broken between george and the latest numpy version. Pulling our latest george fork and the latest RoBO should fix the problem.
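To see which versions are actually installed before and after updating, a small check like the one below may help. It is a sketch under assumptions: it relies on `importlib.metadata` (Python 3.8+, so it would not run on the Python 2.7 environment shown in the traceback), and the distribution names `numpy`, `george`, and `robo` are assumed to match what is installed. The helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages=("numpy", "george", "robo")):
    """Return a dict mapping each distribution name to its version string.

    Packages that are not installed map to None. The default names are
    assumptions about how the distributions are registered.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Usage:
# for name, version in installed_versions().items():
#     print(name, version if version is not None else "not installed")
```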