Yelp / MOE

A global, black box optimization engine for real world metric optimization.

[Bug] python/python_version/optimization.py #443

Closed yf275 closed 9 years ago

yf275 commented 9 years ago

A bug exists in python/python_version/optimization.py at Lines 655 and 656:

shaped_point = point.reshape(self._num_points, self.domain.dim)
self.objective_function.current_point = shaped_point

where point is a numpy.ndarray of shape (self.domain.dim,) and self._num_points is defined to be 1. Line 656 triggers Line 233 in python/python_version/log_likelihood.py:

current_point = hyperparameters

which in turn invokes set_hyperparameters() at Line 223 of the same file and then set_hyperparameters() at Line 63 of python/python_version/covariance.py. This sets self._hyperparameters to shaped_point, whose shape is (1, self.domain.dim). Note that self.domain.dim is equal to (1 + # of columns of input data). The function set_hyperparameters() in covariance.py also defines:

self._lengths_sq = numpy.copy(self._hyperparameters[1:])

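Because self._hyperparameters has shape (1, self.domain.dim), the slice [1:] selects the rows after the first, of which there are none. This means self._lengths_sq would be empty, and a ValueError is raised when Line 99 of the same file is executed: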

temp /= self._lengths_sq
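The failure described above can be reproduced with plain numpy. This is a minimal sketch, not MOE code: the dimension, the values, and the helper divide_in_place are hypothetical stand-ins for self.domain.dim, the hyperparameters, and the in-place division at Line 99.

```python
import numpy

dim = 3  # hypothetical: 1 signal variance + 2 length scales
point = numpy.array([1.0, 0.5, 2.0])  # shape (dim,)

# Reshaping to (1, dim) makes [1:] slice along rows, leaving nothing.
shaped_point = point.reshape(1, dim)
lengths_sq_2d = numpy.copy(shaped_point[1:])  # shape (0, 3)

# Keeping the 1-D shape makes [1:] drop only the first entry.
lengths_sq_1d = numpy.copy(point[1:])  # shape (2,)


def divide_in_place(values, lengths_sq):
    """Mimic `temp /= self._lengths_sq` on a copy of `values`."""
    values = numpy.copy(values)
    values /= lengths_sq
    return values


temp = numpy.array([0.3, 0.7])  # stand-in for a difference of two spatial points

try:
    divide_in_place(temp, lengths_sq_2d)
except ValueError as error:
    print("2-D hyperparameters fail:", error)

print("1-D hyperparameters work:", divide_in_place(temp, lengths_sq_1d))
```

Whether MOE ever reaches this state depends on what self.domain.dim is for the optimizer in question, which the replies below address.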

I propose that we delete Line 655 of python/python_version/optimization.py and change Line 656 of the same file to:

self.objective_function.current_point = numpy.copy(point)

This will keep the shape of self._hyperparameters as (1 + # of columns of input data, ) and fix the bug.

jialeiwang commented 9 years ago

I do not think self.domain.dim in optimization.py is the dimension of the problem here. When you use it to optimize hyperparameters, it is the dimension of the hyperparameter space. Note that the optimization interface is independent of everything else, so domain is not necessarily the domain of the global optimization objective; it could be the domain of the hyperparameter space or anything else.

jialeiwang commented 9 years ago

This is not a bug.

suntzu86 commented 9 years ago

Sorry I've been gone for so long! See: #448 for details.

I agree with @jialeiwang here. @yf275, do you have a stacktrace from a failure? As jialei pointed out, the terms "domain" and "point" are overloaded. In the expected improvement setting, they're the physical space you're optimizing in. In the hyperparameter/log likelihood setting, they're the hyperparameter space (e.g., 1 + spatial_dim).

iirc, numpy.copy(point) on the line you referenced doesn't work b/c sometimes we have to flatten the point to match the inputs that some of scipy's optimizers expect. I believe COBYLA has this issue.
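The flattening concern can be illustrated with numpy alone. This is a sketch under assumptions: num_points and dim are hypothetical stand-ins for self._num_points and self.domain.dim, and flat stands in for the 1-D vector an optimizer that only accepts flat inputs would hand back.

```python
import numpy

num_points, dim = 2, 3  # hypothetical sizes
points = numpy.arange(6.0).reshape(num_points, dim)

# An optimizer that only accepts 1-D inputs would work on the flattened vector.
flat = points.flatten()  # shape (6,)

# reshape restores the (num_points, dim) layout the objective expects.
restored = flat.reshape(num_points, dim)

# numpy.copy keeps whatever shape it was given, so the layout stays flat.
copied = numpy.copy(flat)

# With a single point, reshape turns (dim,) into (1, dim).
single = numpy.array([1.0, 0.5, 2.0]).reshape(1, dim)

print(restored.shape, copied.shape, single.shape)
```

In the single-point case the reshape changes a (dim,) array into a (1, dim) array, which is the shape change the original report is about.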