Yelp / MOE

A global, black box optimization engine for real world metric optimization.

(Expected improvement not finite. Variance matrix may be singular) when num_to_sample > 1 #453

rkrav opened this issue 8 years ago

rkrav commented 8 years ago

I'm posting the following JSON to gp/next_points/epi:

{"domain_info": {"dim": 1, "domain_bounds": [{"max": 1.0, "min": 0.0}]},
 "gp_historical_info": {"points_sampled": [
     {"value_var": 0.01, "value": 0.1, "point": [0.0]},
     {"value_var": 0.01, "value": 0.2, "point": [1.0]}]},
 "num_to_sample": 2}

and it returns an error of

Expected improvement not finite. Variance matrix may be singular.

Is this expected behavior?
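
For reference, a minimal script that reproduces the request above (this assumes a MOE REST server listening on localhost:6543; adjust the host and port to your setup):

```python
import requests

# Reproduction of the failing request (the host/port are assumptions).
payload = {
    "domain_info": {"dim": 1, "domain_bounds": [{"max": 1.0, "min": 0.0}]},
    "gp_historical_info": {"points_sampled": [
        {"value_var": 0.01, "value": 0.1, "point": [0.0]},
        {"value_var": 0.01, "value": 0.2, "point": [1.0]},
    ]},
    "num_to_sample": 2,
}
resp = requests.post("http://localhost:6543/gp/next_points/epi", json=payload)
print(resp.text)  # "Expected improvement not finite. Variance matrix may be singular."
```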

suntzu86 commented 8 years ago

I believe if you don't specify hyperparameters (I'm starting to think we really should not have allowed users to omit them), they all default to 1.0.

Your domain length is 1.0, so that means the GP predictor will look roughly like a straight line connecting the end points. When you try to find the best expected improvement for 2 simultaneous points, they'll both wander out of bounds (and get snapped back in bounds), which can cause the exact-EI computation to fail: when multiple points_to_sample are identical, the variance matrix of the candidate set is singular. They go out of bounds because the GP expects essentially 0 improvement from sampling anywhere inside the domain.

Setting the hyperparameter length scale to, say, 0.1 should avoid this issue.
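
To illustrate the failure mode, here is a rough numpy sketch (not MOE's actual code): with two identical candidate points, the GP posterior covariance of the candidate set is singular, and that is exactly the matrix the exact multi-point EI computation needs to factor.

```python
import numpy as np

# Squared-exponential kernel in 1-D (alpha and length default to 1.0,
# mirroring the default hyperparameters discussed above).
def sq_exp(a, b, alpha=1.0, length=1.0):
    return alpha * np.exp(-0.5 * (a - b) ** 2 / length ** 2)

X = np.array([0.0, 1.0])                  # points_sampled from the example
K = sq_exp(X[:, None], X[None, :]) + 0.01 * np.eye(2)  # + value_var noise

cand = np.array([0.0, 0.0])               # two identical points_to_sample
Ks = sq_exp(cand[:, None], X[None, :])    # cross-covariance, shape (2, 2)
Kss = sq_exp(cand[:, None], cand[None, :])

# GP posterior covariance of the candidate set: its rows are identical,
# so the matrix is singular and cannot be Cholesky-factored for exact EI.
post_cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
print(np.linalg.det(post_cov))            # ~0: the variance matrix is singular
```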

But unless I'm missing something, that error comes from: https://github.com/Yelp/MOE/blob/97949e56c3851a5a232cc53b1e54643a5a1460b8/moe/optimal_learning/python/python_version/expected_improvement.py#L381 and should be handled here: https://github.com/Yelp/MOE/blob/97949e56c3851a5a232cc53b1e54643a5a1460b8/moe/views/gp_next_points_pretty_view.py#L145

So I'm confused as to why it would be failing outright...

rkrav commented 8 years ago

What does this hyperparameter mean here? By scale, do you mean theta in the prior covariance function? I thought that was fit from the data a la empirical Bayes...

Does this mean that proper operating procedure is to first call gp_hyper_opt, retrieve the fitted hyperparameters, and then pass them to gp_next_points?

suntzu86 commented 8 years ago

The hyperparameters are those of the (square exponential) covariance function, which looks like:

cov(x, y) = \alpha * exp( -0.5 * \sum_{i=1}^{d} (x[i] - y[i])^2 / L[i]^2 )

so the hyperparameters are [\alpha, L[0], L[1], ..., L[d-1]]. Those are often labeled theta.
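
In code, that ordering of the hyperparameter vector would look something like this (an illustrative sketch, not MOE's implementation):

```python
import numpy as np

def square_exponential(x, y, hyperparameters):
    """Squared-exponential kernel; hyperparameters = [alpha, L[0], ..., L[d-1]]."""
    alpha = hyperparameters[0]
    lengths = np.asarray(hyperparameters[1:])
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return alpha * np.exp(-0.5 * np.sum((x - y) ** 2 / lengths ** 2))

# The 1-D example in this issue with alpha = 1.0, L[0] = 0.1:
print(square_exponential([0.0], [0.5], [1.0, 0.1]))  # ~0: points look uncorrelated
```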

The answer to your last question is yes :) Detail: yes, these would be fit from the data, and MOE offers that functionality through gp_hyper_opt. It is not built into gp_next_points because you may not want to re-tune hyperparameters after every trial (it's relatively expensive). So you have to specify hyperparameter values yourself; otherwise you end up with some very arbitrary defaults (1.0? I don't remember).
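
Putting that procedure together, the flow would look roughly like this (the /gp/hyper_opt route and the exact request/response fields here are assumptions from memory; check the MOE REST docs):

```python
import requests

base = "http://localhost:6543"  # assumed local MOE server
domain = {"dim": 1, "domain_bounds": [{"max": 1.0, "min": 0.0}]}
historical = {"points_sampled": [
    {"value_var": 0.01, "value": 0.1, "point": [0.0]},
    {"value_var": 0.01, "value": 0.2, "point": [1.0]},
]}

# 1) Fit hyperparameters from the data (endpoint and fields assumed).
hyper = requests.post(base + "/gp/hyper_opt", json={
    "gp_historical_info": historical,
    "domain_info": domain,
    # Search box for [alpha, L[0]]; these bounds are arbitrary, for illustration.
    "hyperparameter_domain_info": {
        "dim": 2,
        "domain_bounds": [{"min": 0.1, "max": 2.0}, {"min": 0.01, "max": 1.0}],
    },
}).json()

# 2) Pass the fitted covariance_info along when asking for the next points.
next_points = requests.post(base + "/gp/next_points/epi", json={
    "gp_historical_info": historical,
    "domain_info": domain,
    "covariance_info": hyper["covariance_info"],
    "num_to_sample": 2,
}).json()
print(next_points)
```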

That said, keep a few things in mind:

robsmith11 commented 5 years ago

@suntzu86's explanation makes sense, but I'm still getting the same error with the original example after adding the following to the request:

"covariance_info":{"covariance_type":"square_exponential","hyperparameters":[1.0,0.1]}

What else is required to perform multiple samples?