Yelp / MOE

A global, black box optimization engine for real world metric optimization.

What is noise_variance in SamplePoint about? #468

Open swyoon opened 7 years ago

swyoon commented 7 years ago

First of all, I am really amazed by the wonderful, neat interface of MOE. I appreciate it.

I don't understand why there is a noise_variance attribute in SamplePoint.

In other words, why does every data point have its own noise value? I don't think it is about a multi-fidelity setting. If it were, that would be truly amazing.

So, what does the noise_variance assigned to each data point mean, and what value should I plug into it?

RokoMijic commented 7 years ago

> why does every datapoint have their own noise value?

I think it is because MOE allows each data point to be uncertain, with the amount of uncertainty varying across your parameter space.
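To make that concrete, here is the usual Gaussian-process reading of a per-point noise value. It assumes MOE treats noise_variance as independent Gaussian observation noise, which is not confirmed in this thread. The symbols are illustrative: f is the latent objective, y_i the observed value at x_i, sigma_i^2 the noise_variance of point i, and K the kernel matrix over the sampled points.

```latex
y_i = f(x_i) + \varepsilon_i, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2), \qquad
\operatorname{Cov}(\mathbf{y}) = K + \operatorname{diag}(\sigma_1^2, \dots, \sigma_n^2)
```

Under this reading, a point with a large sigma_i^2 pulls the posterior toward its value only weakly, while a point with sigma_i^2 = 0 is treated as an exact observation.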

> and what value should I plug in to it?

I think a first-pass approach would be to plug in the same number for every point. You could get this number by calling the objective function 3-5 times at one point and computing an unbiased estimate of the population variance from that sample (that is, the sample variance with the n-1 denominator). You could improve on this by sampling 3-5 times at each of, say, 3 different locations around your parameter space and combining the results. More samples are obviously better, but I am assuming your function is expensive to evaluate. There are probably more sophisticated ways to estimate this parameter, especially if you have a good idea of the appropriate covariance function and length scale.
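A minimal sketch of that estimate, using only the Python standard library. `noisy_objective` is a hypothetical stand-in for your real (expensive) function, and averaging the per-location variances assumes the noise level is roughly the same everywhere and that each location gets the same number of repeats.

```python
import random
import statistics


def noisy_objective(x, rng):
    """Hypothetical stand-in for an expensive, noisy objective (true noise variance 0.01)."""
    return (x - 0.3) ** 2 + rng.gauss(0.0, 0.1)


def estimate_noise_variance(objective, locations, repeats, rng):
    """Average the unbiased sample variance over a few repeated-evaluation sites."""
    per_location = []
    for x in locations:
        samples = [objective(x, rng) for _ in range(repeats)]
        # statistics.variance uses the n-1 denominator (unbiased estimator).
        per_location.append(statistics.variance(samples))
    return statistics.mean(per_location)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
noise_variance = estimate_noise_variance(noisy_objective, [0.1, 0.5, 0.9], 5, rng)
print(noise_variance)
```

With only 15 evaluations the estimate is rough, so treat it as an order-of-magnitude figure rather than a precise value.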

Another possible use of a per-point noise_variance is as a way of hacking prior beliefs into MOE. You could cover the space with fake points, each given a relatively high noise_variance and a value equal to your prior belief about the objective function at that location. The real samples from your function would then use the estimated noise variance, which should be at least 10x lower than that of the fake points, so real data outweighs the prior wherever both are present.
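A rough sketch of how such a history could be assembled. The `SamplePoint` below is a local namedtuple stand-in, not an import from MOE, and its field names `point` and `value` are assumptions. `prior_pseudo_points`, the grid layout, and the flat prior of 0.5 are likewise hypothetical.

```python
import itertools
from collections import namedtuple

# Stand-in for MOE's SamplePoint; field names are assumed, not taken from MOE.
SamplePoint = namedtuple("SamplePoint", ["point", "value", "noise_variance"])


def prior_pseudo_points(bounds, points_per_dim, prior_mean, prior_variance):
    """Cover the box `bounds` with a regular grid of low-confidence fake points.

    points_per_dim must be at least 2.
    """
    axes = []
    for lo, hi in bounds:
        step = (hi - lo) / (points_per_dim - 1)
        axes.append([lo + i * step for i in range(points_per_dim)])
    return [
        SamplePoint(list(p), prior_mean(p), prior_variance)
        for p in itertools.product(*axes)
    ]


measured_variance = 0.01  # e.g. from repeated evaluations of the real objective

# Fake points carry the prior belief (here a flat 0.5) with 10x the measured noise.
pseudo = prior_pseudo_points(
    [(0.0, 1.0), (0.0, 1.0)], 3, lambda p: 0.5, 10 * measured_variance
)

# Real observations use the lower, measured noise variance.
real = [SamplePoint([0.2, 0.7], 0.31, measured_variance)]

history = pseudo + real
```

The combined `history` would then be passed to the optimizer wherever it expects historical sample points. How strongly the fake points shape the result depends on the ratio of the two variances and on how densely the grid covers the space.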