cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch

FixedNoiseGaussianLikelihood with z-scored data #1009

Open Galto2000 opened 4 years ago

Galto2000 commented 4 years ago

Howdy folks,

I am doing some sensor/data fusion using an exact single-task regression GP. The targets come from various sensors, each of which has its own 1-sigma uncertainty. I am grouping the targets according to the sensors: Y = [y1, y1, ..., y1, y2, ..., y2, ..., yn, yn, ..., yn] for sensors 1 through n.

Then I am grouping my 1-sigma noises as Y_std = [d1, d1, ..., d1, d2, d2, ..., d2, ..., dn, dn, ... dn], where d1 is the 1-sigma noise of sensor 1, d2 the 1-sigma noise of sensor 2, etc.

Then I am z-scoring my data:

Y'= (Y-mean(Y))/std(Y)
X'= (X-mean(X))/std(X)

Then, before passing Y_std**2 as the noise to the FixedNoiseGaussianLikelihood constructor, I obviously need to adjust correctly for the z-scoring applied to the targets.
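
Concretely, my setup looks roughly like this (a simplified sketch with made-up numbers and just two sensors; y1, y2, d1, d2 are placeholders):

```python
import torch

# two sensors, each with its own spec-sheet 1-sigma uncertainty (made-up values)
y1 = torch.tensor([1.2, 1.4, 1.1])          # targets from sensor 1
y2 = torch.tensor([1.3, 1.5, 1.6, 1.2])     # targets from sensor 2
d1, d2 = 0.05, 0.20                         # 1-sigma noise of sensors 1 and 2

Y = torch.cat([y1, y2])
Y_std = torch.cat([torch.full_like(y1, d1), torch.full_like(y2, d2)])

# z-score the targets
Y_mean, Y_scale = Y.mean(), Y.std()
Y_z = (Y - Y_mean) / Y_scale
```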

Finding the right way to do this adjustment is the main point of this thread, and it is what I am searching for.

I started out by z-scoring the sensor standard deviations using the mean and std of the target data:

Y_std' = (Y_std-mean(Y))/std(Y)

However, this doesn't feel right to me, and I get counterintuitive results when I experiment with different settings for the sensor noise values.

I then changed this to dividing the noise only by the std of the targets:

Y_std'=Y_std/std(Y)

which appears to work a lot better, since both the noise and the targets undergo the same scaling.
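
In code, the adjustment that seems to work looks roughly like this (continuing the sketch above; note that FixedNoiseGaussianLikelihood expects variances, not standard deviations):

```python
import gpytorch

# scale the 1-sigma sensor noises by the same factor as the targets, then square
Y_std_z = Y_std / Y_scale
likelihood = gpytorch.likelihoods.FixedNoiseGaussianLikelihood(noise=Y_std_z ** 2)
```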

Z-scoring the sensor noise using its own mean and std didn't feel like the right thing to do, as I was afraid I would lose the correct relationship to the target data (essentially I'd be scaling the targets and the sensor noise by different values, which doesn't seem right to me).

I guess I am asking: what is the right way to adjust the sensor noises before passing them as variances to the FixedNoiseGaussianLikelihood?

Thanks

Galto

Balandat commented 4 years ago

You should probably be z-scoring your data in a stratified fashion, i.e. use sensor-specific means/stdevs. Similarly for the inputs X (we usually don't z-score these but normalize them to the unit cube).
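
Something along these lines (a rough sketch, not a prescription; sensor_ids is a hypothetical per-observation sensor index):

```python
import torch

# stratified z-scoring: standardize each sensor's targets with that sensor's
# own mean/stdev
def stratified_zscore(Y, sensor_ids):
    Y_z = torch.empty_like(Y)
    for s in sensor_ids.unique():
        mask = sensor_ids == s
        Y_z[mask] = (Y[mask] - Y[mask].mean()) / Y[mask].std()
    return Y_z

# inputs: scale each dimension to [0, 1] instead of z-scoring
def to_unit_cube(X):
    X_min = X.min(dim=0).values
    X_max = X.max(dim=0).values
    return (X - X_min) / (X_max - X_min)
```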

eytan commented 4 years ago

Re: Y_std: shifting by the mean won't affect the SD, so to adjust for the scaling of your Ys you just need to account for the division. In particular, since Var(Y/c) = (1/c)^2 Var(Y), we have sd(Y/c) = sd(Y)/c, which is probably why Y_std/std(Y) is working for you.
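
A quick simulated check of that identity (made-up numbers):

```python
import torch

torch.manual_seed(0)
d = 0.2                              # sensor 1-sigma noise
m, c = 5.0, 3.0                      # mean(Y) and std(Y) used for z-scoring

noise = torch.randn(100_000) * d     # simulated sensor noise with sd = d
# subtracting the mean is just a shift; only the division by c changes the spread
print(((noise - m) / c).std())       # ~ d / c = 0.0667
print(noise.std() / c)               # same value
```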

Galto2000 commented 4 years ago

Thanks for your insights, @Balandat.

With stratified, do you mean that measurements from sensor 1 are z-scored with their means and stdevs, and so on for sensor 2 through sensor n?

Do I use the empirically derived stdev from the data, or the 1-sigma sensor uncertainty as per sensor specification?

It's still not clear how to "adjust" the noises to be passed into the FixedNoiseGaussianLikelihood; do I also z-score them in a stratified fashion using the corresponding values for mean and stdev?

If I z-score my targets by the sensor-specified uncertainty, I could probably set the noises in FixedNoiseGaussianLikelihood to all ones, no?

Unit cube makes a lot of sense for X.

Thanks

Galto

Galto2000 commented 4 years ago

@eytan , yes, that was exactly my "intuition" for not subtracting the mean from Y_std, but you provided me with the proof, thanks!

gpleiss commented 4 years ago

@Galto2000

With stratified, do you mean that measurements from sensor 1 are z-scored with their means and stdevs, and so on for sensor 2 through sensor n?

That should be more stable, yes.

Do I use the empirically derived stdev from the data, or the 1-sigma sensor uncertainty as per sensor specification?

You should use the 1-sigma sensor uncertainty. The noise that you supply to the FixedNoiseGaussianLikelihood should only represent the observational noise (i.e. the sensor noise). If you set the noise to be the empirical stdev of the targets, then you're basically saying that all the signal you have is noise, and your model won't learn anything.
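
To make the distinction concrete, a sketch reusing the names from the snippets earlier in the thread (Y, Y_std, Y_scale):

```python
import gpytorch
import torch

# per-point observation variances from the spec-sheet sigmas, scaled like the targets
noise_vars = (Y_std / Y_scale) ** 2
likelihood = gpytorch.likelihoods.FixedNoiseGaussianLikelihood(noise=noise_vars)

# NOT the empirical spread of the targets, e.g.
#   noise_vars = torch.full_like(Y, Y.var().item())
# which would declare essentially all of the signal to be noise
```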

If I z-score my targets by the sensor-specified uncertainty, I could probably set the noises in FixedNoiseGaussianLikelihood to all ones, no?

See above. Your model won't learn anything interesting if you do this.

As another note, how is the modeling performance if you don't use a FixedNoiseGaussianLikelihood? You might be able to get reasonable performance with the standard GaussianLikelihood.
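
For comparison, the homoskedastic alternative is just (a minimal sketch):

```python
import gpytorch

# learns a single shared noise level from the data instead of taking
# fixed per-point variances
likelihood = gpytorch.likelihoods.GaussianLikelihood()
```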

Galto2000 commented 4 years ago

@gpleiss

Thanks for your advice.

I have a few more follow-up questions, if you don't mind.

Question 1:

If I use a standard GaussianLikelihood, won't I lose influence over how I weigh my sources? For instance, I have a good sensor with low variance and a bad one with high variance (but I get more data from it). As with a Kalman filter, I'd like to be able to assign more certainty to my good sensor and less to my bad sensor. That's how I currently interpret sensor fusion using GPs and the use of FixedNoiseGaussianLikelihood. How would I achieve this with a standard GaussianLikelihood?

Question 2:

My (recently acquired) understanding of GP regression comes almost entirely from Rasmussen's book. In equation 2.24 (shown below for your convenience),

cov(f*) = K(X*, X*) - K(X*, X) [K(X, X) + s_n^2 I]^-1 K(X, X*)

is s_n^2, i.e. the variance of the noise, the same noise we are talking about for the FixedNoiseGaussianLikelihood?

Balandat commented 4 years ago

If I use a standard GaussianLikelihood, won't I lose influence over how I weigh my sources?

Yes, that's true. I guess Geoff is wondering to what extent that's necessary (if the sensors aren't all that different, it may not be).

is s_n^2, i.e. the variance of the noise, the same noise we are talking about for the FixedNoiseGaussianLikelihood?

Yes, it is.
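
As a rough plain-torch sketch (not gpytorch's actual implementation; K_ss, K_xs, K_xx are assumed precomputed kernel matrices and noise_vars the per-point observation variances), the fixed noise enters exactly where the s_n^2 I term sits in that equation, just generalized to a per-point diagonal:

```python
import torch

def predictive_covariance(K_ss, K_xs, K_xx, noise_vars):
    """cov(f*) = K(X*,X*) - K(X*,X) [K(X,X) + diag(noise)]^-1 K(X,X*)

    K_ss: (m, m) test-test kernel, K_xs: (n, m) train-test kernel,
    K_xx: (n, n) train-train kernel, noise_vars: (n,) observation variances.
    """
    K_noisy = K_xx + torch.diag(noise_vars)      # s_n^2 I -> per-point diagonal
    solve = torch.linalg.solve(K_noisy, K_xs)    # [K(X,X) + diag]^-1 K(X,X*)
    return K_ss - K_xs.T @ solve
```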

Galto2000 commented 4 years ago

Thanks @Balandat

Gotcha. So yes, in my case there are very good reasons to "weigh" one data source over another :) (or at least, I think so).