dfm / george

Fast and flexible Gaussian Process regression in Python
http://george.readthedocs.io
MIT License

Constrain the GP fit to be strictly positive #106

Open maria-vincenzi opened 6 years ago

maria-vincenzi commented 6 years ago

Hi, I have a question (sorry if this is not strictly related to the code implementation itself). Is there any way to constrain the GP fit to be strictly positive? Say I'm trying to fit some physical quantity that has to be positive (e.g. flux). Is there a way to prevent the GP from going below zero? The only solution I came up with was to perform the GP fit in LOG space, but this is not performing very well (I get smooth fits in LOG space, but when I transform everything back to LIN space the fit looks a little bit "messy"). Thank you, Maria

dfm commented 6 years ago

Unfortunately, the only solution that I know of is the one you've already tried. There isn't anything else you can do that still retains the analytic properties of a Gaussian process. Depending on what you want and what you mean by "messy", you might be able to use a different kernel that does what you want, but I don't have any specific suggestions.
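
For reference, the log-space workaround looks roughly like this. This is a minimal sketch with toy data (the data, kernel choice, and variable names are illustrative, not from the thread):

```python
import numpy as np
import george
from george import kernels

# Toy positive data (illustrative only).
rng = np.random.default_rng(42)
x = np.sort(rng.uniform(0, 10, 50))
flux = np.exp(np.sin(x)) * (1 + 0.05 * rng.standard_normal(50))
flux_err = 0.05 * flux

# Fit the GP to log-flux so that exp(prediction) is strictly positive.
y = np.log(flux)
yerr = flux_err / flux  # first-order error propagation into log space

gp = george.GP(np.var(y) * kernels.ExpSquaredKernel(1.0))
gp.compute(x, yerr)

x_test = np.linspace(0, 10, 200)
mu, var = gp.predict(y, x_test, return_var=True)
flux_pred = np.exp(mu)  # median (not mean) of the implied log-normal
```

Note that after exponentiating, `exp(mu)` is the median of the linear-space distribution rather than its mean, which is part of why the transformed fit can look "messy".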

davidwhogg commented 6 years ago

It feels like a constrained-positive solution is possible -- because it is for any finite-parameter model. But I don't know how to do it!

tmcclintock commented 5 years ago

I just wanted to chime in that this is possible by transforming the Gaussian posterior predictive distribution (PPD) into a truncated Gaussian PPD. In other words, one can train the GP on the training data as usual, then plug the moments of the GP (the mean and covariance) into a truncated Gaussian distribution along with the bounds, and recover a PDF over the truncated range. In the case of requiring y > 0, you set the lower bound on the GP to a = 0. This is a cute solution because the truncation bounds can vary with the input x.

You just have to be careful, since the 68% confidence intervals (the ±1σ levels) are not symmetric about the mean prediction, and furthermore the mean prediction is no longer the median of the PPD. It is straightforward to compute the median of the truncated Gaussian, though.
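
In case it helps to see the recipe concretely, here is a minimal sketch. It assumes a trained george `GP` called `gp`, its training targets `y_train`, and test inputs `x_test` (those names are illustrative; see the george tutorial for the setup):

```python
import numpy as np
from scipy.stats import truncnorm

# Predictive moments of the (unconstrained) GP at the test inputs.
mu, var = gp.predict(y_train, x_test, return_var=True)
sigma = np.sqrt(var)

# truncnorm takes its bounds in standardized units, so the constraint
# y > 0 becomes a = (0 - mu) / sigma with no upper bound.
a = (0.0 - mu) / sigma
b = np.full_like(mu, np.inf)

# Median and central 68% interval of the truncated PPD; note the
# interval is not symmetric about the mean.
median = truncnorm.ppf(0.5, a, b, loc=mu, scale=sigma)
lo = truncnorm.ppf(0.16, a, b, loc=mu, scale=sigma)
hi = truncnorm.ppf(0.84, a, b, loc=mu, scale=sigma)
```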

Here is an example figure showing the tutorial GP with the PPD truncated to y > 0. The black points, green line, and gray lines are all the same as in the tutorial. The red line shows the mean of the truncated GP PPD and the red shaded regions show the 68% confidence interval (not the standard deviation).


@dfm @davidwhogg if you think this might be worth a PR I'd be happy to implement it.

lleewwiiss commented 4 years ago

Also worth considering: https://arxiv.org/abs/2004.04632

tmcclintock commented 4 years ago

@lleewwiiss good find. It turns out that because of the nice properties of Gaussians, you can construct even more sophisticated constrained GPs (like ones where the slope or convexity of the GP is bounded). Here is a nice recent stats paper on the topic https://arxiv.org/abs/1703.00787.

lleewwiiss commented 4 years ago

@tmcclintock Interesting, I will check it out. Do you have an example of how you implemented the constraints above? I tried but can't seem to work it out.

tmcclintock commented 4 years ago

Oh the ones in the paper? I do not. Or did you mean the one I commented about in November?

lleewwiiss commented 4 years ago

The comment from November using the tutorial example

tmcclintock commented 4 years ago

Ah, I see. Not on hand, but I'll whip up a gist and ping you.

scipy.stats.truncnorm is unwieldy and not well suited to working with a GP out of the box, since the loc and scale arguments (and hence the standardized bounds) need to change over the domain of the prediction; a vectorized implementation might be nice to have.

tmcclintock commented 4 years ago

@lleewwiiss here you go: https://github.com/tmcclintock/TruncatedGP It ended up being too long for a gist. I show the posteriors for the truncated and non-truncated GPs, as well as draws from both posteriors.

If you look at the code you will see that you need to do some annoying transformations because of the way scipy.stats.truncnorm works (the core of it is sketched below). The upside is that my implementation allows for arbitrary bounds a(x) and b(x).
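
For anyone landing here later, the transformation looks roughly like this (a sketch with illustrative names, not the exact code from that repo):

```python
import numpy as np
from scipy.stats import truncnorm

def truncated_quantiles(mu, var, lower, upper, q):
    """Quantile(s) of independent truncated Gaussians at each test point.

    mu, var : GP predictive mean and variance arrays
    lower, upper : bound arrays a(x) and b(x); use -np.inf / np.inf
        for one-sided truncation
    q : quantile to evaluate, e.g. 0.5 for the median
    """
    sigma = np.sqrt(var)
    # truncnorm expects the bounds in standardized units, and they
    # differ at every x because loc and scale do.
    a = (lower - mu) / sigma
    b = (upper - mu) / sigma
    return truncnorm.ppf(q, a, b, loc=mu, scale=sigma)

# e.g. the median of a GP truncated to y > 0 at every test point:
# med = truncated_quantiles(mu, var, np.zeros_like(mu),
#                           np.full_like(mu, np.inf), 0.5)
```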

Drop me a line if you feel like spinning this out into a real project.

lleewwiiss commented 4 years ago

@tmcclintock Thank you! I will check it out now, and credit you where I can