JuliaGaussianProcesses / AbstractGPs.jl

Abstract types and methods for Gaussian Processes.
https://juliagaussianprocesses.github.io/AbstractGPs.jl/dev

GP for functions with 2D input #389

Closed: filipporemonato closed this issue 4 months ago

filipporemonato commented 4 months ago

Toward the end of this previous issue, I wondered about the possibility of constructing a GP to optimise scalar functions with two-dimensional inputs. According to the reply I got, it should be pretty much the same structure as in the 1D case, but after giving it a decent college try, I don't seem to have succeeded.

My final aim is to implement Bayesian optimisation (with a GP surrogate) of a scalar function that takes 2D input.

MWE:

using AbstractGPs
using Random

# Just a toy test function f : R^2-->R with several peaks and a maximum near [1,1]
function test_fun_2d(x1, x2)
    y = @. x1^2 * (sin(5*pi*x1))^6 + x2^2 * (sin(5*pi*x2))^6
    return y
end

# Fix seed for reproducibility
Random.seed!(99)

n_samples = 10

x1 = rand(Float64, n_samples)
x2 = rand(Float64, n_samples)
y = test_fun_2d(x1, x2)

# Define a zero-mean Gaussian process with a matern-3/2 kernel.
gp = GP(Matern32Kernel())

# Here is my attempt at fitting the above GP on a 2D input
points = @. hcat(x1, x2)
fx = gp(points, 0)

model = posterior(fx, y) # works!

# But this does not work
mean_and_var(model([0.6, 0.6])) # ERROR: DimensionMismatch: dimensionality of x (0) is not equal to that of y (1)

mean(model([0.6, 0.6])) # ERROR: DimensionMismatch: dimensionality of x (1) is not equal to that of y (0)

I initially tried fitting the GP in a more naive way, like fx = gp(x1, x2, 0), but of course that didn't work. That's why I tried creating the array-of-arrays above, which seemed a sensible approach coming from Python (an intuition I know often doesn't translate to Julia, but it was my best guess). Everything works fine up to the creation of the posterior model, and both fx and model seem to have a 2x2 covariance matrix initialised, which is what I expected. But I'll admit I am very new to Julia, so I may very well be making a huge blunder here.

But is this the correct way to instantiate a GP for a function with 2 inputs? If yes, how should I go about computing the mean or covariance matrix?

willtebbutt commented 4 months ago

Hi @filipporemonato . Thanks for opening the issue.

The right thing to do for multi-dimensional inputs is to put them into either a ColVecs or a RowVecs. You can find more info on these in the KernelFunctions docs on input types.

Firstly, your inputs:

X = hcat(x1, x2) # create an `N x D` matrix
x = RowVecs(X) # tell AbstractGPs that `X` is `N x D`, not `D x N`
fx = gp(x, 0)

Then, predictions are always made at collections of points; if you want to predict at a single point, pass a collection containing just that one point:

X_pr = [0.6 0.6] # a matrix of size 1 x 2
x_pr = RowVecs(X_pr)
mean_and_var(model(x_pr))
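
Putting it together with your MWE (a minimal sketch reusing gp, x1, x2, and y from above; the zero observation noise just mirrors your example):

X = hcat(x1, x2)                  # N x D matrix of training inputs
x = RowVecs(X)                    # rows are the 2-D inputs
fx = gp(x, 0)                     # finite GP at the training inputs
model = posterior(fx, y)          # condition on the observations

X_pr = [0.6 0.6]                  # 1 x 2 matrix: a single 2-D test point
x_pr = RowVecs(X_pr)
mean_and_var(model(x_pr))         # mean and variance, each of length 1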

Hope this helps. Please do ask more questions if anything remains unclear!

filipporemonato commented 4 months ago

Hi @willtebbutt, thank you so much for the great reply! Good to see that the intuition of constructing the NxD matrix was correct; I just should have wrapped it in RowVecs instead of building it manually with broadcasting as I did above.

Your example works and I get the mean and var out, thank you. One thing I am a bit confused about now is that, since we are working with a 2D process, I was expecting mean_and_var(model(x_pr)) to return a 2-element vector for the mean and a 2x2 matrix for the covariance, whereas it returns two scalars. Am I wrong?

P.S. Thank you for the pointer to the KernelFunctions docs! I should get better at perusing the docs for info, indeed. If I may make a suggestion, it would be really convenient to be able to search across all the related docs, or at least to reach the docs of a sub-package (KernelFunctions) from the docs of the main package (AbstractGPs in this case). Cheers!

willtebbutt commented 4 months ago

One thing I am a bit confused about now is that, since we are working with a 2D process, I was expecting mean_and_var(model(x_pr)) to return a 2-element vector for the mean and a 2x2 matrix for the covariance, whereas it returns two scalars. Am I wrong?

Ah, yes, this is not quite right. Your inputs are two-dimensional, but the output is still 1D, i.e. the GP you're now using models functions from R^2 -> R. Is this not what you're after?
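
To see this concretely (a quick check reusing model from above; the random test points are just illustrative), the sizes of the predictive quantities scale with the number of test points, not with the input dimension:

x_pr3 = RowVecs(rand(3, 2))   # three 2-D test points
mean(model(x_pr3))            # length-3 vector: one scalar mean per test point
cov(model(x_pr3))             # 3 x 3 matrix: covariance between those three points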

filipporemonato commented 4 months ago

Sorry, I got confused by hopping between one project and another. You are completely right: my function is R^2 --> R, so both the expected value and the variance are scalars. If you have a way to tackle the R^2 --> R^2 case, it might still be nice to add it to this issue; people googling for that problem could bump into this thread, so it would be helpful. Other than that, thank you again for the great help; we can close this issue from my side.

willtebbutt commented 4 months ago

Great, thanks.

For future reference, those interested in the R^2 --> R^2 case should consult the KernelFunctions multi-output docs.
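
As a starting point, here is a minimal sketch along the lines of those docs, treating the two outputs as independent GPs with a shared kernel (the kernel choice, noise value, and random data are illustrative assumptions only):

using AbstractGPs, KernelFunctions

x = ColVecs(rand(2, 10))                           # 10 two-dimensional inputs (columns)
x_mo = MOInput(x, 2)                               # pair each input with each of the 2 output indices (length 20)
y_mo = rand(20)                                    # flattened observations: all of output 1, then all of output 2
mo_kernel = IndependentMOKernel(Matern32Kernel())  # independent outputs, shared base kernel
fx = GP(mo_kernel)(x_mo, 0.1)
mo_post = posterior(fx, y_mo)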