JuliaGaussianProcesses / Stheno.jl

Probabilistic Programming with Gaussian processes in Julia
Other
336 stars 26 forks source link

Simple linear algebra for CompositeGP construction #216

Open ElOceanografo opened 2 years ago

ElOceanografo commented 2 years ago

Would it be desirable to have methods for dot and matrix multiplication to form linear combinations of vectors of GPs? This could make it much easier to programmatically construct composite GPs for things like spatial factor analysis. MWE is below; I can turn this into a PR if wanted...

using Stheno, AbstractGPs
import LinearAlgebra: dot
import Base: *

dot(x::Vector{T}, ff::Vector{G}) where {T<:Number,G<:AbstractGPs.AbstractGP} = sum(x .* ff)
*(A::Matrix{T}, ff::Vector{G}) where {T<:Number,G<:AbstractGPs.AbstractGP} = Stheno.cross([dot(A[i, :], ff) for i in 1:size(A, 1)]) 

f1 = @gppp let
    ff = [GP(SEKernel()) for _ in 1:2]
    a = [0.5, 2.0]
    f3 = dot(a, ff)
end

f2 = @gppp let
    ff = [GP(SEKernel()) for _ in 1:2]
    A = [1.0 2.0; 3.0 4.0]
    f3 = A * ff
end
willtebbutt commented 2 years ago

This seems like a good idea. I think your proposal for the definition of dot is correct, but I think the multiplication one ought to just produce a vector of GPs, rather than a single GP.

Either way, very happy to review a PR that implements this stuff.

ElOceanografo commented 2 years ago

Ok, I wasn't totally sure about using cross for the matrix-multiplication case myself. The use case I'm picturing is something like the second model in the readme (with one latent GP and two outputs). I'd like to be able to do something like this:

f = @gppp let
    # Define a smooth latent process that we wish to infer.
    f = GP(SEKernel())

    # Define the two noise processes described.
    g = x->sin.(x) .- 5.0 .+ sqrt.(abs.(x))
    noise1 = sqrt(1e-2) * GP(WhiteKernel()) + g
    noise2 = sqrt(1e-1) * GP(3.5, WhiteKernel())

    # Define the processes that we get to observe.
    # These two lines are from the example
    # y1 = f + noise1
    # y2 = f + noise2
    # These two lines are supposed to do the same thing, conceptually 
    A = [1 1 0;  1 0 1]
    y = A * [f, noise1, noise2]
end;

The goal would be making it easier to define models with arbitrary numbers of latent and observed processes, without having to manually define y1, y2, ...yn.