It's not necessary to compute it if you don't call `rand` or `logpdf` :shrug:

One approach that doesn't touch the structure of `FiniteGP` would be to add a method `MvNormal(::FiniteGP)` that returns the corresponding `MvNormal` distribution, which could then be used instead of `FiniteGP` if you don't want to recompute the Cholesky decomposition, e.g., if you call `rand` or `logpdf` multiple times.
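A sketch of what that could look like if it lived in AbstractGPs (hypothetical, not an existing method; constructing the `MvNormal` eagerly factorises the covariance via PDMats):

```julia
# Hypothetical sketch: realise a FiniteGP as a plain MvNormal.
# Building the MvNormal performs the Cholesky factorisation once.
using AbstractGPs, Distributions, Statistics

Distributions.MvNormal(fx::AbstractGPs.FiniteGP) = MvNormal(mean(fx), cov(fx))
```

With that, `d = MvNormal(fx)` pays the factorisation cost once, and repeated `rand(d)` / `logpdf(d, y)` calls reuse it.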
Ooh, I like that. Maybe we have a `@require` in MeasureTheory that adds the `MvNormal` method. Then we can use the faster representation too, should work out great!
> It's not necessary to compute it if you don't call `rand` or `logpdf` :shrug:
Just to be sure I understand, what else would you do with a `FiniteGP`? I can think of kernel composition, but doesn't that come before this? Maybe there are other operations?
E.g., it is not required if you compute the marginals. Or more generally just the mean and covariance matrix.
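For instance (a sketch; the kernel and inputs here are invented):

```julia
# None of these require a Cholesky factorisation of the covariance matrix.
using AbstractGPs, Statistics

f = GP(SqExponentialKernel())
fx = f(randn(10), 0.1)

m = mean(fx)        # mean vector
C = cov(fx)         # covariance matrix
ms = marginals(fx)  # Vector of Normal marginals; needs only the diagonal of the covariance
```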
> Maybe we have a `@require` in MeasureTheory that adds the `MvNormal` method

I don't think this should be done: it would be type piracy (and use Requires...). It belongs in AbstractGPs if we think it is a good idea.
Is it clear that I'm talking about `MeasureTheory.MvNormal`? Not seeing the type piracy here. Though having AbstractGPs use MeasureTheory would be great, if you're open to that.
Haha, my suggestion was `Distributions.MvNormal`, since `FiniteGP` is already a subtype of `Distributions.AbstractMvNormal` :smile:
> Just to be sure I understand, what else would you do with a `FiniteGP`? I can think of kernel composition, but doesn't that come before this? Maybe there are other operations?
There are a number of scenarios in which `logpdf`, `rand`, `posterior`, etc. are computed without utilising the covariance matrix directly. For example, TemporalGPs.jl never computes any covariance matrices, but implements `logpdf`, `rand`, `posterior`, etc.
> Haha, my suggestion was `Distributions.MvNormal`, since `FiniteGP` is already a subtype of `Distributions.AbstractMvNormal` :smile:
Right, I'd want to get to the fast implementation in MultivariateMeasures.jl, and avoid type piracy :)

But come to think of it, maybe it should just be `MeasureTheory.MvNormal(::Distributions.AbstractMvNormal)`. Then I wouldn't need `Requires`.

BTW, is there a way to avoid using PDMats? I'd rather allow positive semidefinite covariance.
> There are a number of scenarios in which `logpdf`, `rand`, `posterior`, etc. are computed without utilising the covariance matrix directly. For example, TemporalGPs.jl never computes any covariance matrices, but implements `logpdf`, `rand`, `posterior`, etc.
Interesting, does it need to avoid using `FiniteGP` to do that?
> BTW, is there a way to avoid using PDMats? I'd rather allow positive semidefinite covariance.
You can use `PSDMat` in https://github.com/invenia/PDMatsExtras.jl.
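For example (a sketch, assuming `PSDMat`'s plain-matrix constructor; the covariance here is invented and deliberately rank-deficient):

```julia
using Distributions, PDMatsExtras

A = randn(5, 3)
Σ = A * A'  # positive semidefinite but singular (rank ≤ 3)

# PSDMat is backed by a pivoted Cholesky factorisation, so this works
# where a strictly positive definite PDMat would throw.
d = MvNormal(zeros(5), PSDMat(Σ))
rand(d)
```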
> Interesting, does it need to avoid using `FiniteGP` to do that?
Nah, it has precisely the same API as usual (a big motivation for the design). We just write specialisations like

```julia
logpdf(::FiniteGP{<:MySpecialGPType}, ::AbstractVector{<:Real})
rand(::AbstractRNG, ::FiniteGP{<:MySpecialGPType})
posterior(::FiniteGP{<:MySpecialGPType}, ::AbstractVector{<:Real})
```

etc.
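A minimal sketch of how such a specialisation plugs in (`MySpecialGPType` is hypothetical and the `logpdf` body is a placeholder; TemporalGPs.jl instead puts a state-space computation there):

```julia
using AbstractGPs, Distributions

# Hypothetical GP type; a real one would carry, e.g., a state-space model.
struct MySpecialGPType <: AbstractGPs.AbstractGP end

# Intercept logpdf for finite projections of this GP type.
function Distributions.logpdf(
    fx::AbstractGPs.FiniteGP{<:MySpecialGPType}, y::AbstractVector{<:Real}
)
    # A covariance-matrix-free computation would go here.
    return 0.0  # placeholder so the sketch runs
end

fx = MySpecialGPType()(randn(3), 0.1)  # (f::AbstractGP)(x, σ²) builds a FiniteGP
logpdf(fx, randn(3))                   # dispatches to the method above
```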
That makes sense, thanks. The "abstract" part of AbstractGPs wasn't really clear to me - how to overload, etc :)
@willtebbutt wrote a really good introduction to the design in the docs, but we should probably point to it much more in the README.
I think I had seen that before, but it's been a while. I should definitely have another look.
In any case, should we add a `convert(Distributions.MvNormal, ::FiniteGP)`? I'd be happy with that :)
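Something like this, perhaps (a sketch of the idea, not the actual PR):

```julia
using AbstractGPs, Distributions, Statistics

# Hypothetical: convert a FiniteGP to a plain MvNormal, factorising once.
Base.convert(::Type{MvNormal}, fx::AbstractGPs.FiniteGP) = MvNormal(mean(fx), cov(fx))
```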
Oh, I completely forgot that I had made a draft locally during the discussion last week. I'll open a PR.
I believe that this is basically sorted, so am going to close. Please re-open if you feel otherwise.
Say you're using GPs in a PPL. The setup might be something like the sketch below.
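(A sketch standing in for the original snippet; the kernel, inputs, and noise level are invented:)

```julia
using AbstractGPs

x = range(-5.0, 5.0; length=100)
f = GP(SqExponentialKernel())  # zero-mean GP with an invented kernel
ydist = f(x, 0.1)              # FiniteGP: f at inputs x with noise variance 0.1
```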
Then in Soss or Turing, you might have `y ~ ydist`.

Digging into the AbstractGPs code, it looks like both `rand` and `logpdf` end up doing something like the sketch below.
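Roughly (a paraphrase of the internals, not the exact source; `ydist` is the `FiniteGP` from the setup above):

```julia
using AbstractGPs, LinearAlgebra, Statistics

# Both code paths rebuild the kernel matrix and refactorise it on every call:
m, C_mat = mean_and_cov(ydist)   # O(n²) kernel evaluations
C = cholesky(Symmetric(C_mat))   # O(n³) Cholesky factorisation
```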
Since none of this depends on the actual data, it could be computed once and reused many times during inference. As it is, this adds lots of overhead at each iteration. Is there a reason not to do it this way?