filtron opened 2 months ago
Upon some further consideration I have come up with some alternative options. It is important to keep in mind that the problem is that we do not own a type for positive semi-definite matrices with the required properties (the prime example being `Cholesky`). A lot of methods are defined for `AffineNormalKernel`, such as `marginalize`, `invert`, and `bayes_rule`, which is defined by the following type restriction:

```julia
const AffineNormalKernel{T} =
    NormalKernel{<:AbstractAffineMap{T},<:CovarianceParameter{T}} where {T}
```

The consequence is that these methods don't exist for `NormalKernel{<:AbstractAffineMap,<:MyCovarianceType}`, even when `MyCovarianceType` implements everything that is expected of a covariance parameter.
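To make the dispatch gap concrete, here is a toy reproduction with simplified stand-ins for the package's actual definitions (`CovarianceParameter` is abbreviated to two union members; all names are illustrative, not the package's real layout):

```julia
using LinearAlgebra  # for Factorization

abstract type AbstractAffineMap end
struct MyAffineMap <: AbstractAffineMap end

# Simplified stand-in for the package's covariance union:
const CovarianceParameter{T} = Union{AbstractMatrix{T}, Factorization{T}}

struct NormalKernel{A,B}
    mean::A
    covp::B
end

const AffineNormalKernel{T} = NormalKernel{<:AbstractAffineMap, <:CovarianceParameter{T}}

# A perfectly reasonable custom covariance type, but outside the union:
struct MyCovarianceType end

K1 = NormalKernel(MyAffineMap(), [1.0 0.0; 0.0 1.0])
K2 = NormalKernel(MyAffineMap(), MyCovarianceType())

K1 isa AffineNormalKernel  # true
K2 isa AffineNormalKernel  # false: methods restricted to AffineNormalKernel skip K2
```

So any method whose signature is restricted to `AffineNormalKernel` silently fails to apply to `K2`, no matter how well `MyCovarianceType` behaves.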
### Option 1: `ConstantConditionalCovariance`

Add the following type (the name could be something different):

```julia
struct ConstantConditionalCovariance{A}
    covariance_parameter::A
end

(C::ConstantConditionalCovariance)(x) = C.covariance_parameter
```

so that `AffineNormalKernel` may be defined by

```julia
const AffineNormalKernel{A,B} = NormalKernel{A,B} where {A<:AbstractAffineMap,B<:ConstantConditionalCovariance}
```

Now anyone can construct an `AffineNormalKernel` by

```julia
NormalKernel(affine_map, ConstantConditionalCovariance(my_special_cov))
```
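A minimal sketch of how the wrapper behaves on its own (nothing beyond the struct above is assumed): the wrapped parameter is returned unchanged for every input, which is exactly the "constant conditional covariance" semantics.

```julia
# Option 1's wrapper: a constant conditional covariance is a callable
# that ignores its input and always returns the wrapped parameter.
struct ConstantConditionalCovariance{A}
    covariance_parameter::A
end

(C::ConstantConditionalCovariance)(x) = C.covariance_parameter

Σ = [2.0 0.5; 0.5 1.0]
C = ConstantConditionalCovariance(Σ)

C([1.0, -3.0]) === Σ  # true: same parameter for every conditioning point
C(42.0) === Σ         # true
```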
### Option 2: `HomoskedasticNormalKernel`

Add the following type:

```julia
struct HomoskedasticNormalKernel{A,B} <: AbstractNormalKernel
    mean_parameter::A
    cov_parameter::B
end
```

so that `AffineNormalKernel` may be defined by

```julia
const AffineNormalKernel{A,B} = HomoskedasticNormalKernel{A,B} where {A<:AbstractAffineMap,B}
```

Now anyone can construct an `AffineNormalKernel` by

```julia
HomoskedasticNormalKernel(affine_map, my_special_cov)
```
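A quick sketch showing that the alias dispatches as intended. The abstract types are stand-ins for the package's own (`AbstractNormalKernel`, `AbstractAffineMap`), and the alias is written in the equivalent form with the bound on the left-hand side:

```julia
abstract type AbstractNormalKernel end
abstract type AbstractAffineMap end

struct HomoskedasticNormalKernel{A,B} <: AbstractNormalKernel
    mean_parameter::A
    cov_parameter::B
end

# Equivalent alias with the subtype bound on the left-hand side:
const AffineNormalKernel{A<:AbstractAffineMap,B} = HomoskedasticNormalKernel{A,B}

struct MyAffineMap <: AbstractAffineMap end
struct MyCovarianceType end   # note: no subtyping required for the covariance slot

K = HomoskedasticNormalKernel(MyAffineMap(), MyCovarianceType())
K isa AffineNormalKernel  # true: a custom covariance type participates for free
```

The key point is that only the mean slot is restricted, so any covariance type lands inside `AffineNormalKernel` automatically.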
### Option 3: `AffineNormalKernel`

Add the following type:

```julia
struct AffineNormalKernel{A,B} <: AbstractAffineNormalKernel
    affine_map::A
    covariance::B
end
```

so that anyone can construct an `AffineNormalKernel` by

```julia
AffineNormalKernel(affine_map, my_special_cov)
```
Some remarks:

1. Should `AffineNormalKernel` be `AffineHomoskedasticNormalKernel`? It is certainly more informative but also very long.
2. `HomoskedasticNormalKernel` is a useful type in its own right, e.g. for implementing cubature methods in downstream applications. This is naturally achieved by option 2 and can be accomplished by option 1 via the type alias `HomoskedasticNormalKernel{A,B} = NormalKernel{A,B} where {A,B<:ConstantConditionalCovariance}`. However, this is not automatically achieved by option 3.
3. `cov` for kernels of constant conditional covariance is implemented by an anonymous function, like so: `cov(K::AffineNormalKernel) = x -> K.Σ`. The wrapping approach of option 1 avoids this. This can of course be achieved by implementing this wrapper for option 2 as well. Is there any benefit to this?
4. Should there be a lowercase constructor `normalkernel` which can be used to send the inputs to the correct constructor? For example, if option 1 is taken, then `normalkernel(F::AbstractAffineMap, C::Cholesky) = NormalKernel(F, ConstantConditionalCovariance(C))`, or, if option 2 is taken, then `normalkernel(F::AbstractAffineMap, C::Cholesky) = HomoskedasticNormalKernel(F, C)`. In this case I guess similar constructors for other kernels/distributions should be defined as well, e.g. `normal`, `dirac`, `dirackernel`.
5. A similar restriction could be considered for `AbstractAffineMap` for the conditional mean parameter. However, as anyone can subtype `AbstractAffineMap`, I do not see how this would solve a practical problem.

In view of 2. and 4. I think option 3 seems least favourable. I am currently favouring option 2 but can't really say anything against option 1 as of now. Furthermore, 4. seems to make sense? (at least it saves a couple of shift-presses...)
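To illustrate the lowercase-constructor idea, here is a minimal sketch of how `normalkernel` could route inputs under the option 1 flavour. The kernel and map types are simplified stand-ins; only the dispatch pattern matters:

```julia
using LinearAlgebra  # for Cholesky, cholesky, Symmetric

abstract type AbstractAffineMap end
struct MyAffineMap <: AbstractAffineMap end

struct ConstantConditionalCovariance{A}
    covariance_parameter::A
end

struct NormalKernel{A,B}
    mean::A
    covp::B
end

# Lowercase entry point routing inputs to the appropriate constructor:
normalkernel(F::AbstractAffineMap, C::Cholesky) =
    NormalKernel(F, ConstantConditionalCovariance(C))

C = cholesky(Symmetric([2.0 0.5; 0.5 1.0]))
K = normalkernel(MyAffineMap(), C)
K.covp isa ConstantConditionalCovariance  # true: the wrapping is hidden from the caller
```

Under option 2 the same entry point would simply forward to `HomoskedasticNormalKernel(F, C)` instead, so the user-facing API is identical either way.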
A very simple alternative would be to not restrict the covariance part in `AffineNormalKernel` at all, and do

```julia
const AffineNormalKernel{T} = NormalKernel{<:AbstractAffineMap{T}} where {T}
```

But probably there is some issue with this that I'm not aware of?

Also I am wondering: in the current state the docs say that `NormalKernel` assumes the covariance part to be callable. Is this also the case for the current `AffineNormalKernel`s?
Otherwise, I don't have a clear favorite. Maybe it makes sense to come up with some examples first to see how they would actually feel when using them?
> A very simple alternative would be to not restrict the covariance part in `AffineNormalKernel` at all, and do `const AffineNormalKernel{T} = NormalKernel{<:AbstractAffineMap{T}} where {T}`. But probably there is some issue with this that I'm not aware of?
The problem is that we need the possibility to express "normal kernel with affine conditional mean and constant conditional covariance" as a type. This can of course not be done by just restricting the conditional mean type, and we cannot restrict the conditional covariance type because there is no commonly accepted abstract type that all covariance types can subtype (it also seems a bit impossible to achieve).
> Also I am wondering: in the current state the docs say that `NormalKernel` assumes the covariance part to be callable. Is this also the case for the current `AffineNormalKernel`s?
In general a normal kernel is of the form $\mathcal{N}(\mu(x), Q(x))$, so the idea is that `mean` and `cov` should return things that can evaluate $\mu$ and $Q$ at some point $x$. The internal representation of the covariance is then obtained by `covp`, which does not need to be callable.
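A sketch of that contract for an affine kernel $\mathcal{N}(\Phi x + b, \Sigma)$. The struct and field names here are illustrative, not the package's actual layout:

```julia
# Hypothetical affine normal kernel N(Φx + b, Σ) with constant covariance.
struct ToyAffineNormalKernel
    Φ::Matrix{Float64}
    b::Vector{Float64}
    Σ::Matrix{Float64}
end

# `mean` and `cov` return callables evaluating μ(x) and Q(x)...
mean(K::ToyAffineNormalKernel) = x -> K.Φ * x + K.b
cov(K::ToyAffineNormalKernel)  = x -> K.Σ

# ...whereas `covp` exposes the internal parametrization directly.
covp(K::ToyAffineNormalKernel) = K.Σ

K = ToyAffineNormalKernel([1.0 0.0; 0.0 1.0], [0.0, 1.0], [2.0 0.0; 0.0 2.0])
mean(K)([1.0, 2.0])  # == [1.0, 3.0]
cov(K)([1.0, 2.0])   # == K.Σ, regardless of x
covp(K)              # the stored Σ itself; not callable
```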
> Otherwise, I don't have a clear favorite. Maybe it makes sense to come up with some examples first to see how they would actually feel when using them?
I agree.
Redefine `CovarianceParameter` to:

```julia
const CovarianceParameter{T} = Union{AbstractMatrix{T}, Factorization{T}, T, Base.RefValue{T}} where {T<:Number}
```

`Base.RefValue` is just there in case we want in-place operations on scalar distributions / kernels. A custom covariance type can then opt into being a `CovarianceParameter` by subtyping `AbstractMatrix` or `Factorization`. The only exception I can think of is if we want a scalar analogue of `Cholesky` (i.e. `ScalarFactorization`), in which case it might not make sense to subtype `Factorization`, and it might be a lot of useless work to make sure it is actually a subtype of `Number` (we would only need a tiny fraction of the methods we're expected to implement). We also need to make sure `NormalKernel` is constructible for custom types (i.e. no pre-processing that leads to stack overflow and so on). This seems like the absolutely simplest solution and I cannot think of any immediate drawbacks? (other than the tiny annoyance already mentioned).
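A quick check of what this union admits, writing the alias in the equivalent form with the bound on the left-hand side (and with the `RefValue` spelling as in `Base`):

```julia
using LinearAlgebra  # for Factorization, cholesky, Symmetric

const CovarianceParameter{T<:Number} =
    Union{AbstractMatrix{T}, Factorization{T}, T, Base.RefValue{T}}

[1.0 0.0; 0.0 1.0] isa CovarianceParameter                        # matrices qualify
cholesky(Symmetric([2.0 0.5; 0.5 1.0])) isa CovarianceParameter   # factorizations qualify
0.5 isa CovarianceParameter                                       # plain scalars qualify
Ref(0.5) isa CovarianceParameter                                  # Base.RefValue qualifies
```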
Define:

```julia
abstract type PSDness end
struct IsPSDParametrization <: PSDness end
struct IsNotPSDParametrization <: PSDness end

ispsdparametrization(A) = IsNotPSDParametrization()
ispsdparametrization(::Cholesky) = IsPSDParametrization()
ispsdparametrization(::SelfAdjoint) = IsPSDParametrization()
# etc...
```

Add type parameters to e.g. `NormalKernel` like so:

```julia
struct NormalKernel{TP<:PSDness, TM, TC} <: AbstractNormalKernel
    mean::TM
    covp::TC
end
```

Add aliases:

```julia
const HomoskedasticNormalKernel{TM,TC} = NormalKernel{<:IsPSDParametrization,TM,TC} where {TM,TC}
const AffineHomoskedasticNormalKernel{TM,TC} = NormalKernel{<:IsPSDParametrization,TM,TC} where {TM<:AbstractAffineMap,TC}
```

Then a custom covariance type can opt in via

```julia
ispsdparametrization(::MyPSDType) = IsPSDParametrization()
```

and `marginalize`, `bayes_rule`, `invert`, etc. will "just work" for `AffineHomoskedasticNormalKernel{TM,TC}` as long as the type `TC` has all the methods expected of a psd parametrization. `ispsdparametrization(::SelfAdjoint) = IsPSDParametrization()` is as always awkward, as self-adjoint matrices do not need to be positive semi-definite, but then again, positive semi-definiteness for a general matrix cannot be inferred from the type. So that problem always exists.
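A self-contained sketch of the underlying trait pattern, with a toy stand-in for a `Cholesky`-style type and a dummy `describe` function in place of `marginalize` etc., to show the dispatch only:

```julia
abstract type PSDness end
struct IsPSDParametrization <: PSDness end
struct IsNotPSDParametrization <: PSDness end

# Default: a type is not assumed to be a PSD parametrization.
ispsdparametrization(A) = IsNotPSDParametrization()

struct ToyCholesky end  # stand-in for a PSD parametrization
ispsdparametrization(::ToyCholesky) = IsPSDParametrization()

# Trait-dispatched entry point: route on the trait value of the covariance.
describe(covp) = describe(ispsdparametrization(covp), covp)
describe(::IsPSDParametrization, covp)    = "psd parametrization"
describe(::IsNotPSDParametrization, covp) = "not a psd parametrization"

describe(ToyCholesky())  # "psd parametrization"
describe([1.0 0.0])      # "not a psd parametrization"
```

Opting in is a one-line method definition on `ispsdparametrization`, with no subtyping requirement on the covariance type itself.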
Currently, `AffineNormalKernel` is defined by the type parameter restriction shown above. However, since there are limited possibilities for a custom covariance type to opt into being a `CovarianceParameter`, more methods need to be added for `NormalKernel{<:AbstractAffineMap,<:MyCovarianceType}` in order to get functionality such as `marginalize` and `bayes_rule`. However, this would not be necessary if we redefine `CovarianceParameter` as proposed above.
Now all operations on combinations of `Dirac`, `DiracKernel`, `Normal`, and `AffineNormalKernel` "just work", as long as they're supplied with vectors that behave as `AbstractVector`, affine maps that behave like `AbstractAffineMap`, and covariances that behave like `CovarianceParameter` (in the sense that they implement the necessary methods).