filtron / MarkovKernels.jl

Marginal distributions and Markov kernels that play nice with each other for the purpose of Bayesian state estimation.
MIT License
17 stars 2 forks source link

Should ```AffineNormalKernel``` be its own type rather than a type parameter restriction? #106

Open filtron opened 2 months ago

filtron commented 2 months ago

Currently it is defined as the following type parameter restriction:

const AffineNormalKernel{T} =
    NormalKernel{<:AbstractAffineMap{T},<:CovarianceParameter{T}} where {T}

However, since there is limited possibilities for a custom covariance type to opt into being a CovarianceParameter this means more methods need to be added for NormalKernel{<:AbstractAffineMap,<:MyCovarianceType} in order to get functionality such as marginalize and bayes_rule

However, this would not be necessary if we do the following:

abstract type AbstractAffineNormalKernel <: AbstractNormalKernel end

struct AffineNormalKernel{A,B} <: AbstractAffineNormalKernel 
affine_map::A
covariance::B
end

Now all operations on combinations of Dirac, DiracKernel, Normal, AffineNormalKernel "just work" as long as their supplied with vectors that behave as AbstractVector, affine maps that behave like AbstractAffineMap, and covariances that behave CovarianceParameter (in the sense that they implement the necessary methods).

filtron commented 2 months ago

Upon some further consideration I have come up with some alternative options. Now it is important to keep in mind that the problem is that we do not own a type for positive semi-definite matrices with the following properties:

A lot of methods are defined for AffineNormalKernel such as marginalize, invert, bayes_rule which is defined by the following type restriction

const AffineNormalKernel{T} =
    NormalKernel{<:AbstractAffineMap{T},<:CovarianceParameter{T}} where {T}

The consequence is that these methods dont exist for

NormalKernel{<:AbstractAffineMap,<:MyCovarianceType} 

even when MyCovarianceType implements everything that is expected of a covariance parameter.

Option 1: ConstantConditionalCovariance

Add the following type (name could be something different)

struct ConstantConditionalCovariance{A}
covariance_parameter::A 
end 

(C::ConstantConditionalCovariance)(x) = C.covariance_parameter 

so that AffineNormalKernel may be defined by

const AffineNormalKernel{A,B} = NormalKernel{A,B} where {A<:AbstractAffineMap,B<:ConstantConditionalCovariance}

Now anyone can construct an AffineNormalKernel by

NormalKernel(affine_map, ConstantConditionalCovariance(my_special_cov))

Option 2: HomoskedasticNormalKernel

Add the following type

struct HomoskedasticNormalKernel{A,B} <: AbstractNormalKernel
mean_parameter::A 
cov_parameter::B
end 

so that AffineNormalKernel may be defined by

const AffineNormalkernel{A,B} = HomoskedasticNormalKernel{A,B} where {A<:AbstractAffineMap,B}

Now anyone can construct an AffineNormalKernel by

HomoskedasticNormalKernel(affine_map, my_special_cov)

Option 3: AffineNormalKernel

Add the following type

struct AffineNormalKernel{A,B} <: AbstractAffineNormalKernel 
affine_map::A
covariance::B
end

so that anyone can construct an AffineNormalKernel by

AffineNormalKernel(affine_map, my_special_cov)

Some consideration

  1. Maybe what is currently AffineNormalKernel should be AffineHomoskedasticNormalKernel? It is certainly more informative but also very long.
  2. HomoskedasticNormalKernel is a useful type in its own right for e.g. implementing cubature methods in downstream applications. This is naturally achieved by option 2 and can be accomplished by option 1 via the type alias HomoskedasticNormalKernel{A,B} = NormalKernel{A,B} where {A,B<:ConstantConditionalCovariance}. However, this is not automatically achieved by option 3.
  3. Currently cov for kernels of constant conditional covariance is implemented by an anonymous function like so: cov(K::AffineNormalKernel) = x -> K.Σ. The wrapping approach of option 1 avoids this. This can of course be achieved by implementing this wrapper for option 2 as well. Is there any benefit to this?
  4. Should there be a function normalkernel which can be used to send the inputs to the correct constructor? For example, if option 1 is taken then normalkernel(F::AbstractAffineMap, C::Cholesky) = NormalKernel(F, ConstantConditionalCovariance(C)), or for example if option 2 is taken then normalkernel(F::AbstractAffineMap, C::Cholesky) = HomoskedasticNormalkernel(F, C). In this case I guess similar constructors for other kernels/distributions should be defined as well, e.g. normal, dirac, dirackernel.
  5. Option 3 has the "benefit" that you dont need to subtype AbstractAffineMap for the conditional mean parameter. However, as anyone can subtype AbstractAffineMap I do not see how this would solve a practical problem.

In view of 2. and 4. I think option 3 seems least favourable. I am currently favouring option 2 but cant really say anything against option 1 as of now. Furthermore, 4. seems to make sense? (at least it saves a couple of shift-presses...)

nathanaelbosch commented 2 months ago

A very simple alternative would be to not restrict the covariance part in AffineNormalKernel at all, and do

const AffineNormalKernel{T} = NormalKernel{<:AbstractAffineMap{T}} where {T}

But probably there is some issue with this that I'm not aware of?

Also I am wondering, in the current state the docs say that NormalKernel assumes the covariance part to be callable. Is this also the case for the current AffineNormalKernels?

Otherwise, I don't have a clear favorite. Maybe it makes sense to come up with some examples first to see how they would actually feel when using them?

filtron commented 2 months ago

A very simple alternative would be to not restrict the covariance part in AffineNormalKernel at all, and do

const AffineNormalKernel{T} = NormalKernel{<:AbstractAffineMap{T}} where {T}

But probably there is some issue with this that I'm not aware of?

The problem is that we need the possibility to express "Normal kernel with affine conditional mean and constant conditional covariance" as a type. This can of course not be done by just restricting the conditional mean type and we can not restrict the conditional covariance type because there is no commonly accepted abstract type that all covariances types can subtype (it also seems a bit impossible to achieve).

Also I am wondering, in the current state the docs say that NormalKernel assumes the covariance part to be callable. Is this also the case for the current AffineNormalKernels?

In general a normal kernel is of the form $\mathcal{N}(\mu(x), Q(x))$ so the idea is that mean and cov should return things that can evaluate $\mu$ and $Q$ at some point $x$. The internal representation of the covariance is then obtained by covp which does not need to be callable.

Otherwise, I don't have a clear favorite. Maybe it makes sense to come up with some examples first to see how they would actually feel when using them?

I agree.

filtron commented 2 months ago

Option 4:

Redefine CovarianceParameter to:

const CovarianceParameter{T} = Union{AbstractMatrix{T}, Factorization{T}, T, Base.RefVale{T}} where {T<:Number}

This seems like the absolutely simplest solution and I can not think of any immediate drawbacks? (other than the tiny annoyance already mentioned).

Option 5:

Define:

abstract type PSDness end 
struct IsPSDParametrization end 
struct IsNotPSDParametrization end 

ispsdparametrization(A) = IsNotPSDParametrization()
ispsdparametrization(::Cholesky) =  IsPSDParametrization()
ispsdparametrization(::SelfAdjoint) =  IsPSDParametrization()
# etc...

Add type parameters to e.g. NormalKernel like so:

struct NormalKernel{TP<:PSDness, TM, TC} <: AbstractNormalKernel
mean::TM
covp::TC
end

Add aliases:

const HomoskedasticNormalKernel{TM,TC} = NormalKernel{<:IsPSDParametrization,TM,TC} where {TM,TC}
const AffineHomoskedasticNormalKernel{TM,TC} = NormalKernel{<:IsPSDParametrization,TM,TC} where {TM<:AbstractAffineMap,TC}