Closed lindahua closed 10 years ago
This refactoring not only achieve the goal of #231, but also eliminate the dependency on PDMats.jl.
I'v been thinking along similar lines. One of the things I've noticed is that the mat
field containing the Matrix
of a PDMat
is never actually used by any of the methods (other than cov
) and so an alternative option would be to simply make ZeroMeanMvNormal
a parametric type that could accept the factorization
immutable ZeroMeanMvNormal{T <: Union(Matrix,Factorization)} <: AbstractMvNormal
cov::T
end
This would require fewer types, and make it much easier to use alternative Factorization
s, (CholeskyPivoted
, Eigen
, LDLt
, etc), or special Matrix
structures (Diagonal
, SymTridiagonal
) as well as define custom structures for things like Gaussian graphical models and Gaussian processes. The IsotropicNormal
functionality could be provided by a new IsoDiagonal
type.
As there should be essentially no overhead in creating this structure form a given cov
, we could also define the GenericMvNormal{T}
methods by simply calling ZeroMeanMvNormal
, e.g.
pdf(d::GenericMvNormal, x) = pdf(ZeroMeanMvNormal(d.cov), x - d.mean)
The computation is vastly different when the covariance is of different forms (scalar, diagonal, and full matrix). The computation is used in:
https://github.com/JuliaStats/Distributions.jl/blob/master/src/multivariate/mvnormal.jl#L75 https://github.com/JuliaStats/Distributions.jl/blob/master/src/multivariate/mvnormal.jl#L84 https://github.com/JuliaStats/Distributions.jl/blob/master/src/multivariate/mvnormal.jl#L92 https://github.com/JuliaStats/Distributions.jl/blob/master/src/multivariate/mvnormal.jl#L124 https://github.com/JuliaStats/Distributions.jl/blob/master/src/multivariate/mvnormal.jl#L174
and many other lines. They call methods like invquad
, unwhiten
, logdet
etc, of which the computational details are encapsulated in PDMats.jl. These codes invoke methods on d.Σ
, instead of working on its mat
or value
field directly.
In my original design, I require all sub types of AbstractPDMats
to implement methods outlined in https://github.com/JuliaStats/PDMats.jl
That package also already provides some matrix types of special structures (e.g. ScalMat
and PDiagMat
). Since they require a different set of methods than AbstractMatrix
, I chose not to mix them with the array hierarchy in Julia base, and introduce another hierarchy under AbstractPDMat
.
Back to this issue, we have two choices:
immutable ZeroMeanMvNormal{C<:AbstractPDMat}
cov::C
end
AbstractPDMat
altogether and directly implement those methods on sub types of AbstractZeroMeanMvNormal
(like I outline above).Personally, I don't have strong opinion of one way or the other.
@lindahua I meant indirectly as well: the code for invquad
, unwhiten
, logdet
etc. all just use the Σ.chol
field, and so could easily dispatch directly on Cholesky
(quad
uses the mat
field, but this could easily be changed to a Triangular * Vector; sumsq
, which would actually give a lower flop count).
Part of my reasoning is that there are certain cases where you end up working with the factorisation anyway, and you never actually need to compute the full matrix: e.g. in MCMC you typically parametrise the covariance in terms of its Cholesky (or some other representation), and the standard Wishart and inverse Wishart samplers give you a Cholesky (it's on my todo list to add this to our rand
methods).
Anyway, since PDMat <: AbstractMatrix
, defining
immutable ZeroMeanMvNormal{T <: Union(Matrix,Factorization)} <: AbstractMvNormal
cov::T
end
would not preclude the use of the PDMat
type, and this would seem sensible as a default when constructred with mvnormal(X::Matrix)
.
@simonbyrne I see your point.
Here, I would like to explain the rationales why I introduced AbstractPDMat
originally:
AbstractMatrix
or Factorization
is qualified for this purpose.cov(d)
, instead of waiting for a procedure to recompute that matrix from the Cholesky. This motivates the design of PDMat
that maintains both.True, but couldn't we just enforce this at the outer constructor level? i.e. We only define ZeroMeanMvNormal
methods for matrices/factorizations where the construction makes sense (e.g. Diagonal
, Tridiagonal
, Cholesky
, Eigen{Real,Real}
, PDMat
), and leave users to define their own if they want to extend it further.
As I said above, I would be happy for PDMat
to be the default option, I just want to have the choice.
And to answer your original question, I would be in favour of the first option (as you've probably guessed by now).
Sounds good to me.
@lindahua I'm just having a quick look through the current code: is there a reason why GenericMvNormalKnownSigmaStats
store invΣ
instead of Σ
?
A similar design was implemented in #296.
What PDMats.jl currently implements is actually the routines needed to perform computation over zero mean Gaussian distributions.
I am considering migrate the codes in PDMats.jl into this package, and turn those three positive definite matrix types into three different kinds of zero-mean multivariate normal distributions:
Then a full multivariate normal distribution as follows:
In this way, most implementation can be done over
GenericMvNormal
.This design makes sense to me, except that the names sound a little bit long. However, this issue can be partly alleviated using a convenient constructing function
Generally, end users won't need to key in those type names very often.
Thoughts?