JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.
Other
1.11k stars 416 forks source link

Add `loglikelihood` for `MatrixDistribution`s #981

Open nickrobinson251 opened 5 years ago

nickrobinson251 commented 5 years ago

I think it would be something like

function loglikelihood(d::MatrixDistribution, Xs::AbstractVector{<:AbstractMatrix})
    all(==(size(d)), size.(Xs)) || throw(DimensionMismatch("All samples must have size $(size(d))"))
    return sum(X -> _logpdf(d, X), Xs)
end
nickrobinson251 commented 5 years ago

yeah, seems right (at least, using this kinda lazy test for MatrixNormal):

julia> U = let x = rand(3, 3); x*x' + I; end;

julia> V = let x = rand(4, 4); x*x' + I; end;

julia> d = MatrixNormal(rand(3, 4), U, V);

julia> Xs = rand(d, 100);

julia> loglikelihood(d, Xs)
-2340.6870662373026

julia> loglikelihood(vec(d), hcat(vec.(Xs)...))  # go via `MvNormal`
-2340.6870662373026

julia> loglikelihood(d, Xs) == loglikelihood(vec(d), hcat(vec.(Xs)...))
true
johnczito commented 5 years ago

(Admirable use of vec(d))

Seems good to me. At the risk of throwing this discussion completely off topic, your solution is seemingly inherited from the generic logpdf fallback

https://github.com/JuliaStats/Distributions.jl/blob/5b6c2dd271897999c1b50aaa18ed210b0ec6dce7/src/matrixvariates.jl#L153-L157

So before we double down on those conventions, some questions:

nickrobinson251 commented 5 years ago

Why do we just check the matrix dimensions? Shouldn't that be a call to insupport?

Good question. Not sure. Maybe! I basically just copied the multivariate definition: https://github.com/JuliaStats/Distributions.jl/blob/5b6c2dd271897999c1b50aaa18ed210b0ec6dce7/src/multivariates.jl#L259-L267

Is a vector of matrices the "best" way to handle a sample of matrices (as opposed to a 3d array)? I think so, but I wonder what people think about this;

I like it as an API. Haven't thought about "good" computational reasons for it. I does kind of bug me that it is inconsistent with sampling a multivariate distribution which gives a 2d array (rather than a vector of vectors, which i'd prefer, API-wise). Anyway, yes, worth discussion but agreed that this is off-topic here :)

johnczito commented 5 years ago

Yeah. I wonder if we should have a check keyword argument or something to give you the option of a full support check or not. In general, density evaluation for the matrix-variates "ought" to perform a proper support check because these random matrices often live in special spaces (posdef, orthogonal, possemdef with some rank). But if a full support check is computationally demanding (and in context unnecessary because you know the matrix is valid already), you'd want the option of evaluating the density without incurring the cost of the check.

I does kind of bug me that it is inconsistent with sampling a multivariate distribution which gives a 2d array...

Good point.

matbesancon commented 5 years ago

Yeah. I wonder if we should have a check keyword argument or something to give you the option of a full support check or not

In the last version, there is a check_args keyword for constructor, I think it would be alright to have it too for other operations

I does kind of bug me that it is inconsistent with sampling a multivariate distribution which gives a 2d array...

+1