JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.
Other
1.08k stars 408 forks source link

why do cov and invcov call full? #572

Open tpapp opened 7 years ago

tpapp commented 7 years ago

Internally many multivariate distributions represent the scale as a subtype of PDMats.AbstractPDMat, yet cov and invcov call full before returning it. This seems premature, eg suppose the user wants to do an eigendecomposition before some transformation, then it has to be converted back, or reconstructed from the parameters.

andreasnoack commented 7 years ago

I wasnøt aware of that. Returning a PDMat seems fine so feel free to make a PR.

johnmyleswhite commented 7 years ago

Is that the only place we'd return a PDMat? I'm not opposed, but I'm wondering how it affects consistency.

jmxpearson commented 7 years ago

The only surprise here might be that PDMat <: AbstractArray is false, so doing most of the non-arithmetic array operations would require a conversion.

Why AbstractPDMat isn't a subtype of AbstractArray is something I never got, though.

andreasnoack commented 7 years ago

@jmxpearson Good point. I wasn't aware of that. I think we should try to change that and only return PDMats if/when that is the case.

tpapp commented 7 years ago

The original problem where this came up is currently implemented as

"""
Given covariance matrix Σ, shrinks the eigenvalues to their mean by a
factor `shrink` (the transformation is not volume-preserving), the
scake covariance matrix is scaled by

`Σ_scale * 2.4^2/d`

where `Σ_scale=1` yields the value recommended for random-walk
Metropolis.
"""
function scale_Σ(Σ, shrink=0.2, Σ_scale=1.0)
    @assert 0 ≤ shrink ≤ 1
    fact = eigfact(Symmetric(Σ))
    λs = sqrt.(fact[:values])
    λ0 = mean(λs)
    fact[:values] .= λs .* (1-shrink) + λ0*shrink
    d = size(chain)[2]
    PDMat(Symmetric(full(fact) .* (Σ_scale*2.4^2/d)))
end

with the idea that I can do MvNormal(scale_Σ(cov(z))) for some z::MvNormal. Notice the boilerplate: first I have to tell eigfactthat Σ is symmetric, then I have to wrap in PDMat(Symmetric()) because the single-argument MvNormal constructor will only accept Matrix or AbstractPDMat, even though PDMat which is called by the former constructor would accept the Symmetric without a problem. But that is a separate issue.

lindahua commented 7 years ago

I agree that there should be a function that directly returns the internal representation, i.e. PDMat, for the purpose above. However, in other sense, PDMat is an internal representation, and should not always be directly exposed.

What about something like interncov(d) etc?

nalimilan commented 7 years ago

Or cov(PDMat, ...)?

andreasnoack commented 7 years ago

PDMat is an internal representation, and should not always be directly exposed

If PDMat became a subtype of AbstractMatrix then it wouldn't be hard to support general indexing for them so I don't see the harm in using them. In base, we also return XTriangular and Symmetric when possible.

st-- commented 2 years ago

If PDMat became a subtype of AbstractMatrix then it wouldn't be hard to support general indexing for them so I don't see the harm in using them. In base, we also return XTriangular and Symmetric when possible.

This was resolved in PDMats.jl by https://github.com/JuliaStats/PDMats.jl/pull/105, included in the package as of version 0.10, which is already the lower bound in Distributions.jl. I can prepare a PR that fixes this.