JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.
Other
1.12k stars 419 forks source link

incongruency in `params` of MvNormalCanon vs NormalCanon #1380

Open st-- opened 3 years ago

st-- commented 3 years ago

In contrast to the (co)variance vs standard deviation inconsistency between Normal and MvNormal (https://github.com/JuliaStats/Distributions.jl/issues/584), NormalCanon and MvNormalCanon both are defined in terms of the natural parameters, \eta = \mu \sigma^(-2) and \lambda = \sigma^(-2) in the univariate case and h = inv(\Sigma) \mu and J = inv(\Sigma) in the multivariate case. Both structs also store the mean \mu separately (I'm assuming for computational efficiency reasons?). So all in all I would expect them to behave similarly (except for dimensionality).

But params(NormalCanon(nat1, nat2)) == (nat1, nat2) whereas params(MvNormalCanon([nat1], [nat2])) != ([nat1], [nat2]) -- params(::MvNormalCanon) also returns the mean as the first element in the tuple.

What reasons (other than potential backwards-compatibility) would there be for this?

Would you be willing to have this change at the next breaking release? (Happy to make a PR to change it.)

devmotion commented 3 years ago

I would support a PR, I think, since in my opinion it is a bit strange to return a partly redundant set of parameters. And if one is actually interested in the mean, one can call mean.

I assume one motivation for the current definition if it was done on purpose could be that actually for MvNormalCanon a constructor exists that takes all three "parameters", including the mean. Even though this is inconsistent with the scalar case, I think this inconsistency does not necessarily have to be changed and one has to be a bit careful with the consistency argument: it is more expensive to recompute the mean in the multivariate case, so there are efficiency gains if one can provide a pre-computed mean; in the scalar case it is cheap to calculate the mean.