ParadaCarleton closed this issue 2 years ago
It's based on an invariance property of Gaussian distributions which, AFAIK, does not hold for general elliptical distributions (I haven't checked; see eq. 2 in the paper below). However, the algorithm can be generalized to more general target distributions without a Gaussian prior by a construction that involves infinite mixtures of Gaussians, i.e., multivariate t distributions (it introduces an additional auxiliary variable that is marginalized out, i.e., dropped from the samples): https://jmlr.org/papers/volume15/nishihara14a/nishihara14a.pdf
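To illustrate the construction from the Nishihara et al. paper, here is a hedged Python sketch (numpy only, all names illustrative; this is not the package's API) of a multivariate t distribution expressed as an infinite mixture of Gaussians via a Gamma-distributed mixing variable `w`; ESS can then run on the Gaussian conditional `X | w`, with `w` resampled each iteration and dropped from the output:

```python
import numpy as np

# Sketch: draw a mixing precision w ~ Gamma(nu/2, rate=nu/2); conditionally
# X | w ~ N(m, S / w); marginally X ~ t_nu(m, S) (scale-mixture representation).

rng = np.random.default_rng(0)
nu = 8.0                          # degrees of freedom (assumed example value)
m = np.array([1.0, -2.0])         # location
S = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # scale matrix
L = np.linalg.cholesky(S)         # S = L L'

n = 200_000
w = rng.gamma(shape=nu / 2, scale=2 / nu, size=n)   # auxiliary mixing variable
z = rng.standard_normal((n, 2))
x = m + (z @ L.T) / np.sqrt(w)[:, None]             # X | w ~ N(m, S / w)

# Marginal moments of t_nu(m, S): mean m, covariance nu / (nu - 2) * S.
print(np.max(np.abs(x.mean(axis=0) - m)))
print(np.max(np.abs(np.cov(x.T) - nu / (nu - 2) * S)))
```

The printed deviations shrink as `n` grows, consistent with the marginal being `t_nu(m, S)`.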
I'll close this issue since it seems to be mainly a duplicate of https://github.com/TuringLang/EllipticalSliceSampling.jl/issues/12.
@devmotion I think that linearity property is a defining property of elliptical distributions, although I might be wrong. Or rather, I think a slightly weaker condition than the one stated in eq. 2 is required: as long as the bivariate distribution is elliptical it might work?
This property (eq. 2) is not a defining property of elliptical distributions.
If `X ~ e(m, S)` is a random variable distributed according to an elliptical distribution with location `m` and positive definite symmetric matrix `S` of size `d x d`, then for all matrices `D` of size `c x d` (`c <= d`) of rank `c` the random variable `Y := D X` is distributed according to `Y ~ e(D m, D S D')` (see e.g. property 1 in Owen and Rabinovitch's paper). Thus in the general form of equation 2 with `X ~ e(m, S)` and `Nu ~ e(m, S)` independently distributed, we have that the law of `Y := (X - m) cos(theta) + (Nu - m) sin(theta) + m` is equal to the law of `Z := A + B + m` where `A ~ e(0, cos^2(theta) S)` and `B ~ e(0, sin^2(theta) S)`. However, in general elliptical distributions are not closed under convolution, and hence in general the law of `A + B` is not `e(0, cos^2(theta) S + sin^2(theta) S) = e(0, S)` (which would imply `Y ~ e(m, S)`).
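For the Gaussian special case the linear transformation property can be checked mechanically: writing `X = m + L z` with `z` standard normal and `S = L L'`, we get `Y = D X = D m + (D L) z`, so the scale matrix of `Y` is `(D L)(D L)' = D S D'`. A small numpy sketch (the matrices below are an assumed example setup, not part of the package):

```python
import numpy as np

# Verify (D L)(D L)' == D S D' for a Cholesky factor L of S:
# this is the scale matrix of Y = D X when X = m + L z, z standard normal.

rng = np.random.default_rng(1)
d, c = 4, 2
A = rng.standard_normal((d, d))
S = A @ A.T + d * np.eye(d)        # positive definite symmetric, d x d
L = np.linalg.cholesky(S)          # S = L L'
D = rng.standard_normal((c, d))    # c x d, rank c almost surely

lhs = (D @ L) @ (D @ L).T          # scale matrix of Y = D X
rhs = D @ S @ D.T                  # D S D'
print(np.allclose(lhs, rhs))       # True
```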
More concretely, the property holds for all `S` and all `theta` if and only if the characteristic function `phi_X(t) = f(t' S t) exp(i t' m)` of the random variable `X ~ e(m, S)` (such a function `f` exists, as this is a defining property of elliptical distributions) satisfies `f(x) f(y) = f(x + y)` for all real numbers `x` and `y`. For instance, this is the case if `X` is normally distributed (there we have `f(x) = exp(-x/2)`). More generally, we know that `f(0) = 1` (since `phi_X(0) = 1` and `exp(i 0' m) = 1`), and hence the defining property of the exponential function (together with the continuity of characteristic functions) implies that `f(x) = exp(r x)` for some real number `r` (for normal distributions we have `r = -1/2`). I.e., the natural generalization of the property in eq. 2 to independent random variables `X ~ e(m, S)` and `Nu ~ e(m, S)` holds if and only if `phi_X(t) = exp(r t' S t) exp(i t' m)` for some real number `r`.
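As a quick numerical illustration of the generator condition (a sketch assuming numpy; the generator `f(x) = 1 / (1 + x/2)` is the characteristic generator of the elliptically symmetric multivariate Laplace distribution):

```python
import numpy as np

# The condition f(x) f(y) = f(x + y) holds for the Gaussian characteristic
# generator f(x) = exp(-x/2), but fails e.g. for the generator of the
# elliptically symmetric multivariate Laplace distribution.

def f_gauss(x):
    return np.exp(-x / 2)

def f_laplace(x):
    return 1.0 / (1.0 + x / 2)

x, y = 0.7, 1.9
print(np.isclose(f_gauss(x) * f_gauss(y), f_gauss(x + y)))        # True
print(np.isclose(f_laplace(x) * f_laplace(y), f_laplace(x + y)))  # False
```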
The property in eq. 2 is the crucial part of the algorithm, as you can see in eq. 6 of the original paper. It is mandatory that `Y = (X - m) cos(theta) + (Nu - m) sin(theta) + m` is distributed according to the desired prior `e(m, S)`.
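To make the role of that rotation concrete, here is a minimal Python sketch of a single elliptical slice sampling transition for a prior `N(m, S)`, following eq. 6 / the pseudocode of the original Murray, Adams & MacKay paper; the names and the toy likelihood are illustrative and not the API of EllipticalSliceSampling.jl:

```python
import numpy as np

def ess_step(x, loglik, m, chol_S, rng):
    """One elliptical slice sampling transition (sketch, prior N(m, S))."""
    nu = m + chol_S @ rng.standard_normal(len(m))  # Nu ~ N(m, S)
    log_y = loglik(x) + np.log(rng.uniform())      # slice height
    theta = rng.uniform(0.0, 2.0 * np.pi)          # initial angle
    lo, hi = theta - 2.0 * np.pi, theta            # shrinking bracket
    while True:
        # the rotation from eq. 2: leaves the Gaussian prior invariant
        x_new = (x - m) * np.cos(theta) + (nu - m) * np.sin(theta) + m
        if loglik(x_new) > log_y:
            return x_new
        # shrink the bracket towards theta = 0 and retry
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

# Toy usage: standard normal prior, Gaussian likelihood centred at 1.
rng = np.random.default_rng(2)
m, S = np.zeros(2), np.eye(2)
loglik = lambda x: -0.5 * np.sum((x - 1.0) ** 2)
x = m.copy()
for _ in range(100):
    x = ess_step(x, loglik, m, np.linalg.cholesky(S), rng)
print(x)
```

The loop always terminates because shrinking drives `theta` towards 0, where `x_new` equals the current state and the slice condition holds.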
Got it, thanks! I was confusing it with a different property regarding linear combinations of components from a random sample.
As far as I can tell, there's nothing in this algorithm that makes it impossible to use, say, a multivariate t or multivariate logistic prior; could the code be generalized to handle this?