Open coschroeder opened 4 years ago
First of all: This is awesome! I'd love to add it to the library. If you are interested, that is.
As for the implementation.
If you haven't guessed it already, the _cdf
name is lying a bit. It isn't cumulative distribution function, but a forward Rosenblatt. In the case of a multivariate Gaussian distribution, this can be achieved using the inverse of a Cholesky decomposition. But this is a solution specific for Gaussian, which does not hold for other distribution. Other distributions we must have to do the full probability decomposition. But it should be quite possible though.
To exemplify how to approach the problem, below is an theoretical calculations that needs in 3-dimensional. I don't have the full answer how that will look in practice, but I imagine that by tinkering a little bit, it should be possible to get the form on something both computable and generalizable in multiple dimensions.
PDF:
p(x, y, z) = w0 p0(x, y, z) + w1 p1(x, y, z) + w2 p2(x, y, z)
CDF:
P(x, y, z) = w0 P0(x, y, z) + w1 P1(x, y, z) + w2 P2(x, y, z)
P0 ~ N(mu[0], sigma[0])
P1 ~ N(mu[1], sigma[1])
P2 ~ N(mu[2], sigma[2])
p(x, y) = int p(x, y, z) dz
= w0 r0(x, y) + w1 r1(x, y) + w2 r2(x, y)
P(x, y) = int_x int_y p(x, y) dx dy
= w0 R0(x, y) + w1 R1(x, y) + w2 R2(x, y)
R0 ~ N(mu[0][:2], sigma[0][:2, :2])
R1 ~ N(mu[1][:2], sigma[1][:2, :2])
R2 ~ N(mu[2][:2], sigma[2][:2, :2])
p(x) = int int p(x, y, z) dy dz
= w0 q0(x) + w1 q1(x) + w2 q2(x)
P(x) = int_x p(x) dx
= w0 Q0(x) + w1 Q1(x) + w2 Q2(x)
Q0 ~ N(mu[0][0], sigma[0][0, 0])
Q1 ~ N(mu[1][0], sigma[1][0, 0])
Q0 ~ N(mu[2][0], sigma[2][0, 0])
p(y | x) = p(x, y) / p(x)
P(y | x) = int_y p(x, y) / p(x) dy
= (w0 int_y r0(x, y) dy + w1 int_y r1(x, y) dy + w2 int_y r2(x, y) dy)
/ (w0 q0(x) + w1 q1(x) + w2 q2(x))
p(z | x, y) = p(x, y, z) / p(x, y)
P(z | x, y) = int_z p(x, y, z) / p(x, y) dz
= (w0 int_z p0(x, y, z) + w1 int_z p1(x, y, z) + w2 int_z p2(x, y, z))
/ (w0 r0(x, y) + w1 r1(x, y) + w2 r2(x, y))
output = [P(x), P(y | x), P(z | x, y)]
Please let me know, if something is unclear.
UPDATED
Thanks for your fast response and your interest.
Yes, I was aware that the _cdf is not the cumulative density, but I didn't know how to deal with it exactly.
I tried to implement your description but it is not working yet. Maybe you can give me again some input on the following:
int_y r0(x, y) dy
. with the use of marginals and conditional distribution. This should be faster then the scp.integrate bit.
from chaospy import Dist
from scipy import special
import numpy as np
import numpy
import scipy as scp
import matplotlib.pyplot as plt
import seaborn as sns
"""
Helper functions for conditional Normal distributions
"""
def get_cond_loc(loc, scale, i, xb):
"""
i: conditioning on the first i params (param_a[i:]|param_b[0:i]) p(a|b)
[Bishop: page 87]
"""
assert len(xb) == i
Caa = scale[i:,i:]
Cbb = scale[0:i,0:i]
Cab = scale[i:,0:i]
locb = loc[:i]
loca = loc[i:]
Cbb_inv = np.linalg.inv(Cbb) #np.linalg.inv(np.linalg.cholesky(Cbb))
loca_b = loca + np.dot(np.dot(Cab, Cbb_inv),(xb-locb))
return loca_b
def get_cond_cov(scale,i):
"""
i: conditioning on the first i params (param_a[i:]|param_b[0:i])
[Bishop: page 87]
"""
Caa = scale[i:,i:]
Cbb = scale[0:i,0:i]
Cab = scale[i:,0:i]
Cbb_inv = np.linalg.inv(Cbb)#(np.linalg.cholesky(Cbb))
Ca_b = Caa - np.dot(Cab, np.dot(Cbb_inv, Cab.T))
return Ca_b
"""
Define MoG class
"""
class MoG(Dist):
def __init__(self, loc=[[-5, -5],[5,5]], scale=[[[1, .9], [.9, 1]], [[1, -.9], [-.9, 1]]], a=[0.7,0.3]):
loc = numpy.asfarray(loc)
scale = numpy.asfarray(scale)
a = numpy.asfarray(a)
assert np.isclose(np.sum(a),1)
assert len(loc) == len(scale)
self._repr = {"loc": loc.tolist(), "scale": scale.tolist()}
C = np.array([numpy.linalg.cholesky(scale[i]) for i in range(len(a))])
Ci = np.array([ numpy.linalg.inv(C[i]) for i in range(len(a))])
#m_cond = get_cond_m(loc)
Dist.__init__(self, C=C, Ci=Ci, loc=loc, scale=scale, a=a)
def _cdf(self, x, C, Ci, loc, scale, a):
N = len(loc[0]) # dimension
P = np.zeros(N) # weighted sum cdf per conditional
p = np.ones(N-1) # weighted sum pdf per conditional
# iterating over all dimensions
for i in range(0,N):
Q = np.zeros(len(a))
q = np.zeros(len(a))
# iterating over components
for comp in range(len(a)):
# first part (pure weighted marginal cdf)
if i==0:
Q[comp] = scp.stats.multivariate_normal(loc[comp][0],scale[comp][0,0]).cdf(x[0])
# "higher" conditionals
else:
'''
# define function to integrate int_y r(x,y) dy
fun = lambda z: scp.stats.multivariate_normal(loc[comp][:i+1],scale[comp][:i+1,:i+1]).pdf(np.append(x[:i],z))
# specify integration boundary
bound = -7.5*scale[comp][i,i]+loc[comp][i]
Q[comp] = scp.integrate.quad(fun,bound,x[i])[0]
'''
# NEW: use conditional distributions: int_y r(x,y) dy = r(x) int_y r(y|x) dy
cond_loc = get_cond_loc(loc[comp], scale[comp], i, x[:i])
cond_scale = get_cond_cov(scale[comp],i)
Q[comp] = (scp.stats.multivariate_normal(loc[comp][:i],scale[comp][:i,:i]).pdf(x[:i])
* scp.stats.multivariate_normal(cond_loc, cond_scale).cdf(x[i]) )
# evaluate these higher conditionals. needed for normalization in next round
if i<N-1:
q[comp] = scp.stats.multivariate_normal(loc[comp][:i+1],scale[comp][:i+1,:i+1]).pdf(x[:i+1])
if i <N-1:
p[i] = np.sum(a*q)
P[i] = np.sum(a*Q)/p[i-1] # ok also for first entry, since p[-1]=1
return np.reshape(P, np.shape(x))
def _bnd(self, x, C, Ci, loc, scale, a):
"""
take minimum and maximum of components
"""
scale = [numpy.sqrt(numpy.diag(numpy.dot(C[i],C[i].T))) for i in range(len(a))]
lo_all = np.array([-7.5*scale[i]+loc[i] for i in range(len(a))])
up_all = np.array([7.5*scale[i]+loc[i] for i in range(len(a))])
lo,up = numpy.zeros((2,)+x.shape)
lo.T[:] = np.min(lo_all, axis=0) #(-7.5*scale+loc[0])
up.T[:] = np.max(up_all, axis=0) #(7.5*scale+loc[0])
return lo, up
def __len__(self):
return len(self.prm["C"])
# create samples with cp
loc=[[-1, -5],[1,2]]
scale=[[[1, 0.9], [0.9, 1]], [[1, 0], [0, 1]]]
a=[0.5,0.5]
R = MoG(loc=loc, scale=scale, a=a)
# shapes not yet correct. only sampl(1) works
samples = np.zeros((10,2))
for i in range(10):
temp = R.sample(1)
samples[i,0] = temp[0][0]
samples[i,1] =temp[1][0]
Thanks a lot!
Great. I'll take a look at it closer to the weekend. I'll give you feedback after that.
After getting into the nitty-gritty of (2), I realized that I made that part more complicated than I had to.
s0(x) = int_y r0(x, y) dy
S0 ~ N(mu[0][0], sigma[0][0, 0])
In other words, just take the an appropriate slice of mu and sigma.
Not sure why I made it more complicated than it needed to by.
3)
The shape of input and output should be the same. Also len(x) == len(self)
should hold.
ad (2): Not sure which one is correct. in v1 you would take the marginal over x but integrate p(y|x) up to the specified y. in your last comment it would not depend on y at all. So I am just not sure which term is needed for the Rosenblatt transformation.
Ah right, sorry, too quick on the trigger finger. You are right. It is not a full integration, but a partial one. Gritty details is still required, unfortunatly.
Second look: Looks like we need to evaluate on a truncated multivariate Gaussian distribution I can make this, but I will need a little time to get it in place. I'll get to it before the end of the year. I'll make the MoG distribution as well when I am at it, as the truncated distribution should make the job a lot easier to create MoG.
Unfortunately, the task is bigger than I first realized, and my limited time availability has run out. It should still be possible to do, but not right now. I have added the task to my long term goals, as I do want this feature as well.
Hopefully I will get more time to work on this soon.
Hi, I am struggling with creating a mixture of Gaussian distribution. I tried out several ways but none of it seemed to be correct. Here is one possibility. But I think I am missing something, maybe you can help out?
And creating samples: