Can we truly get the joint distribution P(x,y) to calculate the H(x,y) ?

paulbrodersen / entropy_estimators

Estimators for the entropy and other information theoretic quantities of continuous distributions

GNU General Public License v3.0

132 stars, 26 forks

Can we truly get the joint distribution P(x,y) to calculate the H(x,y) ? #16

Closed singledoggy closed 1 year ago

singledoggy commented 1 year ago

For example, in

import numpy as np
from sklearn.feature_selection import mutual_info_regression
from entropy_estimators import continuous

np.random.seed(1)
x = np.random.standard_normal(10000)
y = x + 0.1 * np.random.standard_normal(x.shape)

How can H(X,Y) be equal to the following?

hxy = continuous.get_h(np.c_[x, y], k=3)

The H(X,Y) is from a 10000*10000 Dimension vector, and you can't just get a joint distribution from the marginal distributions, so typically you have to assume that it is a multivariate normal distribution, which is what you do when the marginal distributions are normal. But if the marginal distributions are not normal, what can we assume? Or how can we create a normal joint distribution from two non-normal marginal distributions?

singledoggy commented 1 year ago

As we can see in this wiki, if we only know p(x) and p(y), how can we know whether they are independent or not?
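A small sketch may help with this point: the estimator is not handed only p(x) and p(y), but paired samples, where each row (x[i], y[i]) is one draw from the joint distribution. The example below reuses the data-generating recipe from the first comment (with NumPy's newer `default_rng` interface, which is my choice and not part of the original snippet) and shows that shuffling y leaves its marginal untouched while destroying the dependence. The names `xy`, `y_shuffled`, `r_paired` and `r_shuffled` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(10000)
y = x + 0.1 * rng.standard_normal(x.shape)

# Pair the samples row by row: each row (x[i], y[i]) is one draw from P(x, y).
xy = np.c_[x, y]
print(xy.shape)  # 10000 rows, 2 columns

# Shuffling y keeps its marginal distribution exactly the same,
# but breaks the pairing with x, so the joint distribution changes.
y_shuffled = rng.permutation(y)

r_paired = np.corrcoef(x, y)[0, 1]
r_shuffled = np.corrcoef(x, y_shuffled)[0, 1]
print(f"correlation with pairing kept:   {r_paired:.3f}")
print(f"correlation with pairing broken: {r_shuffled:.3f}")
```

Both data sets have identical marginals, so the marginals alone cannot tell them apart; the information about dependence lives in the pairing of the samples.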

singledoggy commented 1 year ago

I can also provide a test.txt that demonstrates a negative MI returned by this approach.

import numpy as np
from entropy_estimators import continuous
a = np.loadtxt('./data/test.txt')
print(continuous.get_mi(a[0], a[1]))

This returns a negative MI of -0.5776721165726819, which is unacceptable, since mutual information cannot be negative. Another interesting thing is that the result from sklearn is positive (0.73535408), so there must be something wrong.

from sklearn.feature_selection import mutual_info_regression
sk_res = mutual_info_regression(a[0].reshape(-1, 1), a[1].reshape(-1,),
                                discrete_features=False)
print(sk_res)

paulbrodersen commented 1 year ago

The H(X,Y) is from a 10000*10000 Dimension vector...

If x and y are 10000 samples from 1D normal distributions, then the joint probability is 2D, not 10000 x 10000.
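To make this concrete, here is a minimal sketch of a k-nearest-neighbour (Kozachenko-Leonenko style) entropy estimate applied to the 10000 x 2 array of paired samples. It is not the code of `entropy_estimators`; `knn_entropy` is a hypothetical helper written for illustration, using SciPy's `cKDTree`, and it assumes Euclidean distances and no duplicate points. No normality assumption enters the estimate; the Gaussian formula is used only to check the result for this particular example.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln


def knn_entropy(samples, k=3):
    """Kozachenko-Leonenko style entropy estimate in nats (illustrative sketch)."""
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]  # treat a 1D array as n samples of dimension 1
    n, d = samples.shape
    # Distance from each point to its k-th nearest neighbour.
    # We query k + 1 neighbours because the closest "neighbour" is the point itself.
    dist, _ = cKDTree(samples).query(samples, k=k + 1)
    r = dist[:, -1]
    # Log-volume of the d-dimensional Euclidean unit ball.
    log_unit_ball = 0.5 * d * np.log(np.pi) - gammaln(0.5 * d + 1)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(r))


rng = np.random.default_rng(1)
x = rng.standard_normal(10000)
y = x + 0.1 * rng.standard_normal(x.shape)

# The joint sample is 10000 points in 2 dimensions, not a 10000 x 10000 object.
h_xy = knn_entropy(np.c_[x, y], k=3)

# Analytic value for this example only: the covariance matrix is
# [[1, 1], [1, 1.01]], whose determinant is 0.01.
h_exact = np.log(2 * np.pi * np.e) + 0.5 * np.log(0.01)

# Mutual information from the identity I(X;Y) = H(X) + H(Y) - H(X,Y).
mi = knn_entropy(x) + knn_entropy(y) - h_xy
mi_exact = 0.5 * np.log(1.01 / 0.01)

print(f"H(X,Y): estimate {h_xy:.3f}, analytic {h_exact:.3f}")
print(f"I(X;Y): estimate {mi:.3f}, analytic {mi_exact:.3f}")
```

Because the estimate depends only on distances between the paired samples, the same function can be applied to data whose marginals are not normal.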

singledoggy commented 1 year ago

Thanks for your answer.