As we can see in this wiki, if we only know p(x) and p(y), how can we tell whether they are independent or not?
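(To illustrate the point: marginals alone cannot settle independence, because very different joints can share the same marginals. A minimal numpy sketch, my own illustration rather than anything from the thread, showing two joints with identical standard-normal marginals but opposite dependence structure:)

import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)

y_indep = rng.normal(size=n)  # independent of x, marginal N(0, 1)
y_dep = x.copy()              # a copy of x: same N(0, 1) marginal, fully dependent

# identical marginals, completely different dependence structure
print(np.corrcoef(x, y_indep)[0, 1])  # ~ 0
print(np.corrcoef(x, y_dep)[0, 1])    # exactly 1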
Also, I generated a test.txt that demonstrates the negative MI returned by this approach:
import numpy as np
from entropy_estimators import continuous

# test.txt contains two rows: a[0] and a[1] are the paired 1-D samples
a = np.loadtxt('./data/test.txt')
print(continuous.get_mi(a[0], a[1]))
This returns a negative MI of -0.5776721165726819, which shouldn't be possible, since mutual information is non-negative.
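(One way to check whether such a value is estimator noise or a real bug is to test on data with a known ground truth: for a bivariate Gaussian with correlation rho, the true MI is -0.5 * ln(1 - rho^2). A sketch of that check, assuming continuous.get_mi accepts two 1-D arrays as in the snippet above:)

import numpy as np
from entropy_estimators import continuous

rng = np.random.default_rng(0)
rho, n = 0.7, 10_000
cov = [[1.0, rho], [rho, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

true_mi = -0.5 * np.log(1.0 - rho**2)  # analytic value, ~0.337 nats
print(true_mi)
print(continuous.get_mi(x, y))  # should land near true_mi if the estimator is working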
Another interesting thing is that the result from sklearn is a positive 0.73535408, so there must be something wrong somewhere:
from sklearn.feature_selection import mutual_info_regression

# sklearn expects the features X as a 2-D array of shape (n_samples, n_features)
sk_res = mutual_info_regression(a[0].reshape(-1, 1), a[1].reshape(-1),
                                discrete_features=False)
print(sk_res)
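(For what it's worth, sklearn's mutual_info_regression reports MI in nats and clips negative k-NN estimates to zero, so it can never return a negative value. Running it on the synthetic data from the ground-truth sketch above gives a direct comparison against the analytic value; x, y, and true_mi are as defined there:)

from sklearn.feature_selection import mutual_info_regression

# x, y, true_mi come from the previous sketch
sk_mi = mutual_info_regression(x.reshape(-1, 1), y, discrete_features=False,
                               random_state=0)
print(true_mi, sk_mi)  # the KSG-based estimate should land near the analytic value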
The H(X,Y) is from a 10000*10000-dimensional vector...
If x and y are 10000 samples from 1D normal distributions, then the joint probability is 2D, not 10000 x 10000.
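(To make that concrete, a small numpy sketch, my own illustration: with 10000 paired draws, the joint sample is just an (n, 2) array, one row per observation of the pair:)

import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)  # 10000 draws of a 1-D variable
y = rng.normal(size=n)  # 10000 draws of another 1-D variable

# each row (x_i, y_i) is one draw from the 2-D joint distribution
xy = np.column_stack([x, y])
print(xy.shape)  # (10000, 2), not (10000, 10000)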
Thanks for your answer.
For example, how can H(X,Y) equal the expression given in the wiki? The H(X,Y) is from a 10000*10000-dimensional vector, and you can't just get a joint distribution from marginal distributions, so typically you have to assume it is a multivariate normal distribution, and that's what you do when the marginal distributions are normal. But if the marginal distributions are not normal, what can we assume? Or how can we create a normal joint distribution from two non-normal marginal distributions?
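(One standard answer to that last question, my addition rather than anything from the thread, is Sklar's theorem: any joint distribution can be expressed as a copula combined with its marginals, so you can impose a Gaussian dependence structure on arbitrary marginals. A minimal Gaussian-copula sketch using scipy, with exponential marginals chosen purely for illustration:)

import numpy as np
from scipy.stats import norm, expon

rng = np.random.default_rng(0)
rho, n = 0.7, 10_000

# 1. correlated standard normals carry the dependence structure
cov = [[1.0, rho], [rho, 1.0]]
z1, z2 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# 2. the normal CDF maps them to correlated uniforms on [0, 1]
u1, u2 = norm.cdf(z1), norm.cdf(z2)

# 3. inverse CDFs of the target marginals yield the desired joint
x = expon.ppf(u1)  # exponential marginal, dependent on y
y = expon.ppf(u2)  # exponential marginal, dependent on x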