miicTeam / miic_R_package

Learning causal or non-causal graphical models using information theory
GNU General Public License v3.0

Negative CMI with continuous Zs #101

Closed vcabeli closed 3 years ago

vcabeli commented 3 years ago

We have observed cases where I(X;Y|Z) < 0 because of the way we perform the discretizations. It happens when some of the Zs are continuous and X or Y (or both) are discrete.

We compute the conditional mutual information by maximizing each term in the chain rule:

https://github.com/miicTeam/miic_R_package/blob/661649d4bb7d6f7859bc0e54e22bba1d22515d26/src/computation_continuous.cpp#L732-L734

For each term, we fix the cuts on X and Y to those optimized when computing I(Y;X,Z) and I(X;Y,Z) respectively; but we re-initialize the cuts on each Zi and re-optimize them for every term of the sum.
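A minimal sketch of this decomposition, using plug-in entropies and equal-frequency cuts as a stand-in for MIIC's optimized discretization (all names here are illustrative, not the package's API):

```python
import numpy as np

rng = np.random.default_rng(0)

def joint_entropy(*cols):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def discretize(z, n_bins):
    """Equal-frequency binning, a crude stand-in for the optimized cutpoints."""
    edges = np.quantile(z, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(z, edges)

# Toy data: discrete X and Y, continuous Z
n = 1000
x = rng.integers(0, 2, n)
y = (x + rng.integers(0, 2, n)) % 2
z = rng.normal(size=n) + x

# I(X;Y|Z) = I(X;Y,Z) - I(X;Z), with the cuts on Z re-initialized and
# re-optimized independently for each term, as described above.
z_term1 = discretize(z, 4)  # cuts found when maximizing I(X;Y,Z)
z_term2 = discretize(z, 3)  # cuts found when maximizing I(X;Z)

i_x_yz = joint_entropy(x) + joint_entropy(y, z_term1) - joint_entropy(x, y, z_term1)
i_x_z = joint_entropy(x) + joint_entropy(z_term2) - joint_entropy(x, z_term2)
cmi = i_x_yz - i_x_z
print(f"I(X;Y,Z) = {i_x_yz:.4f}, I(X;Z) = {i_x_z:.4f}, I(X;Y|Z) = {cmi:.4f}")
```

Each plug-in MI term is non-negative on its own; the problem discussed below is that the difference of two independently discretized terms need not be.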

Because of this, there are cases where I(X;Z) (red part) is larger than I(X;Y,Z) (red+green).

[Figure: CMI decomposition diagram]

As you can see, this is not supposed to happen. I suspect it is due to local maxima, which are slightly more likely to be found when optimizing over |Z|+1 variables (as in I(X;Y,Z)) than over |Z| variables (as in I(X;Z)).
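The sign flip is easy to reproduce with a deliberately mismatched pair of discretizations: plug-in MI estimates are biased upward with the number of bins, so if the cuts retained for I(X;Z) are much finer than those retained for I(X;Y,Z), the estimated difference goes negative even when X, Y and Z are fully independent. A sketch (equal-frequency cuts standing in for the optimizer, helper names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

def joint_entropy(*cols):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def discretize(z, n_bins):
    """Equal-frequency binning on a continuous variable."""
    edges = np.quantile(z, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(z, edges)

# X, Y, Z mutually independent: the true I(X;Y|Z) is exactly 0
n = 500
x = rng.integers(0, 2, n)
y = rng.integers(0, 2, n)
z = rng.normal(size=n)

# Coarse cuts when estimating I(X;Y,Z), fine cuts when estimating I(X;Z):
# the finer grid inflates the second term's positive bias (~(bins-1)/2n nats)
z_coarse = discretize(z, 2)
z_fine = discretize(z, 100)

i_x_yz = joint_entropy(x) + joint_entropy(y, z_coarse) - joint_entropy(x, y, z_coarse)
i_x_z = joint_entropy(x) + joint_entropy(z_fine) - joint_entropy(x, z_fine)
print(f"estimated I(X;Y|Z) = {i_x_yz - i_x_z:.4f}")  # typically negative
```

In the package the bin counts are not chosen this crudely, of course; the point is only that two separately optimized discretizations carry different finite-sample biases, so nothing forces the difference to stay non-negative.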

Note that we can't just re-use the same cutpoints of Z for all terms: as an extreme example, imagine a case where I'(X;Z) = 0, implying a single bin for the Zs, whereas I'(X;Y,Z) > I'(X;Y) > 0, i.e. some information is contained in the interaction between Y and Z, which requires a multi-bin discretization of the Zs.
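Such a case is easy to construct: let Y be the product of X and the sign of Z. Then X and Z are marginally independent, so a single bin on Z maximizes I'(X;Z); yet given Y, the sign of Z determines X, so I'(X;Y,Z) needs at least one cut on Z. A sketch with hand-picked cuts (plug-in estimates, not the package's optimizer):

```python
import numpy as np

rng = np.random.default_rng(2)

def joint_entropy(*cols):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

n = 2000
x = rng.choice([-1, 1], n)
z = rng.normal(size=n)
y = x * np.sign(z)              # Y is the X/Z interaction: X = Y * sign(Z)

one_bin = np.zeros(n, dtype=int)   # single bin on Z, as chosen for I'(X;Z) = 0
two_bins = (z > 0).astype(int)     # cut at Z = 0, needed to expose the interaction

def i_x_given(z_disc):
    """Plug-in I(X; Y,Z) for a given discretization of Z."""
    return joint_entropy(x) + joint_entropy(y, z_disc) - joint_entropy(x, y, z_disc)

print("I'(X;Y,Z), one bin on Z :", i_x_given(one_bin))   # ~ I(X;Y) ~ 0
print("I'(X;Y,Z), cut at Z = 0 :", i_x_given(two_bins))  # ~ H(X) = log 2
```

With one bin, the estimate collapses to I(X;Y), which is ~0 here; with the cut at zero, (Y, sign(Z)) determines X and the estimate reaches H(X), so the cuts that are optimal for I'(X;Z) are the worst possible for I'(X;Y,Z).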

As a solution we can: