openworm / neuronal-analysis

Tools to produce, analyse and compare both simulated and recorded neuronal datasets
MIT License

Better Timeseries Comparison #15

Open theideasmith opened 8 years ago

theideasmith commented 8 years ago

Use time-delayed mutual information (TDMI) instead of cross-correlation: http://arxiv.org/pdf/1110.4102v1.pdf

theideasmith commented 8 years ago

see this: http://www.stat.berkeley.edu/~binyu/summer08/L2P2.pdf

and some slightly modified mutual-information code from Stack Overflow (it works, why change it?)

import numpy as np

def mutual_information(X, Y, bins):
    # See this page on wikipedia: https://en.wikipedia.org/wiki/Mutual_information
    c_XY = np.histogram2d(X, Y, bins)[0]  # joint histogram counts
    c_X = np.histogram(X, bins)[0]        # marginal counts for X
    c_Y = np.histogram(Y, bins)[0]        # marginal counts for Y

    H_X = shan_entropy(c_X)
    H_Y = shan_entropy(c_Y)
    H_XY = shan_entropy(c_XY)

    # MI(X; Y) = H(X) + H(Y) - H(X, Y)
    MI = H_X + H_Y - H_XY
    return MI

def shan_entropy(c):
    # Normalize counts to probabilities, drop empty bins (0 * log 0 -> 0),
    # and return the Shannon entropy in bits.
    c_normalized = c / float(np.sum(c))
    c_normalized = c_normalized[np.nonzero(c_normalized)]
    H = -np.sum(c_normalized * np.log2(c_normalized))
    return H
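
A minimal sketch of the time-delayed variant proposed at the top of the thread, reusing mutual_information above; the helper name, the lag convention (positive lag shifts Y forward), and equal-length 1-D inputs are my assumptions:

def time_delayed_mi(X, Y, bins, max_lag):
    # MI between X(t) and Y(t + lag) for each lag in 0..max_lag.
    # Slice with X[:n - lag] rather than X[:-lag], since X[:-0] is empty.
    n = len(X)
    return [mutual_information(X[:n - lag], Y[lag:], bins)
            for lag in range(max_lag + 1)]

The index of the maximum of the returned list is then the analogue of the peak of a cross-correlogram: the delay at which the two traces share the most information.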

There are two issues I know of with mutual information, only one of which I think is relevant here. The first is that MI is computationally expensive, because you need to estimate a PDF for each variable as well as the joint PDF. The second is that the results are highly sensitive to the chosen binning parameters. I know there has been some work by Liam Paninski on this, but before adding MI to analysis.py we should definitely take some time to think about these issues.
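
On the binning issue, one mitigation worth trying (my suggestion, not something settled in this thread) is to let numpy pick the bin count with a data-driven rule such as Freedman-Diaconis and then check that the estimate is stable across rules; pooling X and Y so they share a single bin count is also an assumption:

import numpy as np

def mi_with_rule(X, Y, rule='fd'):
    # Derive a bin count from a data-driven rule ('fd' = Freedman-Diaconis,
    # 'sturges', 'scott') applied to the pooled samples, then reuse the
    # mutual_information function above.
    pooled = np.concatenate([X, Y])
    bins = len(np.histogram_bin_edges(pooled, bins=rule)) - 1
    return mutual_information(X, Y, bins)

If the estimate swings wildly between rules, the histogram estimator probably should not be trusted on that pair of traces.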

theideasmith commented 8 years ago

I realize this is becoming something of a running log of my own thoughts on the matter, so let it be.

I was thinking we should compare not only topological and functional clusters, but also functional connectivity against static (structural) connectivity. Something to keep in mind here (I think) is that functional/statistical connectivity may at first look different from structural connectivity, yet make sense once physiological details (such as neurotransmitter type) are taken into account. We could also check whether the information-flow trajectories predicted by the static connectome match the functional information-flow pipelines recovered by a delayed cross-correlation / delayed mutual information analysis; a sketch of one way to set up that comparison follows.
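
One possible shape for that structure-vs-function comparison; this is a sketch only, where the MI-based functional weights, the Spearman rank correlation over off-diagonal entries, and the inputs traces and A are all my assumptions:

import numpy as np
from scipy.stats import spearmanr

def functional_connectivity(traces, bins=16):
    # traces: (n_neurons, n_timepoints). Any pairwise measure would do here;
    # delayed MI or delayed cross-correlation could replace plain MI.
    n = traces.shape[0]
    F = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            F[i, j] = F[j, i] = mutual_information(traces[i], traces[j], bins)
    return F

def compare_to_structure(F, A):
    # A: structural adjacency from the static connectome (e.g. synapse counts).
    # Rank correlation over the off-diagonal entries measures monotone,
    # not linear, agreement between functional and structural weights.
    mask = ~np.eye(A.shape[0], dtype=bool)
    return spearmanr(F[mask], A[mask])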

lukeczapla commented 8 years ago

Yes, I think taking the neurotransmitter into account, for instance, may give a lot more insight; an analysis that ignores the relationship between neurotransmitters, network connectivity, and function may be missing something. I was following what you described regarding mutual information, and calculating joint PDFs can be a difficult task, whether in the context of analyzing known data or (even more so) in analyzing models. That kind of analysis can itself answer some of the questions: a multidimensional space can often be understood through a single parameter or a handful of parameters rather than the whole space itself, and multidimensional spaces are complicated on their own. By the way, is there a meeting tomorrow on chat?
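
As one illustration of that last point (my example, not something specified above): projecting a population of traces onto a few principal components replaces the full multidimensional space with a handful of parameters.

import numpy as np
from sklearn.decomposition import PCA

def summarize(traces, k=3):
    # traces: (n_timepoints, n_neurons) activity matrix, a hypothetical input.
    # Keep the top-k components; the explained-variance ratios say how much
    # of the space those k parameters actually capture.
    pca = PCA(n_components=k)
    modes = pca.fit_transform(traces)
    return modes, pca.explained_variance_ratio_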

Uiuran commented 5 years ago

Is there a trustworthy Python TDMI implementation? Wouldn't it be better to code it directly from the paper, or to translate the MATLAB version? A histogram estimator is too noisy for a measure whose whole point is sensitivity ...

Uiuran commented 5 years ago

It is possible to use a kNN estimator for the TDMI; see:

https://github.com/jakobrunge/tigramite/issues/36
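
For reference, a kNN (Kraskov-style) MI estimate is also available in scikit-learn, which sidesteps the histogram entirely; wiring it into a lag sweep like this is my own sketch, not tigramite's API:

import numpy as np
from sklearn.feature_selection import mutual_info_regression

def knn_tdmi(X, Y, max_lag, n_neighbors=3):
    # kNN-based MI between X(t) and Y(t + lag) for lags 0..max_lag.
    # No bin count to choose; n_neighbors plays the smoothing role instead.
    n = len(X)
    return [mutual_info_regression(X[:n - lag].reshape(-1, 1), Y[lag:],
                                   n_neighbors=n_neighbors)[0]
            for lag in range(max_lag + 1)]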