ELIFE-ASU / PyInform

A Python Wrapper for the Inform Information Analysis Library
https://elife-asu.github.io/PyInform
MIT License

Continuous values for Transfer Entropy #36

Open EricJoung1997 opened 2 years ago

EricJoung1997 commented 2 years ago

Thanks for the cool library!

I see that the Transfer Entropy examples use binary input values, such as:

xs = [0,1,1,1,1,0,0,0,0]
ys = [0,0,1,1,1,1,0,0,0]

Is it possible to use continuous values instead of binary values, such as:

xs = [0.5,1.7,2.5,0.1,1.5,0.9,0.56,0.71,2.10]
ys = [0.3,1.0,3.1,1.1,0.1,1.8,0.5,1.2,3.5]

QunShanHe commented 3 months ago

I have the same question. How did you solve it in the end?

jakehanson commented 3 months ago

Hi there,

Transfer entropy requires a finite number of states to be calculated. This means continuous values must be binned into discrete states before calculating transfer entropy.

The method of binning is somewhat of an art and requires specific knowledge of the problem at hand. For your values, which appear to range continuously between 0 and about 3.5, I might suggest using four integer bins. However, the more bins you use, the more sparsely populated the probability distribution becomes, so you might also consider just two bins.
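To make the sparsity point concrete, here is a rough sketch that counts how many joint states are actually observed as the number of bins grows. It uses a hypothetical equal-width helper, `bin_uniform`, written only for illustration (it is not PyInform's `bin_series`), and it assumes a history length of 1, so each observation is a triple (y_t, x_t, y_{t+1}).

```python
from collections import Counter

def bin_uniform(series, b):
    """Hypothetical helper: map values to b equal-width bins over [min, max]."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / b
    if width == 0:
        return [0] * len(series)
    # The maximum sits on the upper edge, so clamp it into the last bin.
    return [min(int((x - lo) / width), b - 1) for x in series]

def occupancy(xs, ys, b):
    """Return (observed joint states, possible joint states) for b bins."""
    xb, yb = bin_uniform(xs, b), bin_uniform(ys, b)
    # Triples (y_t, x_t, y_{t+1}) entering a history-length-1 TE estimate.
    seen = Counter(zip(yb[:-1], xb[:-1], yb[1:]))
    return len(seen), b ** 3

xs = [0.5, 1.7, 2.5, 0.1, 1.5, 0.9, 0.56, 0.71, 2.10]
ys = [0.3, 1.0, 3.1, 1.1, 0.1, 1.8, 0.5, 1.2, 3.5]

for b in (2, 4, 8):
    observed, possible = occupancy(xs, ys, b)
    print(f"b={b}: {observed} of {possible} joint states observed")
```

With nine samples there are only eight transitions, so the number of observed states can never exceed eight while the number of possible states grows as b cubed.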

For example:

from pyinform import utils

xs = [0.5, 1.7, 2.5, 0.1, 1.5, 0.9, 0.56, 0.71, 2.10]
xs_binned, _, _ = utils.bin_series(xs, b=2)  # bin_series returns (binned series, number of bins, bin size)

Then, do the same for ys and proceed with the TE calculation.
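For anyone who wants to see the whole pipeline in one place, the following is a dependency-free sketch of "bin both series, then estimate TE". Both `bin_uniform` and this `transfer_entropy` are illustrative stand-ins written for this example, not PyInform's code. The estimator is a plain plug-in (frequency-count) estimate in bits. In practice you would pass the binned series to `pyinform.transfer_entropy(source, target, k)` instead.

```python
from collections import Counter
from math import log2

def bin_uniform(series, b):
    """Hypothetical helper: map values to b equal-width bins over [min, max]."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / b
    if width == 0:
        return [0] * len(series)
    # The maximum sits on the upper edge, so clamp it into the last bin.
    return [min(int((x - lo) / width), b - 1) for x in series]

def transfer_entropy(source, target, k=1):
    """Plug-in estimate of TE (source -> target) in bits, history length k."""
    triples = Counter()  # (target history, source state, next target state)
    for t in range(k, len(target)):
        hist = tuple(target[t - k:t])
        triples[(hist, source[t - 1], target[t])] += 1
    total = sum(triples.values())
    hist_src, hist_next, hist_only = Counter(), Counter(), Counter()
    for (h, s, nx), c in triples.items():
        hist_src[(h, s)] += c
        hist_next[(h, nx)] += c
        hist_only[h] += c
    te = 0.0
    for (h, s, nx), c in triples.items():
        p_next_given_both = c / hist_src[(h, s)]              # p(next | history, source)
        p_next_given_hist = hist_next[(h, nx)] / hist_only[h]  # p(next | history)
        te += (c / total) * log2(p_next_given_both / p_next_given_hist)
    return te

xs = [0.5, 1.7, 2.5, 0.1, 1.5, 0.9, 0.56, 0.71, 2.10]
ys = [0.3, 1.0, 3.1, 1.1, 0.1, 1.8, 0.5, 1.2, 3.5]

xs_binned = bin_uniform(xs, 2)
ys_binned = bin_uniform(ys, 2)
print(xs_binned)
print(transfer_entropy(xs_binned, ys_binned, k=1))
```

With only nine samples the number that comes out is not a reliable estimate. The sketch is only meant to show the order of operations.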

All of this is discussed thoroughly in the documentation:

https://elife-asu.github.io/PyInform/utils.html

It's best practice to try various binning techniques and see how sensitive your results are to these choices. If your results vary significantly, that's a sign they may reflect the binning more than the underlying data. This problem of state-binning or "coarse-graining" continuous values is ubiquitous throughout information theory, and is a pitfall of many information-theoretic analyses.
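One way to run that kind of sensitivity check is sketched below. It compares two binning schemes (equal-width and equal-frequency) across a few bin counts. All three helpers are hypothetical and written just for this illustration, with a history length of 1 assumed. None of them is PyInform's implementation.

```python
from collections import Counter
from math import log2

def equal_width(series, b):
    """Bins of equal width spanning [min, max]."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / b
    if width == 0:
        return [0] * len(series)
    return [min(int((x - lo) / width), b - 1) for x in series]

def equal_frequency(series, b):
    """Rank-based bins holding roughly equal numbers of samples.
    Ties are broken by position, so equal values may land in different bins."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    binned = [0] * len(series)
    for rank, i in enumerate(order):
        binned[i] = rank * b // len(series)
    return binned

def plugin_te(source, target):
    """Plug-in TE (source -> target) in bits with history length 1."""
    joint = Counter(zip(target[:-1], source[:-1], target[1:]))
    n = sum(joint.values())
    past_src, past_next, past = Counter(), Counter(), Counter()
    for (p, s, nx), c in joint.items():
        past_src[p, s] += c
        past_next[p, nx] += c
        past[p] += c
    # p(next | past, source) / p(next | past) simplifies to the ratio below.
    return sum(c / n * log2(c * past[p] / (past_src[p, s] * past_next[p, nx]))
               for (p, s, nx), c in joint.items())

xs = [0.5, 1.7, 2.5, 0.1, 1.5, 0.9, 0.56, 0.71, 2.10]
ys = [0.3, 1.0, 3.1, 1.1, 0.1, 1.8, 0.5, 1.2, 3.5]

results = {}
for name, binner in (("equal_width", equal_width), ("equal_frequency", equal_frequency)):
    for b in (2, 3, 4):
        results[name, b] = plugin_te(binner(xs, b), binner(ys, b))
        print(f"{name:>15} b={b}: TE = {results[name, b]:.3f} bits")
```

Keep in mind that on short series the plug-in estimate tends to be biased upward as the number of bins grows, so a larger value by itself does not mean a better binning.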

Good luck!

Jake

QunShanHe commented 3 months ago

Hello Jake,

Thank you very much for your detailed response to my question.

I tried normalizing my time series with MinMaxScaler and then multiplying it by a coefficient state_N to discretize the original data. In this scenario, do the discretized values still retain their ordering by magnitude, or are the states treated as equal, unordered labels after discretization?

Additionally, how can I determine the optimal value for state_N? Would it be reasonable to assume that the state_N where the transfer entropy is maximized is the most suitable?

Qunshan