pwollstadt / IDTxl

The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory.
http://pwollstadt.github.io/IDTxl/
GNU General Public License v3.0
249 stars 76 forks

Estimating bivariate_te using data frame #72

Closed AviralKumarTiwari closed 3 years ago

AviralKumarTiwari commented 3 years ago

Dear Sir/Madam, could you please provide a demo that applies the proposed methods to a pandas data frame? When I try this, I get an error. I am also unable to see the structure of the data generated in your demos by the following lines:

# a) Generate test data
data = Data()
data.generate_mute_data(n_samples=1000, n_replications=5)

It would be great for new users if a demo showed the application on real data. This is what I tried:

import numpy as np
import pandas as pd

np.random.seed(1945)
x = np.random.randint(0, 2, size=1000)
y = np.random.randint(0, 2, size=1000)

data = pd.read_excel(r'C:\ACC.xlsx', sheet_name='Sheet1', parse_dates=['date'], index_col='date')
data.head()

pwollstadt commented 3 years ago

Hi @AviralKumarTiwari, what error do you get exactly?

IDTxl accepts pandas data frames as input to the Data class, and the resulting Data object is what you pass to the network inference algorithms (you can't pass a pandas data frame to them directly). Here is a modified version of the bivariate TE demo script (it also shows how to access the attribute holding the actual data within the Data object through data._data):

# Import classes
from idtxl.bivariate_te import BivariateTE
from idtxl.data import Data
from idtxl.visualise_graph import plot_network
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

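# a) Generate test data
np.random.seed(42)  # seed before drawing any samples so the data are reproducible
cov = 0.4
lag = 1
n = 1000
source = np.random.randn(n+lag)
target = cov * source + (1-cov) * np.random.randn(n+lag)
# Shift the two columns against each other so that 'source' at sample t-lag
# drives 'target' at sample t
df = pd.DataFrame({
    'source': source[lag:],
    'target': target[:-lag]
})
print(df.shape)  # (n, 2): samples x processes
print(df.head())
data = Data(df, dim_order='sp')  # 's' = samples, 'p' = processes
print(data._data.shape)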

# b) Initialise analysis object and define settings
network_analysis = BivariateTE()
settings = {'cmi_estimator': 'JidtGaussianCMI',
            'max_lag_sources': 5,
            'min_lag_sources': 1}

# c) Run analysis
results = network_analysis.analyse_network(settings=settings, data=data)

# d) Plot inferred network to console and via matplotlib
results.print_edge_list(weights='max_te_lag', fdr=False)
plot_network(results=results, weights='max_te_lag', fdr=False)
plt.show()
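
For a data frame loaded from a file (as in the original question) rather than generated, it may help to reduce it to a plain numeric samples-by-processes array first. The following is only a sketch under assumptions: `frame_to_sp_array` is a hypothetical helper and not part of IDTxl, the synthetic frame stands in for the Excel file, and dropping rows with missing values is just one possible choice (it changes the spacing between samples, which may not be acceptable for your data). The IDTxl import is guarded so the pandas part runs on its own.

```python
import numpy as np
import pandas as pd


def frame_to_sp_array(df: pd.DataFrame) -> np.ndarray:
    """Return a float array of shape (samples, processes) from a data frame.

    Hypothetical helper (not part of IDTxl): keeps numeric columns only and
    drops every row that contains a missing value.
    """
    numeric = df.select_dtypes(include="number").dropna()
    return numeric.to_numpy(dtype=float)


# Stand-in for a frame loaded with pd.read_excel(..., index_col='date').
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=200, freq="D", name="date")
frame = pd.DataFrame(
    {"x": rng.standard_normal(200), "y": rng.standard_normal(200), "label": "a"},
    index=dates,
)
frame.iloc[3, 0] = np.nan  # simulate one missing observation in column 'x'

array_sp = frame_to_sp_array(frame)
print(array_sp.shape)  # (199, 2): one row dropped, text column ignored

try:
    from idtxl.data import Data
except ImportError:
    Data = None  # IDTxl is not installed; array_sp is still usable

if Data is not None:
    # Same call pattern as in the demo script above: samples x processes.
    data = Data(array_sp, dim_order="sp")
    print(data._data.shape)
```

The resulting `data` object can then be passed to `analyse_network` in the same way as in step c) of the script.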
AviralKumarTiwari commented 3 years ago

Many thanks dear Prof. for this help. I was able to work with the example that you had provided.

Kind regards, Aviral

Sent by email in reply to Patricia Wollstadt's message of Tue, Jul 27, 2021 at 1:55 PM.
