DynamicsAndNeuralSystems / pyspi

Comparative analysis of pairwise interactions in multivariate time series.
https://time-series-features.gitbook.io/pyspi/
GNU General Public License v3.0

Causing Error with Zero Values #68

Closed MarziyehPourmousavi closed 3 months ago

MarziyehPourmousavi commented 3 months ago

I've encountered an error about NaNs while using the library. It seems that when certain values in the input are exactly zero, the code fails to execute properly, possibly due to a division by zero.

jmoo2880 commented 3 months ago

Hi @MarziyehPourmousavi, I have not been able to reproduce the issue you mentioned. Could you please provide more details about the problem you're facing? Specifically, I'm interested in knowing whether pyspi is failing to run entirely or if certain SPIs are not computing correctly. If pyspi is not executing at all, it would be helpful to know any error messages you're receiving. If specific SPIs are failing to compute, please let me know which ones and any relevant error messages or unexpected behavior you're observing.

MarziyehPourmousavi commented 3 months ago

Hi @jmoo2880, Thank you for your prompt response. Below, I've provided an example code snippet along with the error message that demonstrates the issue I encountered:

import numpy as np
from pyspi.calculator import Calculator

data = np.array([[0,0,0],[1,0,1]])
calc = Calculator(dataset=data, subset='fast')
calc.compute()

And here's the error message:

ValueError                                Traceback (most recent call last)
Input In [1], in <cell line: 5>()
      2 from pyspi.calculator import Calculator
      4 data = np.array([[0,0,0],[1,0,1]])
----> 5 calc = Calculator(dataset=data, subset='fast')
      6 calc.compute()

File ~/opt/anaconda3/lib/python3.8/site-packages/pyspi/calculator.py:121, in Calculator.__init__(self, dataset, name, labels, subset, configfile)
    118     print(f"="*100 + "\n")
    120 if dataset is not None:
--> 121     self.load_dataset(dataset)

File ~/opt/anaconda3/lib/python3.8/site-packages/pyspi/calculator.py:255, in Calculator.load_dataset(self, dataset)
    248 """Load new dataset into existing instance.
    249
    250 Args:
    251     dataset (:class:~pyspi.data.Data, array_list):
    252         New dataset to attach to calculator.
    253 """
    254 if not isinstance(dataset, Data):
--> 255     self._dataset = Data(Data.convert_to_numpy(dataset))
    256 else:
    257     self._dataset = dataset

File ~/opt/anaconda3/lib/python3.8/site-packages/pyspi/data.py:69, in Data.__init__(self, data, dim_order, normalise, name, procnames, n_processes, n_observations)
     67 if data is not None:
     68     dat = self.convert_to_numpy(data)
---> 69     self.set_data(
     70         dat,
     71         dim_order=dim_order,
     72         name=name,
     73         n_processes=n_processes,
     74         n_observations=n_observations,
     75     )
     77 if procnames is not None:
     78     assert len(procnames) == self.n_processes

File ~/opt/anaconda3/lib/python3.8/site-packages/pyspi/data.py:188, in Data.set_data(self, data, dim_order, name, n_processes, n_observations, verbose)
    186 nans = np.isnan(data)
    187 if nans.any():
--> 188     raise ValueError(
    189         f"Dataset {name} contains non-numerics (NaNs) in processes: {np.unique(np.where(nans)[0])}."
    190     )
    192 self._data = data
    193 self.data_type = type(data[0, 00, 0])

ValueError: Dataset None contains non-numerics (NaNs) in processes: [0].

jmoo2880 commented 3 months ago

By default, pyspi z-scores each time series along the time axis before computing SPIs. If one of your processes is constant, as in your example above where process 0 is all zeros, its standard deviation is zero, so the z-score divides by zero and produces NaNs.
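
For anyone who lands here with the same error, the following is a minimal NumPy-only sketch of the effect described above. It is not pyspi's own code: `zscore_rows` and `constant_processes` are hypothetical helper names, and the sketch assumes rows are processes and columns are observations, as in the example dataset. It shows that z-scoring a constant row yields NaNs, and one possible way to spot such rows before building a `Calculator`.

```python
import numpy as np


def zscore_rows(data):
    """Z-score each row (process) along the time axis.

    Hypothetical helper for illustration; not part of pyspi.
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, keepdims=True)
    # A constant row has std == 0, so (x - mean) / std becomes 0 / 0 = NaN.
    # errstate only silences the NumPy warning; the NaNs are still produced.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (data - mean) / std


def constant_processes(data):
    """Return the indices of rows whose values never change (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    return np.flatnonzero(data.max(axis=1) == data.min(axis=1))


# Same dataset as in the report: process 0 is all zeros.
data = np.array([[0, 0, 0], [1, 0, 1]])

z = zscore_rows(data)
nan_rows = np.unique(np.where(np.isnan(z))[0])  # rows that contain NaNs

bad = constant_processes(data)          # rows to inspect before calling pyspi
clean = np.delete(data, bad, axis=0)    # one option: drop the constant rows
```

Here `nan_rows` and `bad` both point at process 0, which matches the process index reported in the `ValueError`. Whether dropping constant processes is appropriate depends on your analysis; the check only makes the cause visible before the data reaches the library.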

MarziyehPourmousavi commented 3 months ago

That makes sense, thank you for the explanation!