neuropsychology / NeuroKit

NeuroKit2: The Python Toolbox for Neurophysiological Signal Processing
https://neuropsychology.github.io/NeuroKit
MIT License
1.59k stars 423 forks source link

Largest Lyapunov Exponent (complexity_lyapunov method) - incorrect result for Lorenz system #900

Open contentsciences opened 1 year ago

contentsciences commented 1 year ago

Overview of Problem I have been trying to get NeuroKit2 to give a suitable result for the Largest Lyapunov Exponent via the complexity_lyapunov method.

Before using it on a set of data such as EEG data, I wanted to validate that it would return valid results for a system for which we can explicity calculate the LE's using other very well tried and tested methods. Therefore you will not be surprised that I used the Lorenz system with the usual sigma (=10.0), rho (=28.0) and beta (=8.0/3.0) values and generated a data set of 10000 points.

Using the 'x' values only from this data set, I used the complexity_delay to calculate the lag (using Fraser1986, Rosenstein1993 options etc), the complexity_dimension to calculate the dimension (using AFN, CD and FNN options) and then pass the optimal values into the complexity_lyapunov method.

We know from other studies and methods that there are 3 LE's for the Lorenz system and the largest of which is ~0.9036. This is the value I was expecting back from the complexity_lyapunov method, however, it does not! It returns a value of 0.02048061575362412. This is signigicantly a long way off 0.9036.

Under the covers it appears that the complexity_lyapunov method is doing a least squares fit of the data to compute the slope (which should give the LLE) but in this case alas it does not appear to be correct.

Furthermore, if I set 'show=True', I was expecting a plot to appear but it doesn't.

To Reproduce 1) Firstly generate the data and store in a file or a data frame directly.

import numpy as np
from scipy.integrate import odeint

def lorenz(length=10000, x0=None, sigma=10.0, beta=8.0/3.0, rho=28.0,
           step=0.001, sample=0.03, discard=1000):

    def _lorenz(x, t):
        return [sigma * (x[1] - x[0]), x[0] * (rho - x[2]) - x[1],
                x[0] * x[1] - beta * x[2]]    

    sample = int(sample / step)
    t = np.linspace(0, (sample * (length + discard)) * step,
                    sample * (length + discard))

    return (t[discard * sample::sample],
            odeint(_lorenz, x0, t)[discard * sample::sample])

sample = 0.01
x0 = [0, 1.05, 1.0]
states = lorenz(length=10000, sample=sample, x0=x0,
                sigma=10.0, beta=8.0/3.0, rho=28.0)[1]

with open('lorenz63-' + str(x0) + '-test-10-28-83.txt', 'w') as lorenz63File:
     lorenz63File.write(
         'x' + ',' + 'y' + ',' + 'z' + '\n')
     for j in range(len(states)):
         lorenz63File.write( str(states[j][0]) + ',' + str(states[j][1]) + ',' + str(states[j][2]) + '\n')`

This will give you a data set that looks like this:

x,y,z
-5.907526053054245,-3.6714585315034545,27.105629274764055
-5.695954132252239,-3.7010066670948722,26.60318210652471
-5.508566695290873,-3.755770945337071,26.109046522966693
-5.34535226304721,-3.8335392518223466,25.625026932111528

(Note: If you plot the states here you get a lovely reproduction of the classic butterfly)

2) Compute the delay (lag a.k.a. Tau), the dimension and then the Largest Lyapunov Exponent

import neurokit2 as nk
import pandas as pd

Pull data from CSV file.
df = pd.read_csv("../GenerateData/lorenz63-[0, 1.05, 1.0]-test-10-28-83.txt")

# Convert df column to array using df.Values
signal_ = df['x'].values

# using Rosenstein 1993 approach - Gives a delay of 20
# delay, parameters = nk.complexity_delay(signal_, delay_max=100, show=True, method="rosenstein1993")
#

# using Fraser 1986 approach (Mutual Information) - Gives a delay of 58
# delay, parameters = nk.complexity_delay(signal_, delay_max=100, show=True, method="fraser1986")

# using Theiler 1990 approach - Gives a delay of 31
delay, parameters = nk.complexity_delay(signal_, delay_max=100, show=True, method="theiler1990")

for opt_dim_method in ['cd', 'afn', 'fnn']:
      optimal_dimension, info = nk.complexity_dimension(signal_,
                                                    delay=delay,
                                                    dimension_max=20, # Default 10, 15 is good
                                                    method=opt_dim_method,
                                                    show=True)

# Using Theilers method gives a delay of 31. Just use a static value
delay_ = 31

# Using cd = 2, afn=4 and fnn=3. we'll chose fun as we now the Lorenz system is 3 dimensional
opt_dimension = 3

# Final goal is to compute the Largest Lyapunov Exponent for this time series. Can't use Eckmann method as reported broken.
lle, info = nk.complexity_lyapunov(signal_,
                                   delay=delay,
                                   dimension=opt_dimension,
                                   method="rosenstein1993",
                                   len_trajectory=20,
                                   show=True)

3) Expected outcome So I had expected this to give me value close to the known LLE of the Lorenz system of ~0.9036 (the full spectra is: 0.9036597330935988 -0.0011701995833755246 -14.569053990750257 calculated using the Bennetin algorithm).

The output from the call to nk.complexity_lyapunov() above yields a result of

lle: 0.02048061575362412 
info: {'Dimension': 3, 'Delay': 31, 'Separation': 151, 'Method': 'rosenstein1993', 'Trajectory_Length': 20}

I suspect that this is 'one' of the LLE's computed at a specific time point but perhaps it would be prudent to put the full spectra of points into another dictionary item. within info.

4) System Specifications Output of nk.versio()

Any help gratefully received.

Thanks, Anthony

welcome[bot] commented 1 year ago

Hi 👋 Thanks for reaching out and opening your first issue here! We'll try to come back to you as soon as possible. ❤️ kenobi

DominiqueMakowski commented 1 year ago

Thanks for getting in touch.

I can confirm that the plot works :

image

If it doesn't show up try adding plt.show() after the function call.

I suspect that this is 'one' of the LLE's computed at a specific time point but perhaps it would be prudent to put the full spectra of points into another dictionary item. within info.

That seems reasonable. Would you mind looking into the function and tell me what to change (or please consider making directly a PR).

In general, I'm far from being an expert on this, so I'd appreciate any ways of improving / fixing / correcting this function.

DominiqueMakowski commented 1 year ago

image

I have made a PR (#906 ) and added a new method, "makowski" that computes the slope only up to the changepoint. Before I merge, could you give it some testing ?

contentsciences commented 1 year ago

image

I have made a PR (#906 ) and added a new method, "makowski" that computes the slope only up to the changepoint. Before I merge, could you give it some testing ?

Hi Dominique,

Yes no problem. I'll test over the weekend. I've thought about the core issue I reported last week. If we change the function so that it returns a set of average divergences, we can then plot them, in a similar manner to your example above. By plotting them and drawing a line with polyfit or curvefit, the user can then see the slope more readily.

As an example, here is a plot I made using a different library and then by examining the plot, I then compute the slope and intercept and plot as a straight line by choosing only values of time where the average divergence is at the change point. (in the example plot, the line equation y = 0.89314 * x + -0.79691, where the slope 0.89314 is very close to a calculated LE using Benettin method of 0.90)

LLE-Lorenz-rho28

As I noted, I'll do some testing on this new method over the weekend.

Thanks,

Anthony

DominiqueMakowski commented 1 year ago

PS: you can install the PR using pip install https://github.com/neuropsychology/NeuroKit/zipball/pull/906/head

stale[bot] commented 8 months ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.