JuliaDynamics / ChaosTools.jl

Tools for the exploration of chaos and nonlinear dynamics
https://juliadynamics.github.io/DynamicalSystemsDocs.jl/chaostools/stable/
MIT License
187 stars 35 forks source link

Suspicious incorrect implementation of `lyapunov_from_data`. #241

Closed onkyo14taro closed 2 years ago

onkyo14taro commented 2 years ago

Discrepancy in the definitions

I think there might be a mistake in the implementation of lyapunov_from_data.

In [Kantz1994], which is referenced in the implementation, the maximum Lyapunov exponent is estimated as follows (Eq. (2.3)):

Eq. (2.3)

On the other hand, the current implementation of lyapunov_from_data is done with the following:

lyapunov_from_data

The difference between the two equations is the position of the logarithm.

Notes

I don't have the [Skokos2016] on which the implementation of lyapunov_from_data is based. However, if [Skokos2016] is a development of [Kantz1994], then I think the formula for estimating the Lyapunov exponent itself is the same in [Skokos2016] and [Kantz1994].

References

[Kants1994] H. Kantz, “A robust method to estimate the maximal Lyapunov exponent of a time series,” Phys. Lett. A, vol. 185, no. 1, pp. 77–87, 1994. [Skokos2016] Skokos, C. H. et al., Chaos Detection and Predictability - Chapter 1 (section 1.3.2), Lecture Notes in Physics 915, Springer (2016)

Datseris commented 2 years ago

Thanks for your concerns. I think the second version is more accurate. Ultimately the MLE is the mean exponential growth of perturbations over a single trajectory. In numerical data the assumption is made that the dataset covers the chaotic attractor and we can use points on the dataset to represent various trajectories. Now the question is, what do you average over? You can average the distance of the trajectories and then calculate its growth (first formula), or you can average the lyapunov exponent of the trajectories, i.e. average over the growths of distances, i.e., second formula. I think the latter is more correct. Actually, I don't just think this, this is an analytic result. This is exactly what the ergodic theorem says.

Datseris commented 2 years ago

If you are interested it should be super easy to make a second version of the current code that uses the first formula, so we can see how much error it creates!

onkyo14taro commented 2 years ago

@Datseris

Thank you very much for your valuable comments! I am a beginner in this field, and your words have helped me learn. As you said, I now understand that the second equation is mathematically more appropriate.

On the other hand, I was getting -Inf results when using lyapunov_from_data for my audio data. I investigated the cause and found that it was due to a few elements with a logarithmic distance of -Inf (i.e., a distance of 0).

It seems extremely unlikely that the two elements will match, but the possibility increases with coarsely quantized data (e.g., audio data recorded with 16-bit integer).

So I think it would be better to have an option in lyapunov_from_data to prevent -Inf. Specifically, I think it would be possible to solve this problem by allowing the user to set a minimum distance threshold (0 by default). How about adding a minimum threshold keyword argument?

Datseris commented 2 years ago

There should be no option from the DynamicalSystems.jl side to help with your problem. Instead, you should just add noise to your data data = data .+ 1e-15randn(length(data)). This is standard practice in the field of nonlinear dynamics: if the values of your data are so strongly discretized (like integers or very small bits of floats), you get 0 distances in embedded spaces, which is of course an artifact. So just add a small amount of noise and it is done.

If you are interested, please add a Pull Request to add this advice to the documentation string of lyapunov_from_data.

onkyo14taro commented 2 years ago

@Datseris Thank you for your comment! I learned a lot again. I will create a pull request for adding the documentation string.