LRydin / MFDFA

Multifractal Detrended Fluctuation Analysis in Python
MIT License

Negative H values when running mfdfa #22

Closed FuadYoussef closed 3 years ago

FuadYoussef commented 3 years ago

Hi, Thank you for all of the hard work you've done to make this library. I've tried following the instructions in the documentation as well as the instructions in this post https://github.com/LRydin/MFDFA/issues/20 in order to analyze some of my own data and have encountered an issue. For reference, the code I am using is in this repository in the main.py and file_reader.py files. https://github.com/FuadYoussef/ChineseSentenceLengths

When I run the MFDFA and attempt to plot the data, I am able to generate the following graph: [image: Figure_1]

However, the H values I get by subtracting 1 from each of the slopes are these: [0.24513458672164035, 0.2186072164297126, 0.1926072404229875, 0.17172103657125448, -0.34609042498381115, -0.3619098348631248, -0.375020070287282, -0.38637649591195067]

I obtain these H values using this code:

    slopes = np.polyfit(np.log(lag), np.log(dfa), 1)[0]
    slopes = slopes.tolist()
    hExponents = []
    for slope in slopes:
        hExponents.append(slope - 1)

My confusion is about why I am getting negative H values from this. Any help or suggestions you could give would be greatly appreciated. Thank you.
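For readers who want to reproduce the slope extraction above without the original data, here is a minimal sketch. The `lag` and `dfa` arrays are synthetic stand-ins (pure power laws with made-up exponents 1.2 and 0.6), not the sentence-length data, and `true_slopes` is a hypothetical name used only for this illustration. It relies on `np.polyfit` accepting a 2-D `y` and fitting each column separately, so row `[0]` of the result holds one slope per column.

```python
import numpy as np

# Synthetic stand-ins for the `lag` and `dfa` arrays used in the thread.
# These are NOT the sentence-length data; the exponents are made up.
lag = np.unique(np.logspace(0.5, 3, 30).astype(int))
true_slopes = np.array([1.2, 0.6])

# One column per q value, each an exact power law F(s) = s**slope.
dfa = lag[:, None] ** true_slopes[None, :]

# With a 2-D y, np.polyfit fits every column; row 0 holds the slopes.
slopes = np.polyfit(np.log(lag), np.log(dfa), 1)[0]

h_raw = slopes.tolist()                # slopes used directly
h_minus_one = [s - 1 for s in slopes]  # slopes with 1 subtracted

print("raw slopes:   ", h_raw)
print("slopes minus 1:", h_minus_one)
```

In this toy case, subtracting 1 from a slope that is already below 1 is what produces a negative value.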

LRydin commented 3 years ago

Hey @FuadYoussef, thank you for the question. I'll try my best to give you an answer.

So, I think your code is absolutely correct, and the figure you show looks like a typical MFDFA plot. The issue is, of course, the interpretation of the Hurst coefficient, which is not too easy. You are using sentence lengths, which I am not familiar with; nevertheless, the point here has to do with having to subtract 1 to get the Hurst coefficient H. Doing so is correct for stochastic processes, but not for all processes.

Let me illustrate this using Espen Ihlen's fantastic paper on MFDFA: Introduction to multifractal detrended fluctuation analysis in Matlab, Ihlen, E.A.F., Front. Physiol. 3:141. This is Espen Ihlen's Fig. 8:

[image: Screenshot from 2021-06-11 13-59-22]

There you can see the fluctuation function for (B) a monofractal time series and (C) white noise. (B) will have H>1 (this is why one subtracts 1, to get the mathematically correct H value), but (C) has 0<H<1, and it is a perfectly reasonable time series. I think your results fall into the category of "noise", so you shouldn't subtract 1 from your results.
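The distinction between a noise-like series and its integrated, random-walk-like counterpart can be checked numerically. The sketch below is a plain q = 2 DFA with linear detrending written directly in NumPy. It is not the MFDFA library's implementation, and `dfa_slope`, the scales, and the seed are all assumptions made for illustration. In theory the fitted slope should come out close to 0.5 for white noise and close to 1.5 for its cumulative sum.

```python
import numpy as np

def dfa_slope(x, scales, order=1):
    """Slope of log F(s) versus log s for plain (q = 2) DFA.

    Hypothetical helper for illustration; not part of the MFDFA package.
    """
    # Profile: cumulative sum of the mean-removed series.
    profile = np.cumsum(x - np.mean(x))
    fluct = []
    for s in scales:
        n_seg = len(profile) // s
        # Non-overlapping segments of length s, one per column.
        segments = profile[: n_seg * s].reshape(n_seg, s).T
        t = np.arange(s)
        # Fit a polynomial trend to all segments at once.
        coeffs = np.polyfit(t, segments, order)
        trend = np.vander(t, order + 1) @ coeffs
        # Root-mean-square fluctuation around the local trends.
        fluct.append(np.sqrt(np.mean((segments - trend) ** 2)))
    return np.polyfit(np.log(scales), np.log(fluct), 1)[0]

rng = np.random.default_rng(0)
noise = rng.standard_normal(2 ** 14)  # noise-like series
walk = np.cumsum(noise)               # random-walk-like series

scales = [16, 32, 64, 128, 256, 512]
slope_noise = dfa_slope(noise, scales)
slope_walk = dfa_slope(walk, scales)

print("white noise slope:", slope_noise)
print("random walk slope:", slope_walk)
```

If the slopes from your own data look like the first case, use them as they are; the subtraction of 1 only makes sense for the second case.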

Two notes:

I hope this helps a little bit. It seems to me that your data is slightly multifractal, which seems reasonable from my understanding of sentence lengths.

Feel free to ask more things, if you have any doubts!

FuadYoussef commented 3 years ago

Thank you so much for this. Your description of the issue really helps here and I'm getting proper results now. I really appreciate it.

LRydin commented 3 years ago

:) Glad to help