Closed eromoe closed 9 months ago
Is there any simple method to evaluate the Hurst exponent, so that I can filter out the best package?
Hi @eromoe. :wave: Unfortunately, there is a lot of confusing terminology around the Hurst exponent. It also took me a while to remember and dig up all that information after reading your post, but here are the reasons why the two implementations you found are different from nolds.
This is actually a different algorithm. It computes the generalized Hurst exponent according to an algorithm by Di Matteo et al. (2003). This algorithm is implemented as `mfhurst_dm` in the current (unpublished) version of nolds (which you can install directly from the repository with `pip install git+https://github.com/CSchoel/nolds.git`).
However, for several reasons, I don't recommend using it; I recommend `mfhurst_b` instead. That version, by Barabási and Vicsek (1991), is the mathematical foundation of `mfhurst_dm`. One of the reasons is that Di Matteo et al. start with a linear detrending by default. For stock data this is sensible, but for other data it might not be desirable and is better left as a separate pre-processing step. In your example, this also explains why you get a value of 0.5 instead of the expected value of 1.0 for a random walk, which has a strong positive correlation between elements. Basically, the detrending turns the random walk back into a series of random data points.
This version indeed seems to be implementing Hurst's original rescaled range approach. However, you have to be careful with the parameters here. Looking into the code, I found the parameter `kind` for the "kind" of data that you have and the helper function `__get_simplified_RS()`. The default is set to `random_walk`, which again leads to the removal of a linear trend, effectively reverting the random walk back to just random data points with neither a positive nor a negative correlation. Hence the 0.5 as output.
Nolds doesn't make any assumptions on the kind of data that you have. It just implements the rescaled range algorithm as described by H. E. Hurst with the Anis-Lloyd-Peters correction factor.
I compared it to several reference implementations and found a good and in some cases perfect agreement between those implementations and my code:
Additionally, I reproduced an example published by Ian L. Kaplan for a series with a Hurst exponent of 0.72. You can find this in nolds in the unit test `test_hurst_pracma` in `nolds.test_measures`.
Thanks for your detailed explanation.
But doesn't `0.5` indicate that a series may be a random walk? `0.5 ~ 1` means there is a trend. That's why I was confused that `nolds.hurst_rs` returns a value near 1.
And for `compute_Hc`, it is the non-simplified RS that would remove the linear trend: https://github.com/Mottl/hurst/blob/6ca8f5fb5a8dacfbec4c2df9485d267f3cef25a6/hurst/__init__.py#L62
You're welcome. :smile:
For the Hurst exponent obtained with the rescaled range approach, 0.5 is the expected value for drawing independent samples from a random distribution. Hurst puts it like this in his Nature letter from 1957:
If, however, the quantities considered are entirely independent events, such, for example, as would arise from tossing a set of, say, twelve coins and recording the differences between the number of heads and number of tails at each throw, R is represented by the following equation:
R/sigma = 1.25 sqrt(N)
In that example, each set of twelve coin tosses is entirely independent of the other ones. In a random walk, however, you take the cumulative sum of random numbers, making the values in the random walk highly correlated. In general, the next value in the series will depend more on the previous value than on the random increment that is applied at that step. Hence, a Hurst exponent close to 1 is the expected result.
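The difference between independent samples and a random walk can be sketched with a plain rescaled range estimate. This is a minimal illustration only, not the nolds implementation: it omits the Anis-Lloyd-Peters correction, uses non-overlapping windows, and the names `rescaled_range` and `hurst_rs_sketch` as well as the window sizes are assumptions made up for this example.

```python
import math
import random


def rescaled_range(chunk):
    """Return R/S for one chunk, or None if the chunk is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative = lowest = highest = sq_sum = 0.0
    for value in chunk:
        deviation = value - mean
        cumulative += deviation          # running sum of mean-adjusted values
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
        sq_sum += deviation * deviation
    std = math.sqrt(sq_sum / n)
    return (highest - lowest) / std if std > 0 else None


def hurst_rs_sketch(series, window_sizes=(16, 32, 64, 128, 256, 512)):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for n in window_sizes:
        values = []
        # Split the series into non-overlapping windows of length n.
        for start in range(0, len(series) - n + 1, n):
            rs = rescaled_range(series[start:start + n])
            if rs is not None:
                values.append(rs)
        if values:
            xs.append(math.log(n))
            ys.append(math.log(sum(values) / len(values)))
    # Ordinary least-squares slope of ys on xs.
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # independent samples

# Random walk: cumulative sum of the same independent samples.
walk, total = [], 0.0
for step in noise:
    total += step
    walk.append(total)

h_noise = hurst_rs_sketch(noise)
h_walk = hurst_rs_sketch(walk)
print(f"independent samples: H ~ {h_noise:.2f}")
print(f"random walk:         H ~ {h_walk:.2f}")
```

Without the correction factor, the estimate for independent samples tends to land somewhat above 0.5 for short windows, while the estimate for the cumulative sum should be much closer to 1.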
Both `__get_RS` and `__get_simplified_RS` call the helper function `__to_inc`, which does the detrending if `kind` is `"random_walk"`. You might get the pure Hurst exponent without any assumptions about the data type if you set `kind` to `"change"`, but I'm not fully sure about that. There is a lot of preprocessing going on there.
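To illustrate why converting a random walk to increments changes the result, here is a small sketch. It assumes that the preprocessing amounts to taking first differences; it does not reproduce the actual `__to_inc` code, and the helper `lag1_autocorrelation` is a name made up for this example.

```python
import random


def lag1_autocorrelation(series):
    """Sample correlation between each value and its successor."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    den = sum((value - mean) ** 2 for value in series)
    return num / den


rng = random.Random(1)
steps = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # independent increments

# Random walk: cumulative sum of the independent steps.
walk, total = [], 0.0
for step in steps:
    total += step
    walk.append(total)

# First differences of the walk give back the steps (all but the first one).
increments = [b - a for a, b in zip(walk, walk[1:])]

print(f"lag-1 autocorrelation of walk:       {lag1_autocorrelation(walk):.3f}")
print(f"lag-1 autocorrelation of increments: {lag1_autocorrelation(increments):.3f}")
```

The walk is strongly correlated with its own previous value, while the differenced series is essentially uncorrelated, which is the situation where a rescaled range estimate near 0.5 is expected.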
Oh, I understand. I thought `diff` preprocessing was necessary for the calculation of the Hurst exponent; actually, it only implies that the training data is accumulated.
Thank you very much :)
Hi,
I checked three implementations, including nolds, and the results are quite different. My goal is to label the mean-reversion and trending ranges, but I can't make sense of the output of `nolds.hurst_rs` (all > 0.9).
testing code
I also tested with a rolling window of 200; the second one seems the most correct (it contains values below 0.5).