Closed phaabe closed 3 years ago
Hi @phaabe
This is because the data set you are using is very small.
However, if you use data sets that are a bit larger, you should see the results you were expecting. Transformations and computations of this nature may not behave as you expect on very small samples.
You are using 14 values; try with 16 and you will see the result you were expecting.
import pandas
from adtk.detector import LevelShiftAD
from adtk.visualization import plot

# Original 14-value series and the same pattern extended to 16 values
d1 = [1, 10000, 1, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 1]
d1Longer = [1, 10000, 1, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 1]
s = pandas.Series(d1Longer, index=pandas.date_range("2021-01-01", periods=len(d1Longer)))
level_shift_ad = LevelShiftAD(c=6.0, side='both', window=2)
anomalies = level_shift_ad.fit_detect(s)
plot(s, anomaly=anomalies, anomaly_color='red')
Internally, the LevelShiftAD algorithm runs the data through a number of transforms.
The DoubleRollingAggregate is the first, and it will calculate very different values for d1 and d2, even in the first RollingAggregate, never mind the second.
# First rolling median for d1
s = pandas.Series(d1, index=pandas.date_range("2021-01-01", periods=len(d1)))
s.rolling(2).median()
# Same computation for d2 (the second series from the question)
s = pandas.Series(d2, index=pandas.date_range("2021-01-01", periods=len(d2)))
s.rolling(2).median()
As you can see after running the snippets above, d1 and d2 are markedly different on the first rolling aggregate. I will not go so far as to break down each computation and its output, as internally Pipenet is quite complex.
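That said, for a rough picture of why two extra values can matter, here is a minimal standard-library sketch. It is not ADTK's code. It assumes that the detector compares the medians of the two windows on either side of each point, and then flags differences above Q3 + c * IQR of those differences. The window alignment and the linearly interpolated quantile are assumptions as well, and the helper names are made up for illustration.

```python
from statistics import median

def double_rolling_median_diff(values, window=2):
    """|median of the `window` points starting at t - median of the `window` points before t|."""
    diffs = []
    for t in range(window, len(values) - window + 1):
        left = median(values[t - window:t])
        right = median(values[t:t + window])
        diffs.append(abs(right - left))
    return diffs

def quantile(ordered, q):
    """Linearly interpolated quantile of an already sorted list (assumed to mirror the pandas default)."""
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def iqr_upper_threshold(diffs, c=6.0):
    """Upper bound Q3 + c * IQR; anything above it is treated as anomalous in this sketch."""
    ordered = sorted(diffs)
    q1, q3 = quantile(ordered, 0.25), quantile(ordered, 0.75)
    return q3 + c * (q3 - q1)

def count_flagged(values, window=2, c=6.0):
    """Number of points whose window difference exceeds the IQR-based bound."""
    diffs = double_rolling_median_diff(values, window)
    threshold = iqr_upper_threshold(diffs, c)
    return sum(d > threshold for d in diffs)

# Same series as in the thread, written compactly
d1 = [1, 10000, 1] + [10000] * 10 + [1]         # 14 values
d1_longer = [1, 10000, 1] + [10000] * 12 + [1]  # 16 values

for name, data in (("d1", d1), ("d1_longer", d1_longer)):
    diffs = double_rolling_median_diff(data)
    print(name, "threshold:", iqr_upper_threshold(diffs), "flagged:", count_flagged(data))
```

Under these assumptions, both series produce three non-zero differences of 4999.5. With 14 values that is more than a quarter of the 11 differences, so Q3 is pulled up to 2499.75 and the bound (17498.25) sits above every difference, so nothing is flagged. With 16 values the three non-zero differences are less than a quarter of the 13, so Q3 and the IQR both drop to 0 and every non-zero difference exceeds the bound.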
Great, thanks
I have created a simple example.
With d2 two anomalies are detected. With d1 no anomalies are detected.
Why? Or maybe the question should be: what should I look at to understand? :)
Thanks