Open hit9 opened 10 years ago
Pull requests accepted...
Seasonal algorithms are hard to automatically fit. Working on it, though...
On May 8, 2014, at 12:25 AM, 王超 notifications@github.com wrote:
For example, take a series with a period of 1 day: the value at 12:00 is a peak (e.g. 1000) and the value at 0:00 is 10. So 1000 at 12:00 should be normal, and 10 at 12:00 should be anomalous.
But skyline thinks 10 is normal.
One approach is to use a Fast Fourier Transform to detect the series's period, fetch the datapoints at the same phase, and then analyze that new dataset.
I am looking into it now.
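To make the same-phase idea concrete, here is a minimal sketch, assuming the period is already known (24 hourly samples per day) and using a plain 3-sigma check on the same-phase history. The names `same_phase_points` and `is_anomalous` and the sample data are hypothetical, not part of skyline.

```python
import numpy as np

def same_phase_points(series, period, index):
    """Return the earlier datapoints that share the phase of series[index]."""
    series = np.asarray(series, dtype=float)
    # Indices index % period, index % period + period, ... up to (excluding) index.
    return series[index % period:index:period]

def is_anomalous(series, period, index, n_sigma=3.0):
    """3-sigma check of series[index] against its same-phase history."""
    series = np.asarray(series, dtype=float)
    history = same_phase_points(series, period, index)
    if history.size < 2:
        return False  # not enough history to judge
    mean, std = history.mean(), history.std()
    if std == 0:
        return bool(series[index] != mean)
    return bool(abs(series[index] - mean) > n_sigma * std)

# Hypothetical data: 7 days of hourly values, peak of ~1000 at 12:00, ~10 otherwise.
history = [(1000 if h == 12 else 10) + (d % 3) for d in range(7) for h in range(24)]
today = [10] * 12 + [10]  # day 8, 00:00 to 12:00, with the noon peak missing
series = history + today
noon = len(series) - 1

print(is_anomalous(series, 24, noon))
```

Because the value at 12:00 is compared only with earlier 12:00 values, a 10 at noon stands out even though 10 is common in the series as a whole.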
Yep! That's what I was leaning towards - use FFT to get periodicity, and maybe use that to populate an ARIMA or use a KS test along windowed intervals? cc @toufic
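For the "KS test along windowed intervals" option, a rough sketch could compare the latest full period against the one before it using the two-sample Kolmogorov-Smirnov statistic. This assumes the period is already known; `ks_statistic` and `window_drift` are hypothetical names, and only the statistic is computed here (a p-value would need something like `scipy.stats.ks_2samp`).

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def window_drift(series, period):
    """KS statistic between the latest full period and the previous one."""
    series = np.asarray(series, dtype=float)
    if series.size < 2 * period:
        return None  # need two full periods to compare
    latest = series[-period:]
    previous = series[-2 * period:-period]
    return ks_statistic(latest, previous)

day = list(range(24))
print(window_drift(day + day, 24))                       # identical windows
print(window_drift(day + [v + 100 for v in day], 24))    # fully shifted window
```

The statistic ranges from 0 (identical distributions) to 1 (no overlap); the cutoff for calling a window anomalous is left open here.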
I'm not so sure about the last question, but for detecting periodicity I found some information at: http://stackoverflow.com/questions/15261122/determine-frequency-from-signal-data-in-matlab
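And this function may help:

    import numpy as np

    def guess_period(x):
        x = np.array(x)
        n = np.size(x)
        m = np.mean(x)
        # Magnitude spectrum of the mean-centered series.
        p = np.abs(np.fft.fft(x - m))
        # Index of the strongest frequency bin.
        i = np.argmax(p)
        if i:
            return n / float(i)
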
This may give a series's period, but it fails in some cases:

    >>> x = [1, 20, 2, 20, 1, 21, 2, 22, 1, 19]
    >>> guess_period(x)
    2.0
    >>> import itertools
    >>> source = itertools.cycle([1, 10, 20, 10, 1])
    >>> x = [next(source) for _ in range(101)]
    >>> guess_period(x)
    5.05
    >>> x = [next(source) for _ in range(103)]
    >>> guess_period(x)
    4.904761904761905
    >>> x = [next(source) for _ in range(105)]
    >>> guess_period(x)
    1.25 # fails

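The 1.25 result is most likely a mirror-image effect: the full FFT of a real-valued signal is symmetric, so bin 84 (= 105 - 21) carries the same magnitude as bin 21, and `argmax` can land on the mirrored bin, giving 105 / 84 = 1.25. A possible fix, sketched below under that assumption, is to search only the one-sided spectrum from `np.fft.rfft`. The name `guess_period_rfft` is hypothetical.

```python
import itertools
import numpy as np

def guess_period_rfft(x):
    """Guess the period from the one-sided spectrum; None if no peak is found."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # rfft keeps only bins 0..n//2, so the mirrored peak cannot be picked.
    power = np.abs(np.fft.rfft(x - x.mean()))
    power[0] = 0.0  # ignore any residual DC component
    i = int(np.argmax(power))
    if i == 0:
        return None
    return n / float(i)

source = itertools.cycle([1, 10, 20, 10, 1])
x = [next(source) for _ in range(105)]
print(guess_period_rfft(x))
```

When the window does not hold a whole number of periods (101 or 103 samples), the guess is still only approximate, since the peak falls between frequency bins.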
I think we can maintain a dict (`{period: hit_times}`); the period with the most hits wins.
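One way the `{period: hit_times}` idea could look, as a sketch only: guess the period on several window lengths, round each guess to the nearest integer, and let the most frequent value win. The rounding step, the window sizes, and the names `guess_period` (here a one-sided-spectrum variant) and `vote_period` are assumptions for illustration.

```python
from collections import Counter
import itertools
import numpy as np

def guess_period(x):
    """FFT-based period guess on the one-sided spectrum; None if no peak."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean()))
    i = int(np.argmax(power))
    return x.size / float(i) if i else None

def vote_period(x, window_sizes):
    """Guess the period on the tail of x for each window size; most hits wins."""
    hits = Counter()  # {period: hit_times}
    for size in window_sizes:
        if size > len(x):
            continue  # window longer than the available data
        period = guess_period(x[-size:])
        if period is not None:
            hits[int(round(period))] += 1
    return hits.most_common(1)[0][0] if hits else None

source = itertools.cycle([1, 10, 20, 10, 1])
x = [next(source) for _ in range(120)]
print(vote_period(x, window_sizes=range(90, 121)))
```

Voting smooths over guesses such as 5.05 or 4.90 that come from windows holding a non-integer number of periods.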
Awesome. You can use Crucible (github.com/astanway/crucible) to refine the algorithm.
Any progress on this?
Hi @astanway, I have created another monitor similar to skyline, https://github.com/eleme/node-bell. It's only for periodic metrics, and the only algorithm it uses is 3-sigma. Thanks for this project, it gave me a lot of ideas!