chenbekor closed this issue 8 years ago
The expected time-series is produced by one of the many time-series models available (src/main/java/com/yahoo/egads/models/tsmm/). Here is a sample test case for one of the models: egads/src/test/java/com/yahoo/egads/TestAnomalyDetect.java
I did review this file, but I'm not sure I understand the flow.
First, the test case loads an actual_metric time series (line 40).
Then there is a loop with an inner loop, in which another time series is loaded from disk:
"src/test/resources/modeloutput" + refWindows[w] + "_" + drops[d] + ".csv"
Then the test case trains 3 detection models using the actual vs. the expected series, but both series are loaded from disk, so I can't figure out how to learn from this.
At runtime I only have the actual time series. How do I produce the expected time series?
It seems like some documentation is missing. Thanks for helping!
We are populating the expected time-series using the predict() method. Specifically, in the file I referenced previously we have:
OlympicModel model = new OlympicModel(p);
model.train(actual_metric.get(0).data);
TimeSeries.DataSequence sequence = new TimeSeries.DataSequence(metrics.get(0).startTime(), metrics.get(0).lastTime(), 3600);
sequence.setLogicalIndices(metrics.get(0).startTime(), 3600);
model.predict(sequence);
The model.predict(sequence) call is the one that uses a forecasting model to populate the expected time-series sequence.
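To make the train-then-predict idea concrete outside of EGADS, here is a minimal, self-contained sketch. It is not EGADS code: the class ExpectedSeriesSketch, its methods, and the simple seasonal-average forecaster are all hypothetical stand-ins chosen for illustration. It only shows how an expected series can be derived from the actual series alone, by learning from the actual values and then filling in a prediction for each timestamp.

```java
import java.util.Arrays;

// Illustrative stand-in, NOT EGADS code: class and method names are hypothetical.
class ExpectedSeriesSketch {
    private final int period;      // samples per season, e.g. 24 for hourly data with a daily cycle
    private double[] seasonalMean; // learned average for each position within the season

    ExpectedSeriesSketch(int period) {
        this.period = period;
    }

    // "Train": average the actual values observed at each position of the season.
    void train(double[] actual) {
        double[] sum = new double[period];
        int[] count = new int[period];
        for (int i = 0; i < actual.length; i++) {
            sum[i % period] += actual[i];
            count[i % period]++;
        }
        seasonalMean = new double[period];
        for (int k = 0; k < period; k++) {
            seasonalMean[k] = count[k] == 0 ? 0.0 : sum[k] / count[k];
        }
    }

    // "Predict": fill an expected series of the requested length, where
    // startIndex is the logical index of the first point on the training timeline.
    double[] predict(int startIndex, int length) {
        double[] expected = new double[length];
        for (int i = 0; i < length; i++) {
            expected[i] = seasonalMean[(startIndex + i) % period];
        }
        return expected;
    }

    public static void main(String[] args) {
        // Three "seasons" of three samples each.
        double[] actual = {10, 20, 30, 12, 22, 32, 11, 21, 31};
        ExpectedSeriesSketch model = new ExpectedSeriesSketch(3);
        model.train(actual);
        // Expected series covering the same span as the actual series.
        System.out.println(Arrays.toString(model.predict(0, actual.length)));
    }
}
```

The point of the sketch is the shape of the flow: only the actual series is needed as input, and the expected series comes out of the model's predict step rather than from a file.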
@nlaptev Suppose I want to pass in 30 days of data as my training time series, and then use the model to detect anomalous points in 1 day of data as my test time series. What should I do?
Should we pass in testDs like so:
ArrayList<Anomaly> anomalyList = ad.detect(ad.metric, forecastDs);
Which would mean we would need to change the forecast call from:
ArrayList<TimeSeries.DataSequence> list = ma.forecast( ma.metric.startTime(), ma.metric.lastTime());
to:
ArrayList<TimeSeries.DataSequence> list = ma.forecast(forecastDs.startTime(), forecastDs.lastTime());
What other changes need to be done? Or is there an easier way of doing what I am trying to achieve?
Thanks for helping.
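For reference, the general pattern being asked about (train on a 30-day window, forecast the following day, then compare that forecast with the observed day) can be sketched independently of EGADS. Everything below is a hypothetical illustration, not the EGADS API: the class TrainThenDetectSketch, the hourly-average forecaster, and the relative-error threshold are assumptions. It also assumes the test day starts right after the training window, so hour 0 of the test day lines up with hour 0 of each training day.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in, NOT EGADS code: all names here are hypothetical.
class TrainThenDetectSketch {

    // Learn one expected value per sample-of-day by averaging over the training days.
    static double[] hourlyProfile(double[] training, int samplesPerDay) {
        double[] sum = new double[samplesPerDay];
        int[] count = new int[samplesPerDay];
        for (int i = 0; i < training.length; i++) {
            sum[i % samplesPerDay] += training[i];
            count[i % samplesPerDay]++;
        }
        double[] profile = new double[samplesPerDay];
        for (int h = 0; h < samplesPerDay; h++) {
            profile[h] = count[h] == 0 ? 0.0 : sum[h] / count[h];
        }
        return profile;
    }

    // Flag indices of the test day whose relative deviation from the
    // expected value exceeds the given threshold.
    static List<Integer> detect(double[] testDay, double[] expected, double threshold) {
        List<Integer> anomalies = new ArrayList<>();
        for (int i = 0; i < testDay.length; i++) {
            double denom = Math.max(Math.abs(expected[i]), 1e-9); // avoid division by zero
            if (Math.abs(testDay[i] - expected[i]) / denom > threshold) {
                anomalies.add(i);
            }
        }
        return anomalies;
    }

    public static void main(String[] args) {
        int samplesPerDay = 24;
        // 30 days of synthetic hourly training data with a daily pattern.
        double[] training = new double[30 * samplesPerDay];
        for (int i = 0; i < training.length; i++) {
            training[i] = 100 + (i % samplesPerDay);
        }
        // One test day following the same pattern, with a spike injected at hour 5.
        double[] testDay = new double[samplesPerDay];
        for (int h = 0; h < samplesPerDay; h++) {
            testDay[h] = 100 + h;
        }
        testDay[5] = 300;

        double[] expected = hourlyProfile(training, samplesPerDay); // forecast for the test day
        System.out.println(detect(testDay, expected, 0.5));
    }
}
```

The design choice worth noting is that the forecast is requested for the test window's time range rather than the training window's, which is the same change the forecastDs.startTime() / forecastDs.lastTime() proposal is aiming at.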
It is not clear from the code how to use the engine, specifically how to generate the expected series.
In the AnomalyDetectionModel there is the following method:
I'm not sure how the expectedSeries is produced.
In the unit tests it is loaded from a file, "src/test/resources/modeloutput" + refWindows[w] + "_" + drops[d] + ".csv", which I also don't understand.
Any help is appreciated.