sintel-dev / Orion

A machine learning library for detecting anomalies in signals.
https://sintel.dev/Orion/
MIT License
1.04k stars 160 forks

How to remove "mlstars.custom.timeseries_preprocessing.time_segments_aggregate" from pipeline for custom data #558

Closed by aloner-pro 1 month ago

aloner-pro commented 3 months ago

Description

Can I remove time_segments_aggregate from the pipeline, since my data is already equi-spaced with a weekly frequency? Here is a sample of the data:

timestamp,signal,anomaly
1357171200,406.5,0
1357776000,417.0,0
1358380800,421.0,0
1358985600,429.0,0

I think this step is why the anomalies detected in my data are wrong.

What I Did

Here are the hyperparameters I set:

hyperparameters = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "time_column": "timestamp",
        "interval": 604800
    },
    "orion.primitives.aer.AER#1": {
        "epochs": 5,
        "verbose": True
    }
}

The detected anomaly interval is nowhere close to what it should be. Please guide me through this.

sarahmish commented 3 months ago

Hi @aloner-pro, thank you for using Orion!

If your data is already equi-spaced, then the time_segments_aggregate primitive will not change it. In other words, you were correct to set the interval parameter to the actual spacing of your time series (604800 seconds, i.e. one week).
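One way to confirm that the data really is equi-spaced at the configured interval is to look at the gaps between consecutive timestamps. The following is a minimal sketch using only the Python standard library and the four sample rows from the question; the helper name spacing_gaps is hypothetical and is not part of Orion or mlstars.

```python
import csv
import io

# Sample rows copied from the question above.
SAMPLE = """timestamp,signal,anomaly
1357171200,406.5,0
1357776000,417.0,0
1358380800,421.0,0
1358985600,429.0,0
"""


def spacing_gaps(csv_text, time_column="timestamp"):
    """Return the set of distinct gaps (in seconds) between consecutive timestamps."""
    rows = csv.DictReader(io.StringIO(csv_text))
    stamps = [int(row[time_column]) for row in rows]
    return {later - earlier for earlier, later in zip(stamps, stamps[1:])}


interval = 604800  # one week in seconds, the value used in the hyperparameters
gaps = spacing_gaps(SAMPLE)

# A single gap equal to the interval means every row is exactly one week apart.
print(gaps == {interval})
```

If the set contains more than one value, some rows are missing or irregularly spaced, and the aggregation step would no longer be a no-op.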

As for what you expect the anomaly interval to be, can you elaborate more on that? Was the ground truth anomaly enclosed within the detected anomalies?
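To answer the enclosure question concretely, a labelled anomaly timestamp can be compared against the detected intervals. This is only a sketch: it assumes the detections are available as (start, end) pairs of epoch seconds, and the function name enclosing_interval and the interval values below are hypothetical illustrations, not results from this issue.

```python
def enclosing_interval(truth_timestamp, detected):
    """Return the first (start, end) interval containing truth_timestamp, or None."""
    for start, end in detected:
        if start <= truth_timestamp <= end:
            return (start, end)
    return None


# Hypothetical detected interval, for illustration only.
detected = [(1358380800, 1359590400)]

inside = enclosing_interval(1358985600, detected)   # falls within the interval
outside = enclosing_interval(1357171200, detected)  # precedes the interval

print(inside is not None, outside is not None)
```

A ground-truth point that returns None was missed entirely, while a point enclosed by a much wider interval suggests the padding is the thing to tune.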

There are certain parameters that need to be adjusted based on what you expect. For example, if you find that the detected anomalous range is too big, you can reduce the padding size. Here is an example:

hyperparameters = {
    "orion.primitives.timeseries_anomalies.find_anomalies#1": {
        "anomaly_padding": 1,
    }
}

More on the postprocessing primitives can be found in our documentation.
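For reference, the padding override can sit in the same dictionary as the settings from the original question. The sketch below merges the two snippets from this thread and passes them to Orion. The pipeline name "aer" is an assumption (the thread only shows the AER primitive), the wrapper function detect is hypothetical, and data is assumed to be a DataFrame in the format Orion expects.

```python
hyperparameters = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "time_column": "timestamp",
        "interval": 604800,  # one week in seconds, matching the data spacing
    },
    "orion.primitives.aer.AER#1": {
        "epochs": 5,
        "verbose": True,
    },
    "orion.primitives.timeseries_anomalies.find_anomalies#1": {
        "anomaly_padding": 1,  # smaller padding narrows the reported range
    },
}


def detect(data, pipeline="aer"):
    """Fit the pipeline on `data` and return the detected anomalies.

    The pipeline name is an assumption; adjust it to the pipeline in use.
    """
    # Imported lazily so the dictionary can be inspected without Orion installed.
    from orion import Orion

    orion = Orion(pipeline=pipeline, hyperparameters=hyperparameters)
    orion.fit(data)
    return orion.detect(data)
```

Keeping all overrides in one dictionary makes it easier to see which primitives deviate from the pipeline defaults when comparing runs.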