sintel-dev / Orion

A machine learning library for detecting anomalies in signals.
https://sintel.dev/Orion/
MIT License

[TadGAN benchmark] F1-Score is very low. #486

Open gunnha opened 7 months ago

gunnha commented 7 months ago

What I Did

# Assumed imports for the snippets below
import pandas as pd
from orion import Orion
from orion.data import load_anomalies, load_signal
from orion.evaluation import contextual_f1_score

# train_55 holds the 55 MSL channel names (e.g. 'P-11', 'D-15', 'M-7')
# Collect the known anomalies of every channel into one DataFrame
known_anomalies = pd.DataFrame()
for signal in train_55:
    df = load_anomalies(signal)
    known_anomalies = pd.concat([known_anomalies, df], axis=0)

# Merge all channels into single train / test DataFrames
X_train_msl = pd.DataFrame()
X_test_msl = pd.DataFrame()
for signal in train_55:

    train_signal_path = f'multivariate/{signal}-train'
    test_signal_path = f'multivariate/{signal}-test'
    #train_signal_path = f'{signal}-train'
    #test_signal_path = f'{signal}-test'

    train_df = load_signal(train_signal_path)
    test_df = load_signal(test_signal_path)

    X_train_msl = pd.concat([X_train_msl, train_df], axis=0)
    X_test_msl = pd.concat([X_test_msl, test_df], axis=0)

hyperparameters = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "time_column": "timestamp",
        "interval": 21600,        # aggregate into 6-hour (21600 s) segments
        "method": "mean"
    },
    "orion.primitives.tadgan.TadGAN#1": {
        "epochs": 70
    },
    "orion.primitives.tadgan.score_anomalies#1": {
        "rec_error_type": "dtw",  # DTW-based reconstruction error
        "comb": "mult"            # multiply reconstruction error and critic score
    }
}

orion = Orion(
    pipeline='tadgan',
    hyperparameters=hyperparameters
)

orion.fit(X_train_msl)
anomalies = orion.detect(X_test_msl)

contextual_f1_score(known_anomalies, anomalies, X_test_msl)

Question

sarahmish commented 7 months ago

Hi @gunnha – did you find the answers you're looking for?

gunnha commented 7 months ago

> Hi @gunnha – did you find the answers you're looking for?

I thought I had found the answer at first, so I closed the issue, but I'm still looking.

I found that the variation changes can be handled with the code above. The only remaining concern is the F1-score.

There are 55 channels of MSL data (e.g. P-11, D-15, M-7). I merged all of these channels into one DataFrame. With the code above, the F1-score is approximately 0.1.

Second, I tried computing the F1-score by fitting on individual channels; some of them came out as NaN, while others matched the results in the paper.
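
Roughly, the per-channel loop I tried looks like this (a minimal sketch reusing the imports and hyperparameters above; the results dict is just for illustration):

# Minimal sketch of per-channel fitting and scoring (reuses hyperparameters from above)
results = {}
for signal in train_55:
    train_df = load_signal(f'multivariate/{signal}-train')
    test_df = load_signal(f'multivariate/{signal}-test')
    channel_anomalies = load_anomalies(signal)

    orion = Orion(pipeline='tadgan', hyperparameters=hyperparameters)
    orion.fit(train_df)
    detected = orion.detect(test_df)

    # some channels return NaN here, others match the paper
    results[signal] = contextual_f1_score(channel_anomalies, detected, test_df)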

Is there a better approach? @sarahmish, thank you for your interest.

sarahmish commented 7 months ago

@gunnha to reproduce the results in the paper, we use the benchmark function provided in Orion. We do a couple of things differently there (roughly sketched after the list below):

  1. we process each signal on its own (no concatenation).
  2. we use the weighted=False option for the evaluation metrics.
  3. we aggregate the results on a dataset level.
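
Per signal, the evaluation roughly looks like the sketch below (simplified for illustration; this is not the exact benchmark code, and msl_signals is just a placeholder for the MSL channel names):

# Simplified sketch of the benchmark procedure (not the actual benchmark code)
per_signal_scores = []
for signal in msl_signals:  # placeholder for the MSL channel names
    train_df = load_signal(f'multivariate/{signal}-train')
    test_df = load_signal(f'multivariate/{signal}-test')
    known = load_anomalies(signal)

    orion = Orion(pipeline='tadgan', hyperparameters=hyperparameters)
    orion.fit(train_df)                                  # 1. each signal on its own
    detected = orion.detect(test_df)

    score = contextual_f1_score(
        known, detected, test_df, weighted=False)        # 2. weighted=False
    per_signal_scores.append(score)

# 3. aggregate on the dataset level (a simple mean, for illustration)
msl_f1 = pd.Series(per_signal_scores).mean()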

As for the Yahoo data, you need to request access directly from their website to obtain it.

The code for reproducing the benchmark can be found in benchmark.py, and for aggregating the results refer to results.py.
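
For example, a rough sketch of calling the benchmark directly (the exact arguments may differ, so check benchmark.py; msl_signals is again a placeholder for the list of MSL channel names):

# Rough sketch of running the benchmark directly (see benchmark.py for the exact signature)
from orion.benchmark import benchmark

scores = benchmark(
    pipelines=['tadgan'],            # pipeline(s) to evaluate
    datasets={'MSL': msl_signals},   # dataset name mapped to its signals
    metrics=['f1'],
    rank='f1',
)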

Let me know if you have any further questions!