sintel-dev / Orion

Library for detecting anomalies in signals
https://sintel.dev/Orion/
MIT License
1.05k stars, 162 forks

Create own primitives / pipelines #579

Open LuSchnitt opened 1 week ago

LuSchnitt commented 1 week ago

Hello dear Machine-Learning enthusiasts!

I am currently using your Orion framework for my master's thesis (anomaly detection for energy meter readings) and want to create my own pipeline (or new primitives).

For example, the tadgan pipeline contains the following primitives:

"primitives": [
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate",
    "sklearn.impute.SimpleImputer",
    "sklearn.preprocessing.MinMaxScaler",
    "mlstars.custom.timeseries_preprocessing.rolling_window_sequences",
    "orion.primitives.timeseries_preprocessing.slice_array_by_dims",
    "orion.primitives.tadgan.TadGAN",
    "orion.primitives.tadgan.score_anomalies",
    "orion.primitives.timeseries_anomalies.find_anomalies"
],

Among these pipeline steps there is, for example, step 2: sklearn.impute.SimpleImputer. Can I just use any other compatible function or class from scikit-learn, such as GridSearchCV, cross-validation, or even whole scikit-learn Pipelines?

Anyway, awesome framework!

Best

Lukas

sarahmish commented 1 day ago

Hi @LuSchnitt! Thanks for using Orion!

To use any scikit-learn function/class as a primitive, you need to have an associated JSON file that describes it. Most of the primitives we have are stored in the mlstars package, which you can see here. Unfortunately, GridSearchCV is not one of them. If you are keen on integrating such a primitive, I would encourage you to contribute to our libraries!
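As a rough illustration of what such an associated JSON file might look like, the sketch below builds an annotation for scikit-learn's RobustScaler in Python and writes it to disk. The key layout (name, primitive, fit, produce, hyperparameters) assumes the MLBlocks annotation format. The choice of RobustScaler, the output folder, and the listed hyperparameters are assumptions for illustration, not something taken from the mlstars catalog.

```python
import json
import os
import tempfile

# Hypothetical annotation for a scikit-learn class used as a primitive.
# The keys assume the MLBlocks annotation layout; verify them against the docs.
annotation = {
    "name": "sklearn.preprocessing.RobustScaler",
    "primitive": "sklearn.preprocessing.RobustScaler",
    "fit": {
        "method": "fit",
        "args": [{"name": "X", "type": "ndarray"}],
    },
    "produce": {
        "method": "transform",
        "args": [{"name": "X", "type": "ndarray"}],
        "output": [{"name": "X", "type": "ndarray"}],
    },
    "hyperparameters": {
        "fixed": {},
        "tunable": {
            "with_centering": {"type": "bool", "default": True},
            "with_scaling": {"type": "bool", "default": True},
        },
    },
}


def write_annotation(annotation, directory):
    """Write the annotation to <directory>/<name>.json and return the path."""
    path = os.path.join(directory, annotation["name"] + ".json")
    with open(path, "w") as f:
        json.dump(annotation, f, indent=4)
    return path


# A temporary folder stands in for a real primitives folder here.
path = write_annotation(annotation, tempfile.mkdtemp())
```

The folder holding such files would still have to be made discoverable to the framework before the primitive name can be used in a pipeline; that step is not shown here.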

You can also build your own primitive. I have written documentation on it here.
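A minimal sketch of what a self-built primitive could look like: a plain Python function, an annotation that points to it, and the tadgan primitive list from this thread with the new step slotted in. The module name my_primitives, the function clip_values, its bounds, and its position in the pipeline are all hypothetical. The annotation keys assume the MLBlocks layout, in which a function primitive only needs a produce section.

```python
def clip_values(X, lower=-1.0, upper=1.0):
    """Clamp every value of a nested list (rows of floats) into [lower, upper]."""
    return [[min(max(v, lower), upper) for v in row] for row in X]


# Hypothetical annotation; "my_primitives" is an assumed module name.
clip_values_annotation = {
    "name": "my_primitives.clip_values",
    "primitive": "my_primitives.clip_values",
    "produce": {
        "args": [{"name": "X", "type": "list"}],
        "output": [{"name": "X", "type": "list"}],
    },
    "hyperparameters": {
        "fixed": {
            "lower": {"type": "float", "default": -1.0},
            "upper": {"type": "float", "default": 1.0},
        }
    },
}

# Primitive list of the tadgan pipeline, as quoted earlier in this thread.
tadgan_primitives = [
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate",
    "sklearn.impute.SimpleImputer",
    "sklearn.preprocessing.MinMaxScaler",
    "mlstars.custom.timeseries_preprocessing.rolling_window_sequences",
    "orion.primitives.timeseries_preprocessing.slice_array_by_dims",
    "orion.primitives.tadgan.TadGAN",
    "orion.primitives.tadgan.score_anomalies",
    "orion.primitives.timeseries_anomalies.find_anomalies",
]

# Place the custom step right after the scaler (an arbitrary choice for illustration).
idx = tadgan_primitives.index("sklearn.preprocessing.MinMaxScaler") + 1
custom_primitives = (
    tadgan_primitives[:idx]
    + [clip_values_annotation["name"]]
    + tadgan_primitives[idx:]
)
```

Whether a clipping step is useful at that position depends on the data; the point is only that a custom pipeline is the same ordered list of primitive names with the new name added.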

Please let me know if you have any comments or further questions.

LuSchnitt commented 1 day ago

Hi @sarahmish, thanks for replying so fast!

Awesome, thanks for pointing me to the docs, I didn’t find them by myself.

Yeah, I would like to contribute, though I'm not sure I can get everything done by the end of the year. But I'm sure the gridSearch.py will be coming.

Best Lukas