Open makinno opened 1 year ago
Hi @makinno – thanks for using Orion!
I will address each question separately; please reply if anything is still unclear.
To use custom data, edit the interval hyperparameter of the primitive mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1, which is detailed in this documentation.
To choose a value for interval: based on the first 5 entries here, the average gap between consecutive entries is ~0.0042.
(Optional) I also recommend converting the timestamp column into integers, which can be done by multiplying the column by 10^6. If you do, scale interval by the same factor (0.0042 becomes 4200).
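The two steps above (estimating the gap and converting timestamps to integers) could be sketched roughly as follows, using only the standard library. The helper names are made up for illustration and are not part of Orion; the sample values are the five integer timestamps quoted later in this thread.

```python
def estimate_interval(timestamps):
    """Return the mean gap between consecutive timestamps."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    ordered = sorted(timestamps)
    # The gaps telescope, so the mean gap is (last - first) / number of gaps.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def to_integer_timestamps(timestamps, scale=10**6):
    """Scale fractional timestamps and round them to integers."""
    return [round(t * scale) for t in timestamps]


# Five timestamps that are already integers (sample from this thread).
scaled = [2987, 7191, 11339, 15514, 19987]
interval = estimate_interval(scaled)
print(interval)  # 4250.0
```

The result is expressed in the scaled units, which is in line with the ~0.0042 estimate above (0.0042 × 10^6 = 4200).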
As for multivariate modeling, all pipelines in Orion accept multivariate input but produce univariate output. Therefore, if you would like to detect anomalies in both X-axis Acceleration and Y-axis Acceleration, you will need to create two separate pipelines and set target_column to 1 instead of 0 in the second one:
hyperparameters = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "interval": 0.0042
    },
    "mlstars.custom.timeseries_preprocessing.rolling_window_sequences#1": {
        "target_column": 0
    }
}
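Since the two pipelines differ only in target_column, one way to avoid duplicating the dictionary is a small builder function. This is only a sketch: build_hyperparameters is a hypothetical helper, not an Orion function, and the primitive names are the ones used above.

```python
AGGREGATE = "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1"
WINDOWS = "mlstars.custom.timeseries_preprocessing.rolling_window_sequences#1"


def build_hyperparameters(target_column, interval=0.0042):
    """Return a fresh hyperparameter dict for one output column."""
    return {
        AGGREGATE: {"interval": interval},
        WINDOWS: {"target_column": target_column},
    }


# Column 0 is X-axis Acceleration, column 1 is Y-axis Acceleration.
hyperparameters_x = build_hyperparameters(target_column=0)
hyperparameters_y = build_hyperparameters(target_column=1)
```

Each dictionary can then be passed as hyperparameters to its own Orion instance.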
I addressed this question in the previous point.
You have numerous pipelines to select from; here is an example using AER:
from orion import Orion

hyperparameters = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "interval": 0.0042
    },
    "orion.primitives.aer.AER#1": {
        "epochs": 5,
        "verbose": True
    }
}

orion = Orion(
    pipeline="aer",
    hyperparameters=hyperparameters
)

# data is the DataFrame loaded from your CSV
orion.fit(data)
Hello @sarahmish
Thank you for your prompt and helpful response to my questions. Your guidance and detailed explanations were instrumental in resolving the challenges I encountered, and I greatly appreciate your support.
Given that the X-axis acceleration and the Y-axis acceleration data are correlated, I'd like to be able to detect anomalies on both axes using a single ML model. My question is, once the two separate pipelines have been created, is there a way to combine the pipelines when training?
Currently, my code is as follows:
# Define hyperparameters for X-axis Acceleration
hyperparameters_X = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "time_column": "Timestamp",
        "interval": 4399
    },
    "mlstars.custom.timeseries_preprocessing.rolling_window_sequences#1": {
        "target_column": 0
    },
    'orion.primitives.aer.AER#1': {
        'epochs': 5,
        'verbose': True
    }
}

# Define hyperparameters for Y-axis Acceleration
hyperparameters_Y = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "time_column": "Timestamp",
        "interval": 4399
    },
    "mlstars.custom.timeseries_preprocessing.rolling_window_sequences#1": {
        "target_column": 1
    },
    'orion.primitives.aer.AER#1': {
        'epochs': 5,
        'verbose': True
    }
}

orion_X = Orion(pipeline='aer', hyperparameters=hyperparameters_X)
orion_Y = Orion(pipeline='aer', hyperparameters=hyperparameters_Y)

# Fit the models separately
orion_X.fit(train_data[['Timestamp', 'X-axis Acceleration']])
orion_Y.fit(train_data[['Timestamp', 'Y-axis Acceleration']])
I have an additional question regarding the test data. Do the test data require the same preprocessing as the training data? For your information, the test data has exactly the same initial CSV format as the data I used for training.
I would appreciate your insights on this matter. Your expertise and assistance are highly valued.
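The two pipelines are trained separately; however, the correlation between the X-axis and Y-axis data is still taken into account, since each pipeline takes both time series as input during training. For that to hold, pass both acceleration columns to each fit call; target_column only selects which column is the output.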
Yes, the test data should be in the same format as the training data. You will need to apply the same transformations to the timestamp column!
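One way to keep the two in sync is to route both files through a single loading function, so the timestamp transformation cannot drift between training and testing. Below is a minimal sketch with the standard library csv module; load_rows and the inline CSV text are illustrative assumptions (the timestamps are assumed to be fractional before scaling), and the column names follow the sample in this thread.

```python
import csv
import io


def load_rows(csv_file, time_column="Timestamp", scale=10**6):
    """Read a CSV and return rows with integer timestamps and float values."""
    rows = []
    for record in csv.DictReader(csv_file):
        row = {}
        for name, value in record.items():
            if name == time_column:
                # Same scaling for every file that goes through this function.
                row[name] = round(float(value) * scale)
            else:
                row[name] = float(value)
        rows.append(row)
    return rows


# Illustrative input; in practice, open the training and test CSV files
# and pass each one through load_rows.
sample_csv = io.StringIO(
    "Timestamp,X-axis Acceleration,Y-axis Acceleration\n"
    "0.002987,0.117860,0.027049\n"
    "0.007191,-0.375057,-0.010952\n"
)
sample_rows = load_rows(sample_csv)
```

Calling the same function on both files means any later change to the scaling applies to training and test data alike.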
Hi @sarahmish,
I'm currently testing the Orion pipeline, and I encountered an issue with the results. When running the pipeline, the output displayed nothing. I'm unsure if this indicates that there are no anomalies in the data I provided or if there might be an issue with my training data.
Here's a snippet of the testing data format I used:
Timestamp    X-axis Acceleration    Y-axis Acceleration
     2987               0.117860               0.027049
     7191              -0.375057              -0.010952
    11339              -0.375057               0.065050
    15514              -0.033807              -0.010952
    19987              -0.640474               0.027049
I've also attached an image of the results for reference.
Could someone please advise on whether the absence of results suggests no anomalies in the data or if there might be an issue with my training data? Thank you in advance for your assistance!
Description
I am interested in using the Orion library for detecting anomalies in multivariate time-series data. Specifically, I would like to implement the available ML pipelines in Orion with my custom dataset, which is in the following CSV format:
I have reviewed the notebook tutorials and documentation, but I couldn't find a clear solution for integrating my custom data into the Orion pipelines.
Specific Questions:
I appreciate any help or guidance you can provide in getting started with Orion for my specific use case.
Thank you for your support!