Chiradipb02 opened 8 months ago
After completing the training for 35 epochs:
Epoch: 1/35, Losses: {'cx_loss': -4.8175, 'cz_loss': -4.2402, 'eg_loss': 9.4242}
Epoch: 2/35, Losses: {'cx_loss': -83.4143, 'cz_loss': 0.4275, 'eg_loss': -11.9605}
Epoch: 3/35, Losses: {'cx_loss': -118.1121, 'cz_loss': 1.6406, 'eg_loss': -82.1093}
Epoch: 4/35, Losses: {'cx_loss': -96.1586, 'cz_loss': 1.2744, 'eg_loss': -240.1017}
Epoch: 5/35, Losses: {'cx_loss': -171.8538, 'cz_loss': 2.1408, 'eg_loss': -488.9143}
Epoch: 6/35, Losses: {'cx_loss': -41.5881, 'cz_loss': -4.3703, 'eg_loss': -728.5376}
Epoch: 7/35, Losses: {'cx_loss': -50.8743, 'cz_loss': 2.2937, 'eg_loss': -936.7202}
Epoch: 8/35, Losses: {'cx_loss': -100.3114, 'cz_loss': -2.4848, 'eg_loss': -517.6544}
Epoch: 9/35, Losses: {'cx_loss': -99.0258, 'cz_loss': 2.6696, 'eg_loss': -1218.6364}
Epoch: 10/35, Losses: {'cx_loss': -21.0454, 'cz_loss': 1.6388, 'eg_loss': -101.3644}
Epoch: 11/35, Losses: {'cx_loss': -46.9198, 'cz_loss': -1.7119, 'eg_loss': -350.9029}
Epoch: 12/35, Losses: {'cx_loss': -6.086, 'cz_loss': 2.2132, 'eg_loss': -1045.5497}
Epoch: 13/35, Losses: {'cx_loss': -391.7562, 'cz_loss': -1.0262, 'eg_loss': -677.0828}
Epoch: 14/35, Losses: {'cx_loss': 194.6997, 'cz_loss': 2.7028, 'eg_loss': -686.2322}
Epoch: 15/35, Losses: {'cx_loss': -1681.5021, 'cz_loss': 2.1015, 'eg_loss': -1501.0836}
Epoch: 16/35, Losses: {'cx_loss': -2299.2453, 'cz_loss': -0.7873, 'eg_loss': -1303.4165}
Epoch: 17/35, Losses: {'cx_loss': -1361.4151, 'cz_loss': 1.9527, 'eg_loss': -1605.7816}
Epoch: 18/35, Losses: {'cx_loss': -177.5991, 'cz_loss': 1.1469, 'eg_loss': -1359.632}
Epoch: 19/35, Losses: {'cx_loss': -3798.2821, 'cz_loss': 2.6662, 'eg_loss': -2312.6722}
Epoch: 20/35, Losses: {'cx_loss': -10793.4622, 'cz_loss': 0.8595, 'eg_loss': -4401.6898}
Epoch: 21/35, Losses: {'cx_loss': 4510.4191, 'cz_loss': 0.6463, 'eg_loss': -3757.3169}
Epoch: 22/35, Losses: {'cx_loss': 54413.2969, 'cz_loss': -2.2831, 'eg_loss': -4157.8031}
Epoch: 23/35, Losses: {'cx_loss': 3813.4309, 'cz_loss': -0.0306, 'eg_loss': -3582.8478}
Epoch: 24/35, Losses: {'cx_loss': 1734.2098, 'cz_loss': 2.5015, 'eg_loss': -3394.293}
Epoch: 25/35, Losses: {'cx_loss': 1301.2073, 'cz_loss': -1.1375, 'eg_loss': -5869.3654}
Epoch: 26/35, Losses: {'cx_loss': 3266.5693, 'cz_loss': 2.1572, 'eg_loss': -13029.8635}
Epoch: 27/35, Losses: {'cx_loss': 2313.6272, 'cz_loss': -0.0213, 'eg_loss': -18815.4439}
Epoch: 28/35, Losses: {'cx_loss': 3218.0648, 'cz_loss': 3.0745, 'eg_loss': -25817.8698}
Epoch: 29/35, Losses: {'cx_loss': 3368.6121, 'cz_loss': 3.1033, 'eg_loss': -38295.3714}
Epoch: 30/35, Losses: {'cx_loss': 535.1069, 'cz_loss': -1.5381, 'eg_loss': -43057.3359}
Epoch: 31/35, Losses: {'cx_loss': 485.7542, 'cz_loss': 2.392, 'eg_loss': -44885.8762}
Epoch: 32/35, Losses: {'cx_loss': 137.9176, 'cz_loss': 1.2915, 'eg_loss': -46512.0878}
Epoch: 33/35, Losses: {'cx_loss': 27.0721, 'cz_loss': 1.6871, 'eg_loss': -46914.7306}
Epoch: 34/35, Losses: {'cx_loss': 71.8974, 'cz_loss': 3.2924, 'eg_loss': -47328.6803}
Epoch: 35/35, Losses: {'cx_loss': 13.1489, 'cz_loss': -0.9914, 'eg_loss': -47412.2166}
315/315 [==============================] - 35s 109ms/step
315/315 [==============================] - 45s 137ms/step
315/315 [==============================] - 8s 25ms/step
Hi @Chiradipb02 – thanks for opening an issue and using Orion!
To run the benchmark on the NASA datasets, you can use our benchmarking script, which will automatically load the necessary hyperparameter settings, i.e. tadgan_smap.json and tadgan_msl.json.
from orion.benchmark import benchmark, BENCHMARK_DATA
datasets = {
"MSL": BENCHMARK_DATA["MSL"],
"SMAP": BENCHMARK_DATA["SMAP"]
}
pipelines = {"tadgan": "tadgan"}
scores = benchmark(pipelines=pipelines, datasets=datasets)
You will need some compute for this to complete in a decent time.
You can also find the latest results of the benchmark (which we run every release) in the details Google Sheets document, and the summarized results can be browsed in the summary Google Sheets document.
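As a side note, assuming scores comes back as a pandas DataFrame (which the per-signal output later in this thread suggests), a minimal sketch for keeping the results around could be:
# Hedged sketch: persist the benchmark results for later inspection.
# 'tadgan_benchmark.csv' is just an illustrative file name.
scores.to_csv("tadgan_benchmark.csv", index=False)
print(scores.head())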
Thank you @sarahmish for your response. I tried to run the code,
from orion.benchmark import benchmark, BENCHMARK_DATA
datasets = {
"MSL": BENCHMARK_DATA["MSL"],
"SMAP": BENCHMARK_DATA["SMAP"]
}
pipelines = {"tadgan": "tadgan"}
scores = benchmark(pipelines=pipelines, datasets=datasets)
but an error occurs for each of the datasets.
ERROR:mlblocks.mlpipeline:Exception caught fitting MLBlock sklearn.preprocessing.MinMaxScaler#1
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/mlblocks/mlpipeline.py", line 644, in _fit_block
block.fit(**fit_args)
File "/usr/local/lib/python3.10/dist-packages/mlblocks/mlblock.py", line 311, in fit
getattr(self.instance, self.fit_method)(**fit_kwargs)
File "/usr/local/lib/python3.10/dist-packages/sklearn/preprocessing/_data.py", line 427, in fit
`n_samples` or because X is read from a continuous stream.
File "/usr/local/lib/python3.10/dist-packages/sklearn/preprocessing/_data.py", line 450, in partial_fit
if sparse.issparse(X):
File "/usr/local/lib/python3.10/dist-packages/sklearn/base.py", line 600, in _validate_params
self._check_n_features(X, reset=reset)
File "/usr/local/lib/python3.10/dist-packages/sklearn/utils/_param_validation.py", line 97, in validate_parameter_constraints
sklearn.utils._param_validation.InvalidParameterError: The 'feature_range' parameter of MinMaxScaler must be an instance of 'tuple'. Got [-1, 1] instead.
ERROR:orion.benchmark:Exception scoring pipeline <mlblocks.mlpipeline.MLPipeline object at 0x7ab63e38a8f0> on signal M-6 (test split: True), error The 'feature_range' parameter of MinMaxScaler must be an instance of 'tuple'. Got [-1, 1] instead..
for which the output is:
    pipeline  rank dataset signal  iteration  accuracy  f1  recall  precision status    elapsed split      run_id
0     tadgan     1     MSL    M-6          0         0   0       0          0  ERROR  11.174287  True  d4758923-4
1     tadgan     2     MSL    M-1          0         0   0       0          0  ERROR   1.174106  True  d4758923-4
2     tadgan     3    SMAP    G-3          0         0   0       0          0  ERROR   1.141722  True  d4758923-4
3     tadgan     4    SMAP    P-4          0         0   0       0          0  ERROR   1.600665  True  d4758923-4
4     tadgan     5    SMAP    F-1          0         0   0       0          0  ERROR   1.554191  True  d4758923-4
..       ...   ...     ...    ...        ...       ...  ..     ...        ...    ...        ...   ...         ...
75    tadgan    76     MSL    M-7          0         0   0       0          0  ERROR   0.710938  True  d4758923-4
76    tadgan    77     MSL   D-16          0         0   0       0          0  ERROR   0.675812  True  d4758923-4
77    tadgan    78     MSL   D-15          0         0   0       0          0  ERROR   0.946997  True  d4758923-4
78    tadgan    79     MSL   P-11          0         0   0       0          0  ERROR   1.777963  True  d4758923-4
79    tadgan    80    SMAP    F-3          0         0   0       0          0  ERROR   1.651411  True  d4758923-4
Is it occurring because some values in the datasets are equal to -1 and 1, or due to the absence of
'sklearn.preprocessing.MinMaxScaler#1': {
'feature_range': (-1, 1)
}
in tadgan_msl.json and tadgan_smap.json?
The issue is solvable when you downgrade sklearn to 'scikit-learn<1.2'. After version 1.2, sklearn forces feature_range to be of tuple type rather than a list, whilst the .json file only supports list types.
Please make sure to install the compatible version of sklearn, pip install 'scikit-learn>=0.22.1,<1.2', which should fix the issue above!
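If pinning scikit-learn is not an option, a hedged alternative, consistent with what worked later in this thread, is to override the list-valued JSON setting with a Python tuple when running the pipeline directly through the Orion API (the pipeline name "tadgan" below is assumed; use whichever pipeline json you are running). The pin above remains the supported fix.
# Hedged workaround sketch: pass feature_range as a Python tuple so newer
# scikit-learn parameter validation accepts it (JSON files can only store lists).
from orion import Orion

hyperparameters = {
    "sklearn.preprocessing.MinMaxScaler#1": {
        "feature_range": (-1, 1)  # tuple instead of the [-1, 1] list stored in the .json file
    }
}

orion = Orion(pipeline="tadgan", hyperparameters=hyperparameters)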
Thank you @sarahmish for your help, and sorry for the late reply. The code is running properly and giving the appropriate results, but it takes a really long time.
One more question: when I try to use the model on another dataset in a Kaggle notebook, what hyperparameters should I choose so that the number of false positives is reduced and range-like anomalies can be detected in addition to point anomalies?
On a meter-reading time-series dataset, I have used the tadgan pipeline with the tadgan_smap.json parameters (as the signals appeared somewhat similar):
from orion import Orion

hyperparameters_334_61 = {
"mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
"time_column": "timestamp",
"interval": 900, # interval between 2 successive timestamps in dataframe is 900
"method": "mean"
},
"sklearn.preprocessing.MinMaxScaler#1": {
"feature_range": (-1,1)
},
"orion.primitives.tadgan.TadGAN#1": {
"epochs": 10
},
# newly added hyperparameter
"orion.primitives.tadgan.score_anomalies#1": {
"rec_error_type": "dtw",
"comb": "mult"
}
}
orion_334_61 = Orion(
pipeline='tadgan.json',
hyperparameters=hyperparameters_334_61
)
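For completeness, the runs below came from fitting and detecting on the Orion-format dataframe (timestamp and value columns); a minimal sketch, assuming the usual Orion fit/detect API and a hypothetical df_334_61 holding the (18286, 2) frame:
# Sketch: train the configured pipeline and detect anomalies in one pass.
# df_334_61 is a hypothetical dataframe with 'timestamp' and 'value' columns.
anomalies_334_61 = orion_334_61.fit_detect(df_334_61)
print(anomalies_334_61)  # dataframe of detected anomalous intervals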
For 10 epochs on a particular portion of the dataframe, of shape (18286, 2) in Orion format.
For 25 epochs on the same dataframe.
For 5 epochs on the whole dataset, of shape (75224, 2); only in this run, the eg_loss kept becoming more negative during training:
Epoch: 1/5, Losses: {'cx_loss': -1.4802, 'cz_loss': -0.4702, 'eg_loss': -20.8699}
Epoch: 2/5, Losses: {'cx_loss': -0.7637, 'cz_loss': 2.2239, 'eg_loss': -74.8726}
Epoch: 3/5, Losses: {'cx_loss': -0.6252, 'cz_loss': 2.403, 'eg_loss': -134.6424}
Epoch: 4/5, Losses: {'cx_loss': -0.593, 'cz_loss': 2.4117, 'eg_loss': -158.1663}
Epoch: 5/5, Losses: {'cx_loss': -1.0072, 'cz_loss': 2.3576, 'eg_loss': -263.0882}
2493/2493 [==============================] - 168s 67ms/step
2493/2493 [==============================] - 188s 75ms/step
2493/2493 [==============================] - 28s 11ms/step
For the other two runs, the losses did not cross -70.
But the anomalous part is mainly the lower flat part of the signal. I have tried the fixed threshold parameter (both True and False) with a similar number of epochs, but no improvement was observed. What parameter values can be used here in general?
The loss is unbounded for the critic; therefore, it makes sense to see variance between one time series and another. If you'd like, you can set detailed=True when training the model so that you can also observe more intuitive losses such as mean squared error.
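For instance, a minimal sketch of where such a flag could go, assuming detailed is accepted as a hyperparameter of the TadGAN primitive (not verified here):
# Hedged sketch: request the additional, more interpretable losses during training.
# The exact name/placement of 'detailed' is an assumption based on the sentence above.
hyperparameters = {
    "orion.primitives.tadgan.TadGAN#1": {
        "epochs": 10,
        "detailed": True
    }
}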
To reduce/extend the range of the detected anomalies, there is a hyperparameter called anomaly_padding that defines how many data points to include before and after the point that was considered anomalous. To remove any padding, set the value to zero:
hyperparameters = {
'orion.primitives.timeseries_anomalies.find_anomalies#1': {
'anomaly_padding': 0 # set to 50 by default
}
}
For more information, visit the primitive page for find_anomalies.
If all your anomalies look like the flat part of the signal, I think there are simpler algorithms that you can try that are faster than tadgan.
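For instance, a quick hedged sketch of swapping in a lighter pipeline, assuming lstm_dynamic_threshold is available in your Orion installation:
# Hedged sketch: try a lighter-weight pipeline on the same Orion-format dataframe.
from orion import Orion

orion_fast = Orion(pipeline="lstm_dynamic_threshold")
# anomalies = orion_fast.fit_detect(df)  # df: your dataframe with 'timestamp' and 'value' columns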
Thank you @sarahmish for your reply. I am now using the AER model for anomaly detection. So far it is much faster than tadgan and gives better results. But while trying the approach given in the tulog for tadgan, to see how the primitives work, defining the AER model requires the parameters layers_encoder and layers_decoder and some hyperparameters. What layer architectures can I pass as parameters to build the model?
To get the intermediate outputs, is there any method like setting visualization=True in the detect() method, as there was for tadgan in the tulog?
Can you please tell me what the functionalities of the hyperparameters 'lower_threshold' and 'min_percent' are in the primitive find_anomalies?
Description
I am trying to reproduce the results of the TadGAN model proposed in the paper 'TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks' and perform benchmarking. For efficient result reproduction on the SMAP and MSL spacecraft datasets, what hyperparameter values should I use? Or, if the trained model weights are available, how can I use them and where can I find them?
What I Did
I am currently using the hyperparameters given in the tadgan_smap.json file and the tadgan pipeline, but training for even 35 epochs is quite time-consuming and expensive on Colab.
Using the tadgan pipeline, 2 of the losses are diverging.
Using the tadgan.json pipeline, the runtime gets disconnected in between.
Other Approach
There is a .txt file in the NASA dataset zip link given in the paper. That .txt file contained some model parameters as well. Also, in the models folder, there were .h5 files for each dataset file. I tried to load one of them into the tadgan model, after preprocessing the data as given in Tulog.ipynb.
There was a dimensional error, as the required shape appeared to be (None, None, 25), so I had to reshape the data.
The reconstruction from the trained model:
What can I do?
notebook link: https://colab.research.google.com/drive/1zahCbCImRuL2_Hc-ms1WSZl7oUyP32Q3?usp=sharing