Can you share the code for preprocessing and explain the meaning of each index of the data?

Hi NoPainNoCode,

Thanks for your question!

I've added a demo IPython notebook in the datasets/ folder. Please have a look there for the detailed pre-processing procedure. In summary, the only pre-processing step is standardisation: we subtract the mean of the original time series and divide by its standard deviation.
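For readers who want the idea before opening the notebook, here is a minimal sketch of that standardisation step. It is not the notebook's code: the function name `standardise` and the toy values are made up for illustration, and it assumes the mean and standard deviation are computed over the whole original series, as the summary above describes.

```python
import numpy as np


def standardise(readings):
    """Z-score a 1-D series: subtract its mean, divide by its standard deviation.

    Hypothetical helper for illustration; the notebook in datasets/ is the
    authoritative reference for the actual procedure.
    """
    readings = np.asarray(readings, dtype=float)
    mean = readings.mean()
    std = readings.std()  # population standard deviation (ddof=0), an assumption
    return (readings - mean) / std, mean, std


# Toy values, not taken from any dataset in the repository.
readings = np.array([2.0, 4.0, 6.0, 8.0])
normalised, mean, std = standardise(readings)
```

Returning `mean` and `std` alongside the normalised series makes it possible to map values back to the original scale later with `normalised * std + mean`.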

As for the meaning of the specific features in the loaded data, I will list the explanation below:

t - timestamp for each reading in the time series.
t_unit - unit for the interval between two consecutive timestamps.
readings - the original time series values; same as the time series loaded from the original .csv file.
idx_anomaly - indices where the anomalies occurred; computed from the anomaly timestamps from the original .csv file.
idx_split - the pair of indices that bound the training set. We took a section of the original time series in which no anomalies occur as the training set.
training - normalised time series for the training set.
test - normalised time series for the test set.
t_train - indices for the training set readings.
t_test - indices for the test set readings.
idx_anomaly_test - indices for the anomalies in the test set.
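To show how these fields could relate to one another, here is a rough sketch that derives them from a toy series. Several things in it are assumptions rather than facts about the repository: the helper `build_fields` is hypothetical, integer positions stand in for real timestamps, the test set is taken to be everything after the training section, and `idx_anomaly_test` is taken to be re-based so that 0 is the first test reading. Please check the notebook for the actual conventions.

```python
import numpy as np


def build_fields(readings, idx_anomaly, idx_split):
    """Derive the fields listed above from a raw series (illustrative only)."""
    readings = np.asarray(readings, dtype=float)
    idx_anomaly = np.asarray(idx_anomaly, dtype=int)
    start, end = idx_split

    # Standardise with the statistics of the original series.
    normalised = (readings - readings.mean()) / readings.std()

    # Assumption: integer positions stand in for the real timestamps.
    t = np.arange(len(readings))

    # Assumption: the test set is everything after the training section.
    in_test = idx_anomaly >= end

    return {
        "t": t,
        "readings": readings,
        "idx_anomaly": idx_anomaly,
        "idx_split": np.array([start, end]),
        "training": normalised[start:end],
        "test": normalised[end:],
        "t_train": t[start:end],
        "t_test": t[end:],
        # Assumption: anomaly indices are re-based to the start of the test set.
        "idx_anomaly_test": idx_anomaly[in_test] - end,
    }


# Toy example: ten readings, one anomaly at position 7,
# and an anomaly-free training section covering positions 0 to 5.
fields = build_fields(np.arange(10.0), idx_anomaly=[7], idx_split=(0, 6))
```

Under these assumptions, `fields["t_test"][fields["idx_anomaly_test"]]` recovers the anomaly's position in the original series.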

I hope this explanation is helpful!

Best wishes, Lin

lin-shuyu / VAE-LSTM-for-anomaly-detection