
Time Series Autoencoder Benchmark Suite

About this Repo and Basic Overview

This repository houses a benchmark environment for reconstruction-based multivariate time series anomaly detection algorithms, mainly autoencoders. The intended use case for the repository is to be included as a submodule in other projects. In addition to the benchmark functionality it provides, the object-oriented interfaces defined here can also be used to quickly build novel algorithms, e.g. ensemble methods.

Core Building Blocks:

The framework is built around the following modular building blocks. A brief explanation of each block and its role is given below. For more in-depth information on the different blocks, check out the README files in the corresponding folders:

A normal benchmark run can be conducted automatically, without further coding, using the Benchmark.py script. If all the interfaces above are implemented as designed, Benchmark.py takes the datasets, a model, a trainer and a selection of performance metrics, automatically trains and evaluates the model, and logs the performance characteristics. The QuickOverview.py script can process this output afterwards to create a visualisation.

Additional Building Blocks and Functionality:

Here are some auxiliary building blocks that are not required for the core functionality of the repository but provide functionality that can come in handy during experiments:

Hyperparameters in this Repo:

The most basic required parameters for the building blocks are passed as arguments. The remaining parameters are passed as a dictionary of hyperparameters. The reasoning behind this is as follows: dictionaries are somewhat self-documenting (depending on the names of the keys, of course) and they can easily be saved as .json. Using this functionality, the parameters passed to the building blocks are always saved alongside the other performance parameters when a benchmark run is conducted. This way, the results of a benchmark run also serve as documentation.
To make the handling of the hyperparameters that are passed as a dictionary easier for users implementing the interfaces provided in this framework, the handling is defined in the parent class "block". Trainer and Model inherit directly from block. The data sources don't directly inherit from a common class; the data sets, however, must be instances of datablock, a class that defines how datasets should behave in this framework.
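As a rough illustration of this pattern, here is a minimal sketch; the names Block, MyAutoencoder and saveHyperparameters are assumptions made for the example and not the actual interface of the block class in this repository:

import json

class Block:
    # illustrative stand-in for the parent class "block" (assumed API, not the real one)
    defaultHyperparameters = {}

    def __init__(self, hyperparameters=None):
        # merge the user supplied dictionary into the defaults
        self.hyperparameters = {**self.defaultHyperparameters, **(hyperparameters or {})}

    def saveHyperparameters(self, path):
        # dictionaries serialize directly to .json, so the settings of a run
        # can be stored alongside its performance metrics
        with open(path, "w") as f:
            json.dump(self.hyperparameters, f, indent=4)

class MyAutoencoder(Block):
    # only the essential parameters would be constructor arguments,
    # everything else lives in the hyperparameter dictionary
    defaultHyperparameters = {"LatentDimension": 8, "LearningRate": 1e-3}

model = MyAutoencoder({"LatentDimension": 4})
model.saveHyperparameters("hyperparameters.json")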

Installing the Framework

At the moment, there is only an installation script for Linux. Users of other operating systems can use it as a guideline to install the framework. We are working on providing installation scripts for every platform.
The installation script sets up a virtual environment containing everything necessary to run the framework. In addition, it downloads and manages the data sets that have to be present for the set wrappers to function.
Here is a list of things you need to prepare for a smooth installation:

Working with the Framework

The experiment script takes the building blocks defined above and runs an experiment. Individual experiments are stored in the experiments folder but are meant to be executed in the root of the repo. The Benchmark.py file contains a number of methods to create the necessary dicts and save the hyperparameters and environment data. Here is a small example script, benchmarking a feed forward autoencoder on synthetic sine data and visualizing the output:


#!/bin/python

from Models.FeedForward import Model as FeedForwardAE
from DataGenerators.Sines import generateData as Sines
from Trainers.SingleInstanceTrainer import Trainer as OnlineTrainer

from Benchmark import benchmark,initializeDevice
from Evaluation.QuickOverview import plotOverview

pathToSave = "Tech Demo" # folder name for the benchmark results

device = initializeDevice() # compute device used by model and trainer
Dimensions = 2 # Dataset dimensions

# synthetic sine data, split into training, validation and test set
trainingSet,validationSet,testSet = Sines(Dimensions)

# feed forward autoencoder for data of the given dimensionality
model = FeedForwardAE(Dimensions,device)

# trainer used to fit the model
trainer = OnlineTrainer(model,device)

# trains and evaluates the model, logs the performance characteristics
# and returns the folder the results were written to
resultFolder = benchmark(trainingSet,
          validationSet,
          testSet,
          model,
          trainer,
          n_epochs=40,
          pathToSave=pathToSave,
          device = device)

# visualize the logged results
plotOverview(resultFolder)

The scripts in the Evaluation directory are used to evaluate the data that an experiment writes to the file system. Evaluation contains a subdirectory called Utility_Plot, which holds templates for frequently used plots, and a directory called Utility_Data, which encapsulates some helpers for loading and saving data.
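As a hedged sketch of a custom evaluation on top of that output, the snippet below only lists the .json files a benchmark run leaves in its result folder; the exact layout of that folder is not described here, so the paths and keys are assumptions rather than the framework's evaluation API:

import json
from pathlib import Path

# "Tech Demo" is the pathToSave from the example above; the actual result
# folder returned by benchmark() may be a subdirectory of it
resultFolder = Path("Tech Demo")

for jsonFile in sorted(resultFolder.rglob("*.json")):
    with open(jsonFile) as f:
        content = json.load(f)
    # print the file name and, for dictionaries, the stored keys
    print(jsonFile.name, "->", list(content) if isinstance(content, dict) else content)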