ML4ITS / mtad-gat-pytorch

PyTorch implementation of MTAD-GAT (Multivariate Time-Series Anomaly Detection via Graph Attention Networks) by Zhao et al. (2020, https://arxiv.org/abs/2009.02040).
MIT License

out_dim or n_features #3

Closed ZhixuanLiu closed 3 years ago

ZhixuanLiu commented 3 years ago

https://github.com/ML4ITS/mtad-gat-pytorch/blob/8f907ccef2695252a20db6d32536cc07f3fc53e8/mtad_gat.py#L62

`self.recon_model = ReconstructionModel(window_size, gru_hid_dim, recon_hid_dim, out_dim, recon_n_layers, dropout)`

Should `out_dim` be changed to `n_features` here, to match the shape of the input `x`? The loss is calculated as `MSELoss(recons, x)`.

axeloh commented 3 years ago

We do it like this because we do not always want to forecast and reconstruct all input features. For example, for SMAP and MSL datasets, the input consists of one-hot encoded data plus one (continuous) telemetry value. So here we do not want to forecast and reconstruct the one-hot encoded features, only the telemetry value.

https://github.com/ML4ITS/mtad-gat-pytorch/blob/8f907ccef2695252a20db6d32536cc07f3fc53e8/utils.py#L40-L53 That is why, for MSL and SMAP, we use [0] as the target dimension (the dimension of the telemetry value).
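The per-dataset lookup can be sketched roughly like this (a minimal illustration of the idea, not the actual `utils.py` code; the function name and the `SMD` branch are assumptions):

```python
def get_target_dims(dataset):
    """Return which input dims the model should forecast/reconstruct.

    None means "use all features"; a list selects a subset, e.g. the
    single continuous telemetry value for SMAP and MSL.
    """
    if dataset in ("SMAP", "MSL"):
        return [0]   # only the telemetry channel, not the one-hot features
    elif dataset == "SMD":
        return None  # forecast/reconstruct every feature
    else:
        raise ValueError(f"Unknown dataset: {dataset}")
```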

In train.py the out_dim is set based on this: https://github.com/ML4ITS/mtad-gat-pytorch/blob/8f907ccef2695252a20db6d32536cc07f3fc53e8/train.py#L54-L70
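The logic referenced above boils down to something like the following (a hedged sketch mirroring the linked `train.py` lines; the helper name is an assumption):

```python
def resolve_out_dim(target_dims, n_features):
    """Derive the model's output dimension from the target-dim setting."""
    if target_dims is None:
        return n_features          # predict every input feature
    if isinstance(target_dims, int):
        return 1                   # a single target dimension
    return len(target_dims)        # a list selecting a subset of features
```

So for SMAP/MSL with `target_dims = [0]`, `out_dim` becomes 1, which is why the `ReconstructionModel` is built with `out_dim` rather than `n_features`.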

To get the correct loss during training (so that forecasts and reconstructions are compared against the right input features), we must also check for this: https://github.com/ML4ITS/mtad-gat-pytorch/blob/8f907ccef2695252a20db6d32536cc07f3fc53e8/training.py#L111-L115
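The slicing step before the loss can be sketched like this (a minimal NumPy illustration of the idea behind the linked `training.py` check; the actual code uses PyTorch tensors, and the function name and shapes here are assumptions):

```python
import numpy as np

def recon_loss(recons, x, target_dims=None):
    """MSE between reconstructions and the (possibly narrowed) input.

    recons: (batch, window, out_dim); x: (batch, window, n_features).
    When target_dims is set, x is sliced so the shapes match.
    """
    if target_dims is not None:
        x = x[:, :, target_dims]  # keep only the reconstructed dims
    return float(np.mean((recons - x) ** 2))
```

Without this slice, `MSELoss(recons, x)` would fail (or broadcast incorrectly) whenever `out_dim < n_features`.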

ZhixuanLiu commented 3 years ago

Thanks for the explanation. I did miss that part in train.py. It helps a lot.