ML4ITS / mtad-gat-pytorch

PyTorch implementation of MTAD-GAT (Multivariate Time-Series Anomaly Detection via Graph Attention Networks) by Zhao et al. (2020, https://arxiv.org/abs/2009.02040).

losses are always nan #12

Closed · m-ali-awan closed this issue 2 years ago

m-ali-awan commented 2 years ago

Hi, hope you are doing well. Thanks for this wonderful work. I tried training with MSL and SMD, and my losses are always NaN. I also tried the GDN repo and found that its MSL data differs from the MSL data in this repo. Thanks for any help.

Regards, Ali

m-ali-awan commented 2 years ago

I have dug further into the code and found that the outputs from TemporalAttentionLayer are always NaN, and some outputs from FeatureAttentionLayer come out as NaN as well.
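
A generic way to pinpoint where NaNs first appear is to attach PyTorch forward hooks to every module; this is just a debugging sketch (the module names it prints come from the model itself, nothing repo-specific is assumed):

```python
import torch

def register_nan_hooks(model):
    """Attach forward hooks that report any module whose output contains NaNs."""
    def make_hook(name):
        def hook(module, inputs, output):
            out = output[0] if isinstance(output, tuple) else output
            if torch.is_tensor(out) and torch.isnan(out).any():
                print(f"NaN detected in output of {name}")
        return hook

    for name, module in model.named_modules():
        module.register_forward_hook(make_hook(name))
```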

axeloh commented 2 years ago

If you haven't already, I suggest you ensure that:

  1. the input data has the correct format, and
  2. the data itself does not contain NaNs and is scaled properly before being fed to the model (see the sanity-check sketch below).
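
A minimal sanity check along those lines might look like this; the file path is only a placeholder, and MinMaxScaler is one reasonable choice of scaler, not necessarily the one this repo uses:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Placeholder path -- point this at your actual preprocessed training data.
train = np.load("datasets/MSL/train.npy")

# 1. The data must be finite (no NaN/inf) before it reaches the model.
assert np.isfinite(train).all(), "training data contains NaN or inf values"

# 2. Scale each feature to a bounded range; fit the scaler on training data only.
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train)
```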

Regards, Axel

ylic204 commented 2 years ago

> I have dug further into the code and found that the outputs from TemporalAttentionLayer are always NaN, and some outputs from FeatureAttentionLayer come out as NaN as well.

Have you solved this problem yet? I have the same issue: during training, the losses are all NaN, and there are many zeros in the data. I also don't understand why num_values in labeled_anomalies.csv differs from the shape of the corresponding .npy file in the train folder. For example, labeled_anomalies.csv lists 2264 for C-1, but C-1.npy has 2158 rows; 2264 and 2158 don't match.

JinYang88 commented 2 years ago

Same problem with the SMD dataset, using the default hyperparameters.

JinYang88 commented 2 years ago

Setting `use_gatv2` to False produces a normal (non-NaN) loss.
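
For reference, here is roughly how that might look when constructing the model; this is a hypothetical sketch, and the actual constructor arguments in this repo may differ:

```python
from mtad_gat import MTAD_GAT  # model class from this repo

# Hypothetical instantiation; argument names and defaults may not match exactly.
model = MTAD_GAT(
    n_features=38,     # SMD has 38 features per machine
    window_size=100,
    out_dim=38,
    use_gatv2=False,   # fall back to the original GAT-style attention scoring
)
```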

ghost commented 2 years ago

Check out my answer on #13; I believe it is due to uninitialized bias parameters.
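
For context, a minimal sketch of the kind of bug and fix being described, assuming the attention layers allocate their bias with torch.empty (which returns uninitialized memory that can contain non-finite garbage); the actual layer code may differ:

```python
import torch
import torch.nn as nn

class AttentionLayerSketch(nn.Module):
    """Hypothetical stand-in for the repo's Feature/TemporalAttentionLayer."""

    def __init__(self, n):
        super().__init__()
        # torch.empty() returns *uninitialized* memory -- it may already hold
        # inf/NaN garbage, which then poisons the attention scores and the loss.
        self.bias = nn.Parameter(torch.empty(n, n))
        # The fix: give the bias a well-defined starting value.
        nn.init.zeros_(self.bias)
```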

JinYang88 commented 2 years ago

> Check out my answer on #13; I believe it is due to uninitialized bias parameters.

It works for me now, great!