d-ailin / GDN

Implementation code for the paper "Graph Neural Network-Based Anomaly Detection in Multivariate Time Series" (AAAI 2021)
MIT License

Improvement idea: Just using numerical variables for graph deviation scoring? #57

Open DevBySam7 opened 1 year ago

DevBySam7 commented 1 year ago

Hello, thanks again for publishing this cool project! If I understand the code correctly, you use mean squared error (MSE) as the loss for both categorical and numerical variables. As far as I know, the state-of-the-art approach is to create a separate output branch for every categorical variable and apply a cross-entropy loss to each of them. Since both SWaT and WADI contain quite a lot of categorical features: do you think it would be an improvement to use only the predictions of the numerical variables for the graph deviation scoring?
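To make the "one output branch per categorical variable" idea concrete, here is a minimal sketch in plain Python. It is not taken from the GDN code base; the function names (`mse`, `cross_entropy`, `mixed_loss`), the equal weighting of the two terms, and the example values are all assumptions made for illustration.

```python
import math

def mse(pred, target):
    """Mean squared error over equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(logits, target_class):
    """Cross-entropy of one softmax output branch against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_class]

def mixed_loss(num_pred, num_target, cat_logits, cat_targets):
    """MSE on numerical variables plus mean cross-entropy over categorical branches.

    cat_logits holds one list of logits per categorical variable (one "branch" each).
    The 1:1 weighting of the two terms is an assumption, not a recommendation.
    """
    loss = mse(num_pred, num_target) if num_pred else 0.0
    if cat_logits:
        ce = [cross_entropy(l, t) for l, t in zip(cat_logits, cat_targets)]
        loss += sum(ce) / len(ce)
    return loss

# Hypothetical example: two numerical sensors and one 3-state actuator.
loss = mixed_loss([0.5, 1.0], [0.0, 1.0], [[2.0, 0.0, 0.0]], [0])
```

In a real model the two terms would probably need a tunable weight, since MSE and cross-entropy live on different scales.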

d-ailin commented 1 year ago

Thanks for your interest. I think it is an interesting question, and it could probably be divided into two sub-questions:

  1. Would it be better to use a different loss, or other specialized operations, for categorical variables?
  2. Should we just use the predictions of the numerical variables for the scoring?

For Q1, I think the answer is yes, provided the different types of variables/data are handled with more care. For Q2, I am not so sure: in some real-world cases, an anomaly may show up only in categorical variables, so scoring with only the predictions of the numerical variables could produce false negatives.
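The false-negative concern for Q2 can be illustrated with a small sketch. It loosely follows the scoring idea described in the paper (per-sensor errors normalized by median and inter-quartile range, then a max over sensors), but it is not the repository's code; the sensor names, the error values, and the `include` parameter are hypothetical.

```python
from statistics import median, quantiles

def robust_scores(errors):
    """Normalize one sensor's absolute errors by its median and inter-quartile range."""
    q1, _, q3 = quantiles(errors, n=4)
    iqr = (q3 - q1) or 1e-8  # guard against a zero IQR
    med = median(errors)
    return [(e - med) / iqr for e in errors]

def deviation_scores(errors_by_sensor, include=None):
    """Per time step, take the max normalized error over the included sensors."""
    names = include if include is not None else list(errors_by_sensor)
    normalized = [robust_scores(errors_by_sensor[n]) for n in names]
    return [max(column) for column in zip(*normalized)]

# Hypothetical absolute prediction errors over 8 time steps.
# "flow" is numerical; "valve" is categorical and misbehaves only at the last step.
errors = {
    "flow":  [0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.2],
    "valve": [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0, 1.0],
}

all_scores = deviation_scores(errors)
numeric_only = deviation_scores(errors, include=["flow"])

# The last step stands out only when the categorical sensor is scored.
print(all_scores[-1], numeric_only[-1])
```

With these made-up numbers, the last time step gets a much higher score when `valve` is included than when only `flow` is scored, which is the kind of anomaly a numerical-only score would miss.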