wxdang / MSCRED

TensorFlow implementation of the paper "A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data"

information lost #2

Open francisduan opened 4 years ago

francisduan commented 4 years ago

On line 44 of generation_signature_matrice.py:

for t in range(win, self.signature_matrices_number):

This does not make sense, since you are only looping over the first 2000 data points. Is the result still correct if so much information is lost?

sangramkapre commented 4 years ago

Signature matrices combine 10 points into a single 3D tensor, so 2000 matrices means the code is using all of the 20K data points. Hope this helps.

huangJC0429 commented 4 years ago

Excuse me, I couldn't find any operation before that line that combines 10 points into a single 3D tensor.

sangramkapre commented 4 years ago

Sorry, my bad. Actually, there is a parameter called gap_time that sets how frequently the data is sampled to create signature matrices. It is set to 10 by default, so 2000 matrices are created from 20K points. You can change gap_time depending on how many matrices you want to create. But the entire data set is being used for sure, since constructing each signature matrix at a particular time point takes into account the past 60 points (max window size), so technically the information is not lost.
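
To make the coverage argument concrete, here is a minimal sketch (not code from the repository). It assumes one window of length win ending every gap_time samples; the helper name coverage is hypothetical. It marks which of the 20K samples fall inside at least one window.

```python
import numpy as np

def coverage(n_points, gap_time, win):
    """Boolean mask of the samples touched by at least one window.

    Assumed scheme (hypothetical, for illustration only): a window of
    length `win` ends every `gap_time` samples, starting at index `win`.
    """
    covered = np.zeros(n_points, dtype=bool)
    for end in range(win, n_points + 1, gap_time):
        covered[end - win:end] = True
    return covered

# Numbers from this thread: 20K points, gap_time = 10, max window = 60.
full = coverage(n_points=20_000, gap_time=10, win=60)
print(full.all())      # every sample is inside some window

# A window shorter than gap_time would leave gaps between windows.
sparse = coverage(n_points=20_000, gap_time=10, win=5)
print(sparse.sum())    # only half of the samples are touched
```

Under this assumed scheme, nothing is skipped as long as win is at least gap_time. This only holds if the loop index is actually scaled by gap_time when slicing the raw data.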

gorold commented 3 years ago

I suppose it's meant to be:

for t in range(self.signature_matrices_number):
    adj_t = t * util.gap_time + win
    raw_data_t = raw_data[:, adj_t - win:adj_t]
    signature_matrices[t] = np.dot(raw_data_t, raw_data_t.T) / win
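
For anyone who wants to try this outside the class, here is a self-contained sketch of the same loop on synthetic data. The function name build_signature_matrices and the way the matrix count is derived are assumptions, not the repository's code: the count is chosen so that every window fits entirely inside the series, whereas the snippet above relies on self.signature_matrices_number.

```python
import numpy as np

def build_signature_matrices(raw_data, win, gap_time):
    """raw_data has shape (n_sensors, n_points).

    Returns an array of shape (n_matrices, n_sensors, n_sensors), where
    matrix t is the inner product of the window ending at t * gap_time + win,
    scaled by the window length.
    """
    n_sensors, n_points = raw_data.shape
    # Assumption: keep only windows that fit entirely inside the series.
    n_matrices = max(0, (n_points - win) // gap_time + 1)
    out = np.zeros((n_matrices, n_sensors, n_sensors))
    for t in range(n_matrices):
        end = t * gap_time + win
        segment = raw_data[:, end - win:end]
        out[t] = segment @ segment.T / win
    return out

rng = np.random.default_rng(0)
data = rng.standard_normal((3, 200))   # 3 synthetic sensors, 200 time steps
mats = build_signature_matrices(data, win=60, gap_time=10)
print(mats.shape)  # (15, 3, 3)
```

With 200 points, win=60 and gap_time=10, the last window ends exactly at index 200, so no window is truncated by slicing past the end of the array.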