wuyifan18 / DeepLog

PyTorch implementation of DeepLog.
MIT License

Case of single log message in a process. #20

Closed sandeepvvn closed 5 years ago

sandeepvvn commented 5 years ago

When the data contains sequences shorter than a typical window size of 3, those sequences are flagged as anomalies even though they are already present in the training data. Making the window size less than 3 doesn't make sense for syslog data. But some processes emit only 1, 2, or 3 logs over time, and the LSTM treats them as anomalies.

How can we handle these sequences without generating more false positives?
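
To illustrate the question: with a sliding window of size 3, a session needs at least 4 keys to produce one (history, next key) pair, so a session of 1 to 3 keys yields nothing to train on or score. One possible workaround is to left-pad short sessions with a reserved key. The following is a minimal sketch under that assumption; `PAD_KEY` and `make_windows` are hypothetical names for illustration and are not taken from this repository.

```python
from typing import List, Tuple

# Hypothetical reserved key for padding; assumes real log keys start at 1.
PAD_KEY = 0


def make_windows(
    session: List[int], window_size: int, pad: bool = False
) -> List[Tuple[Tuple[int, ...], int]]:
    """Return (history, next_key) pairs for one session of log keys."""
    if pad:
        # Left-pad so that even the first real key has a full-length history.
        session = [PAD_KEY] * window_size + session
    return [
        (tuple(session[i:i + window_size]), session[i + window_size])
        for i in range(len(session) - window_size)
    ]


short_session = [5, 22]
unpadded = make_windows(short_session, window_size=3)            # no pairs at all
padded = make_windows(short_session, window_size=3, pad=True)    # two pairs
```

With padding, the same transformation would have to be applied to both the training and the test data, and the model's input vocabulary would need one extra class for the padding key.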

wuyifan18 commented 5 years ago

Do you mean sequences like 5 22 in hdfs_test_abnormal?

sandeepvvn commented 5 years ago

Let's say I train on the "5 22" sequence and put it in the normal set. It is still a false positive.

I have done some experimentation, but to reach a conclusion I need an answer to this basic question: what is an ideal window size, and how do we determine it from the data?
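
One rough way to approach the window-size question is to look at the distribution of session lengths and check, for each candidate window size, what fraction of sessions is long enough to produce at least one training pair. This is only a heuristic sketch, not a method taken from this repository; `window_coverage` is a hypothetical helper and the sessions below are made-up toy data.

```python
from typing import Dict, List


def window_coverage(
    sessions: List[List[int]], candidates: List[int]
) -> Dict[int, float]:
    """For each candidate window size h, return the fraction of sessions
    with more than h keys, i.e. at least one full window plus a next key."""
    total = len(sessions)
    return {h: sum(len(s) > h for s in sessions) / total for h in candidates}


# Toy sessions of log keys, for illustration only.
sessions = [[5, 22], [5, 5, 22, 11], [5, 22, 11, 9, 26, 26]]
coverage = window_coverage(sessions, [1, 3, 5])
```

A low coverage value for a given window size suggests that many sessions would be skipped or would need padding at that size. Coverage alone does not settle the choice, since larger windows also give the model more context, so it is best weighed against validation accuracy.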

wuyifan18 commented 5 years ago

Why is it a false positive?