Difference between train data and test_normal data

riyu94 commented 5 years ago

Hi Wuyifan,

I was able to convert structured logs into numeric log key sequences in the same format as your data files. I then fed the train data to LogKeyTrain, and a model was generated. For the detection stage, my question is what the test_normal data should be. I tried providing a set of logs with similar log keys but different blk_id values, and the results were not good. If I provide the train data as test_normal, I get 100% accuracy, which is expected since the model has already seen those sequences. Could you guide me on what the test_normal data should look like?

It would be a great help. Thank you.
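For reference, one common way to get a meaningful test_normal set is to hold out normal sessions that were never used for training. This is a minimal sketch of that idea, not something confirmed by the repository author. It assumes each session is a list of integer log keys (one per blk_id); `split_normal_sessions` and `train_fraction` are hypothetical names used only for illustration.

```python
import random


def split_normal_sessions(sessions, train_fraction=0.2, seed=0):
    """Shuffle normal sessions and split them into disjoint train / test_normal lists.

    `sessions` is assumed to be a list of log key sequences, one per session.
    The split is by session, so no training session is reused for testing.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(sessions)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = max(1, int(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Ten toy sessions of log keys (made-up values, for illustration only).
normal_sessions = [[1, 2, 3, i % 4 + 1] for i in range(10)]
train, test_normal = split_normal_sessions(normal_sessions, train_fraction=0.2)
```

With a split like this, accuracy on test_normal reflects how well the model generalizes to unseen normal sessions, rather than how well it memorized the training sequences.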

riyu94 commented 5 years ago

Hi Wuyifan,

Could you help me understand what num_classes is used for in the code?

wuyifan18 commented 5 years ago

@riyu94 num_classes is the number of log key categories, i.e. the number of distinct log keys in the data. You can run dataViewing.py to see it. The result is as below:
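Independently of dataViewing.py (whose output is not reproduced here), the count of log key categories can be checked with a few lines of Python. This is a rough sketch, assuming each line of a data file holds one session as whitespace-separated integer log keys; `log_key_stats` is a hypothetical helper, not part of the repository.

```python
def log_key_stats(lines):
    """Return (number of distinct log keys, largest log key) over all sessions.

    Each element of `lines` is assumed to be one session written as
    whitespace-separated integer log keys. Blank lines are ignored.
    """
    keys = set()
    for line in lines:
        keys.update(int(token) for token in line.split())
    if not keys:
        return 0, None
    return len(keys), max(keys)


# Toy input (made-up values). With a real file, pass the open file object,
# e.g. `with open(path) as f: log_key_stats(f)`, where `path` is your data file.
sample_lines = ["1 2 3", "3 4", ""]
distinct, largest = log_key_stats(sample_lines)
```

If the keys are used directly as class indices, num_classes has to cover the largest key that appears in either the train or the test files, so it is worth running the count over all of them together.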

wuyifan18 / DeepLog

Difference between train data and test_normal data #16