wuyifan18 / DeepLog

PyTorch implementation of DeepLog.
MIT License

Number of sessions in test data #26

Closed yip522364642 closed 4 years ago

yip522364642 commented 5 years ago

First of all, thanks for sharing this code. I have a question about the number of sessions in the test data. As I understand it, in both the train and test data, each row represents one session (one block). So the number of sessions in hdfs_test_normal / hdfs_test_abnormal should be 553365 / 16838 respectively. But after running "LogKeyModel_predict.py", it shows the following output:

Number of sessions(hdfs_test_normal): 14177
Number of sessions(hdfs_test_abnormal): 4123

I wonder whether my understanding is wrong, or whether the output is wrong?

I would be grateful for a reply from you or anyone else. Thank you again.

yip522364642 commented 5 years ago

Haha, I found the reason. It is already explained in the code.

inesani commented 4 years ago

Hi @yip522364642 I am having the same issue here. Can you tell me what I am missing ?

yip522364642 commented 4 years ago

> Hi @yip522364642 I am having the same issue here. Can you tell me what I am missing ?

If you run the code with "set(XXX)", you will get the same result as the one I mentioned. Try running it with "[XXX]" instead: "set()" removes duplicate sessions, so the number of sessions becomes smaller than the number of rows in the file.
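To illustrate the point, here is a minimal sketch of how collecting sessions in a set instead of a list changes the reported count. It is not the repository's code: the function name `load_sessions`, the `deduplicate` flag, and the sample log keys are all made up for this example, and it only assumes that each row is a whitespace-separated sequence of integer log keys.

```python
def load_sessions(lines, deduplicate):
    """Parse one session per row (hypothetical helper, not from the repo).

    With deduplicate=True the sessions are stored in a set, so identical
    rows collapse into a single entry; with deduplicate=False they are
    stored in a list, so every row is counted.
    """
    sessions = set() if deduplicate else []
    for line in lines:
        # A tuple is hashable, so it can be stored in a set.
        keys = tuple(int(token) for token in line.split())
        if not keys:
            continue  # skip blank rows
        if deduplicate:
            sessions.add(keys)
        else:
            sessions.append(keys)
    return sessions


# Made-up sample data: four rows, but only two distinct key sequences.
sample = [
    "5 5 22 11",
    "5 5 22 11",
    "5 22 9",
    "5 5 22 11",
]

print(len(load_sessions(sample, deduplicate=False)))  # 4, one per row
print(len(load_sessions(sample, deduplicate=True)))   # 2, duplicates removed
```

With the list, the count equals the number of rows in the file; with the set, it equals the number of distinct sessions, which is why the printed totals can be far smaller than the row counts.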

wuyifan18 commented 4 years ago

@yip522364642 quite right! @inesani I have mentioned the reason in the code. https://github.com/wuyifan18/DeepLog/blob/a3824218b71b2a3fa0dd04963058f7e0c17a18c3/LogKeyModel_predict.py#L19-L20

inesani commented 4 years ago

thanks, that was it :)

wuyifan18 commented 4 years ago

You are welcome.