logpai / loglizer

A machine learning toolkit for log-based anomaly detection [ISSRE'16]
MIT License

Problem running demo script without labels #67

Closed by simphide 4 years ago

simphide commented 4 years ago

When I try to run the demo script without labels, it fails with the following error:

(py36) ubuntu@ubuntu:~/loglizer/demo$ python PCA_demo_without_labels.py
====== Input data summary ======
Loading ../data/HDFS/HDFS_100k.log_structured.csv
Traceback (most recent call last):
  File "PCA_demo_without_labels.py", line 24, in <module>
    split_type='sequential', save_csv=True)
  File "../loglizer/dataloader.py", line 102, in load_HDFS
    print(y_train.sum(), y_test.sum())
UnboundLocalError: local variable 'y_train' referenced before assignment

Does anyone have an idea?

sovitagar commented 4 years ago

Hi, if your log_file had a .npz extension, this wouldn't be a problem, because y_train gets initialized in the block "if log_file.endswith('.npz'):". But yours is a .csv file. Also, since you are running the PCA demo without labels, the block "if label_file:" isn't executed, so y_train is never initialized. Because "print(y_train.sum(), y_test.sum())" sits outside both of those blocks, it runs regardless and fails with the UnboundLocalError. I would recommend moving the print statement inside the "if label_file:" block. That should solve the problem.
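
To make the control flow concrete, here is a minimal sketch of the failure and the suggested fix. It is not the actual code from loglizer/dataloader.py: the function names load_buggy and load_fixed, the placeholder arrays, and the returned tuple shape are all assumptions made for illustration. Only the branch structure and the position of the print statement follow the description above.

```python
import numpy as np


def load_buggy(log_file, label_file=None):
    """Hypothetical stand-in: print sits outside the blocks that set y_train."""
    if log_file.endswith('.npz'):
        # Placeholder data; in this branch the labels are always assigned.
        x_train, y_train = np.zeros((4, 2)), np.array([0, 1, 0, 0])
        x_test, y_test = np.zeros((2, 2)), np.array([1, 0])
    elif log_file.endswith('.csv'):
        x_train, x_test = np.zeros((4, 2)), np.zeros((2, 2))
        if label_file:
            y_train, y_test = np.array([0, 1, 0, 0]), np.array([1, 0])
    # Runs on every path, so it fails when y_train was never assigned.
    print(y_train.sum(), y_test.sum())
    return (x_train, y_train), (x_test, y_test)


def load_fixed(log_file, label_file=None):
    """Hypothetical stand-in: print moved inside the 'if label_file:' block."""
    # Assumption for this sketch: unlabeled runs return None for the labels.
    y_train = y_test = None
    if log_file.endswith('.npz'):
        x_train, y_train = np.zeros((4, 2)), np.array([0, 1, 0, 0])
        x_test, y_test = np.zeros((2, 2)), np.array([1, 0])
    elif log_file.endswith('.csv'):
        x_train, x_test = np.zeros((4, 2)), np.zeros((2, 2))
        if label_file:
            y_train, y_test = np.array([0, 1, 0, 0]), np.array([1, 0])
            # Only reached when labels exist, so y_train is always bound here.
            print(y_train.sum(), y_test.sum())
    else:
        raise NotImplementedError('unsupported log_file extension')
    return (x_train, y_train), (x_test, y_test)
```

Calling load_buggy('x.csv') without a label file raises the same UnboundLocalError as in the traceback, while load_fixed('x.csv') returns normally. Initializing y_train and y_test to None up front is an extra safeguard added in this sketch, not part of the original suggestion; how the real loader should return unlabeled data is up to the maintainers.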