d0ng1ee / logdeep

log anomaly detection toolkit including DeepLog
MIT License
387 stars 115 forks

Display original log with results? #8

Open jmlane8 opened 4 years ago

jmlane8 commented 4 years ago

Do you have anything that displays the original log records, their ground truth status as normal and abnormal, and the result from logdeep predictions?

cherishwsx commented 4 years ago

I can help with this one. :)
I think you can find the original log data in loghub. The HDFS (specifically HDFS_1) and BGL data, along with their labels, are in the corresponding folders. By logdeep predictions, do you mean the evaluation results? Those are shown in the Benchmark results section of README.md.

cherishwsx commented 4 years ago

Forgot to put the link. loghub

jmlane8 commented 4 years ago

Thank you. I wanted to run the abnormal and normal predictions, and be able to point back to the original unstructured log records, and say: the neural network picked up something abnormal here.

cherishwsx commented 4 years ago

> Thank you. I wanted to run the abnormal and normal predictions, and be able to point back to the original unstructured log records, and say: the neural network picked up something abnormal here.

I think you can actually print out the block_id (the event sequence identifier in the HDFS dataset) or the row number whenever an abnormal record is detected. Looking at the inference script predict.py might help.
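Once the abnormal block_ids are printed, pointing back to the raw records is mostly a lookup. Here is a minimal sketch of that lookup, not logdeep code: the function names `index_lines_by_block` and `show_flagged` are hypothetical, the sample log lines are made up for illustration, and the set of flagged block_ids is assumed to come from whatever predict.py reports.

```python
import re
from collections import defaultdict

# HDFS block IDs look like blk_<digits>, with an optional minus sign.
BLOCK_ID = re.compile(r"blk_-?\d+")


def index_lines_by_block(lines):
    """Map each block_id to the (line_number, raw_line) pairs that mention it."""
    index = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        # A line may mention several blocks; count each block once per line.
        for block_id in set(BLOCK_ID.findall(line)):
            index[block_id].append((line_number, line.rstrip("\n")))
    return index


def show_flagged(index, flagged_block_ids):
    """Return one report string per raw record belonging to a flagged block."""
    report = []
    for block_id in sorted(flagged_block_ids):
        for line_number, line in index.get(block_id, []):
            report.append(f"{block_id} line {line_number}: {line}")
    return report


# Made-up records standing in for the raw HDFS log file.
sample_log = [
    "INFO dfs.DataNode: Receiving block blk_1 src: /10.0.0.1",
    "INFO dfs.DataNode: Receiving block blk_-2 src: /10.0.0.2",
    "WARN dfs.DataNode: Exception while serving blk_-2",
    "INFO dfs.FSNamesystem: blk_1 is added",
]

index = index_lines_by_block(sample_log)
flagged = {"blk_-2"}  # hypothetical: block_ids the model reported as abnormal
for row in show_flagged(index, flagged):
    print(row)
```

For a real log you would pass the open file object instead of `sample_log`. Note that this shows every record of a flagged sequence, not the single record that triggered the detection.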

cherishwsx commented 4 years ago

When I was thinking about the "tracking back to raw log records" problem, it seemed to me that there is no way to track record by record (which would be more of a streaming analysis), since we train and predict on event sequences rather than on individual log records or single events. So I guess we can only know which event sequence is abnormal, right? And that makes it more suitable for batch log analysis?

Correct me if I'm wrong and any ideas are welcome! @donglee-afar

d0ng1ee commented 4 years ago

You are right, @cherishwsx. I think if you understand the pipeline of log anomaly detection, this is a very simple job ...
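For the original question (ground truth next to predictions), the comparison also happens per sequence. The sketch below is only an illustration under assumptions: it assumes a label file with `BlockId` and `Label` columns holding `Normal` or `Anomaly`, which is how I understand the HDFS_1 labels in loghub to be laid out, and it assumes the predictions are available as a set of abnormal block_ids. The names `load_labels` and `compare` are hypothetical and the data is made up.

```python
import csv
import io


def load_labels(csv_file):
    """Read BlockId,Label rows into a {block_id: label} dict."""
    return {row["BlockId"]: row["Label"] for row in csv.DictReader(csv_file)}


def compare(labels, predicted_abnormal):
    """Return (block_id, ground_truth, prediction, agree) for every sequence."""
    rows = []
    for block_id, truth in sorted(labels.items()):
        predicted = "Anomaly" if block_id in predicted_abnormal else "Normal"
        rows.append((block_id, truth, predicted, truth == predicted))
    return rows


# Made-up label file; in practice open the real CSV instead of StringIO.
labels_csv = io.StringIO(
    "BlockId,Label\n"
    "blk_1,Normal\n"
    "blk_-2,Anomaly\n"
    "blk_3,Anomaly\n"
)
labels = load_labels(labels_csv)
predicted_abnormal = {"blk_-2"}  # hypothetical model output

rows = compare(labels, predicted_abnormal)
for block_id, truth, predicted, agree in rows:
    print(f"{block_id}\ttruth={truth}\tpredicted={predicted}\tagree={agree}")
```

Each row describes one whole event sequence, which matches the point above: the result says which sequence is abnormal, not which single record.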