HelenGuohx / logbert

log anomaly detection via BERT
MIT License

Why is the baselines' performance so much lower than the performance reported in their papers? #13

Closed SycIsDD closed 2 years ago

SycIsDD commented 3 years ago

Congratulations on completing such remarkable work; it has given us a lot of information. While studying it, I ran into a problem. I ran the code you provided and read your paper, and DeepLog and LogAnomaly perform much worse than they do in their own papers. Was a new evaluation method adopted? I also noticed that your implementation uses Drain to extract the templates, whereas DeepLog uses Spell. When I run DeepLog on the data you use, with the templates generated by Spell, the F1-score drops by 10%. However, that performance is still better than your baseline implementation. Could you explain the reason for this? Thank you.

zmymjmm commented 2 years ago

So why do DeepLog and LogAnomaly perform much worse than they do in their papers?

HelenGuohx commented 2 years ago

I think the performance difference is due to the code implementation. The DeepLog and LogAnomaly code used in my paper is based on donglee-afar/. If you have access to the source code from the original authors of DeepLog and LogAnomaly, please let me know.