Open alishan2040 opened 11 months ago
Hi @alishan2040. Thanks for your attention.
Now the codes in BertEncoder.py
can generate sentence embeddings as same as the output of word2vecEncoder.py
. In this repository, we utilize one off-the-self Bert model released by https://github.com/llSourcell/bert-as-service. There are some simple instructions to help you follow it:
BertEncoder.py
. @humanlee1011 thank you for your reply. That worked.
I have another question about the reproducibility of the work. Keeping the same settings and the data distribution as mentioned in SwissLog paper, I was able to obtain an F1-score ~65% on the full HDFS dataset. More in detail:
However, it is reported ~ 99% on the HDFS dataset in the paper. Could you please comment on this? Thanks
@humanlee1011 thank you for your reply. That worked.
I have another question about the reproducibility of the work. Keeping the same settings and the data distribution as mentioned in SwissLog paper, I was able to obtain an F1-score ~65% on the full HDFS dataset. More in detail:
- The templates were extracted by means of SwissLog offline log parser
- Generated bert encodings from these extracted templates and trained the model
- Obtained an F1 score ~ 65% on test set.
However, it is reported ~ 99% on the HDFS dataset in the paper. Could you please comment on this? Thanks
@alishan2040 For the issue of the detection results not meeting expectations, what solutions do you have?
Hi @humanlee1011, thank you for providing the implementation.
I am currently working with the
HDFS
dataset and attempting to compute sentence embeddings using the Bert Encoder. However, after reviewing the code in the repository, it appears that the current version only supports the word2vectorEncoder model for computing sentence embeddings usingFastText
.I also have noticed that there is a
BertEncoder.py
file in the repository that is used to generate template embeddings. Can I use this file to extract semantic information for log sentences in the same way that is done inword2vecEncoder.py
? If so, could you please provide guidance or documentation on how to implement this using theBertEncoder.py
file?Thanks