Status: Closed (1ncludeSteven closed this issue 9 months ago)
Of course: `timestamp`, `event` and `machine` are sufficient for DeepCASE to perform its analysis. If security operators analyze the event sequences manually, auxiliary information such as the source/destination IP may be useful to them, but the process that DeepCASE performs does not require it.
For clustering, if the context is `A B B` with attention values `0.2 0.3 0.5`, then the vector used for clustering looks like this: `0.2 0.8` (in other words, `A: 0.2`, `B: 0.3 + 0.5 = 0.8`). Regarding the case for BitCoinMiner, I cannot comment on specific instances due to our NDA. I can say that we observed many clusters that all had pretty much the exact same sequence/attention, and some clusters where there was some variance in the sequences.
I hope this answers your question.
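The per-event aggregation described in this reply can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the `A B B` example, not DeepCASE's actual code; the function name `attention_by_event` is hypothetical.

```python
from collections import defaultdict


def attention_by_event(context, attention):
    """Sum the attention assigned to each distinct event type in a context.

    Hypothetical helper for illustration; not part of the DeepCASE API.
    """
    if len(context) != len(attention):
        raise ValueError("context and attention must have the same length")
    totals = defaultdict(float)
    for event, weight in zip(context, attention):
        totals[event] += weight
    return dict(totals)


# The example from the reply: context A B B with attention 0.2 0.3 0.5.
vector = attention_by_event(["A", "B", "B"], [0.2, 0.3, 0.5])
print(vector)  # A keeps 0.2, B accumulates 0.3 + 0.5
```

Keying the result by event type rather than by position is what lets two contexts with the same events in a different order map to the same vector.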
For question 2, do events in the same cluster have similar attention values? Do similar attention values mean that the events themselves are similar? For example, could `Phishing: Financial Sector`, `Phishing: Payment Service`, `Phishing: Social Networking` and `Phishing: e-Commerce` end up in the same cluster because these events are similar? I'm curious about what types of events end up in the same cluster.
No, as of right now, all event types are treated as completely different from one another. That means that two clusters, one with attention on `Phishing: Financial Sector` and 0.7 attention on `Phishing: Payment Service`, and the other with attention on `Phishing: Social Networking` and 0.7 attention on `Phishing: e-Commerce`, will be considered to have no overlap.
However, events may still be clustered together if there is enough overlap in the attention on the same events. Consider the following case (I abbreviate the phishing examples to `fin`, `pay`, `soc`, and `com`):
- Context `fin`, `fin`, `fin`, `pay`, event `pay`, and attention vector `[0.03, 0.03, 0.03, 0.91]` (corresponding to the context)
- Context `com`, `com`, `com`, `pay`, event `pay`, and attention vector `[0.03, 0.03, 0.03, 0.91]` (corresponding to the context)
In this case the first sequence is represented as `fin: 0.09, pay: 0.91` and the second sequence as `com: 0.09, pay: 0.91`. The two have a large overlap on `pay` and may therefore be clustered together.
Thank you for your response, it was helpful!!
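To make the notion of "overlap" in the reply above concrete, here is a minimal sketch that compares per-event attention vectors. The thread does not say which metric DeepCASE uses, so both measures here are assumptions chosen for illustration: shared attention mass (sum of element-wise minima) and L1 distance. The function names are hypothetical, and the 0.3 values in the disjoint example are assumed (the reply only gives the 0.7 values).

```python
def overlap(u, v):
    """Shared attention mass: sum of element-wise minima over all events.

    Illustrative measure only; not necessarily the metric DeepCASE uses.
    """
    return sum(min(u.get(e, 0.0), v.get(e, 0.0)) for e in set(u) | set(v))


def l1_distance(u, v):
    """Sum of absolute per-event differences (illustrative assumption)."""
    return sum(abs(u.get(e, 0.0) - v.get(e, 0.0)) for e in set(u) | set(v))


# The two sequences from the reply, already aggregated per event type.
first = {"fin": 0.09, "pay": 0.91}
second = {"com": 0.09, "pay": 0.91}
print(overlap(first, second))      # about 0.91: large overlap on pay
print(l1_distance(first, second))  # about 0.18: vectors are close

# Clusters that share no event types. The 0.3 values are assumed.
disjoint_a = {"fin": 0.3, "pay": 0.7}
disjoint_b = {"soc": 0.3, "com": 0.7}
print(overlap(disjoint_a, disjoint_b))      # 0.0: no shared events
print(l1_distance(disjoint_a, disjoint_b))  # about 2.0: maximally far apart
```

Because every event type is its own dimension, two phishing subtypes contribute nothing to each other's overlap, however similar their names are.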
I recently had the chance to delve into your paper and replicate the findings using the data provided in DeepLog. I was intrigued by the process and have a couple of inquiries that I hope you could help me with:
1. Regarding the Context Builder stage, I wanted to clarify if providing only the timestamp, event, and machine for each warning is sufficient. Are additional details like the source IP, destination IP, etc., unnecessary for this stage?
2. Concerning the clustering process, I'm curious if the clustering is primarily based on the attention vector derived from each event. If that's the case, do the events within the same cluster exhibit similarities akin to BitCoinMiner? Or, are there instances where some events are somewhat similar to BitCoinMiner?
Your insights on these points would be immensely valuable. Thank you very much for your time and consideration.