Why go through "all malicious labels + one possible subject"?

purseclab / ATLAS

ATLAS: A Sequence-based Learning Approach for Attack Investigation

Apache License 2.0

Hi Lesley,

In suggest_ground_truth() I aim to extract all unique abstracted (i.e., tokenized) sequences, malicious or not. So result_list contains both the unique malicious and the unique non-malicious sequences, which is why it is not sufficient to extract sequences only from the malicious entities. As you can see in lines 394-399, before we label a sequence as malicious, we check whether its tokenized form matches a malicious sequence: if it does not, we label it non-malicious ("0"); otherwise we label it malicious ("1"). Thanks.
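
To illustrate the idea, here is a minimal sketch of the deduplicate-then-label step described above. It is not the actual ATLAS code: the function name label_unique_sequences, its parameters, and the example tokens are hypothetical, and it assumes each tokenized sequence is a list of strings.

```python
def label_unique_sequences(all_sequences, malicious_sequences):
    """Return each unique tokenized sequence once, paired with a "1"/"0" label.

    Hypothetical helper for illustration only; not part of ATLAS.
    """
    # Tuples are hashable, so they allow fast membership checks.
    malicious_set = {tuple(seq) for seq in malicious_sequences}

    seen = set()
    result_list = []
    for seq in all_sequences:
        key = tuple(seq)
        if key in seen:
            continue  # keep only the first occurrence of each sequence
        seen.add(key)
        # "1" if the tokenized sequence matches a malicious one, else "0"
        label = "1" if key in malicious_set else "0"
        result_list.append((list(seq), label))
    return result_list


# Made-up tokens: sequences come from all candidate entities,
# not only from the malicious ones.
all_sequences = [
    ["process", "write", "file"],
    ["process", "read", "file"],
    ["process", "write", "file"],  # duplicate, dropped
    ["process", "connect", "socket"],
]
malicious_sequences = [["process", "write", "file"]]

result_list = label_unique_sequences(all_sequences, malicious_sequences)
```

Iterating over every candidate sequence, rather than only those built from malicious entities, is what lets the non-malicious ("0") sequences end up in the result alongside the malicious ones.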

Why go through "all malicious labels + one possible subject"? #10