EvoTestOps / LogLead

LogLead stands for Log Loader, Enhancer, and Anomaly Detector.
MIT License

Lookahead pairs #9

Open mmantyla opened 7 months ago

mmantyla commented 7 months ago

Inoue H, Somayaji A. Lookahead pairs and full sequences: a tale of two anomaly detection methods. In Proceedings of the 2nd Annual Symposium on Information Assurance, 2007 Jun 6 (pp. 9-19). https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=58e0f19f8662c5750a2c41ea94855987fdd387a0#page=18

Hubballi N. Pairgram: Modeling frequency information of lookahead pairs for system call based anomaly detection. In 2012 Fourth International Conference on Communication Systems and Networks (COMSNETS 2012), 2012 Jan 3 (pp. 1-10). IEEE. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=9caacf4865215c4031dc96e09ea4b7667784b0f1

Something like this, if added to the ngram class:

` # Look ahead / look behind
def create_lookab_model(self, train_data):
    # Requires: from collections import Counter
    pairs = list()
    for seq in train_data:
        p = self.slice_lookahead_pairs(seq)
        pairs.extend(p)
    self.n_gram_counter += Counter(pairs)
    # Want to use sets here, not just store the winner like the n-gram model

# Produce the lookahead pairs of a sequence. E.g. for [e1 e2 e3 e4 e5], e1 yields
# [e1 e2], [e1 e3], [e1 e4], [e1 e5]; each later element is paired the same way.
def slice_lookahead_pairs(self, seq):
    # Add SoS and EoS markers (for n-gram 3: SoS E1 E2 E3 EoS)
    seq = [self._start_] + seq + [self._end_]

    lookahead_pairs = []
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            pair = [seq[i], seq[j]]
            # Join the pair into a single space-separated string
            line = ' '.join(pair)
            # Store
            lookahead_pairs.append(line)
    return lookahead_pairs`
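
The remark about using sets could be sketched as a standalone example like the one below. It is only a rough illustration, not LogLead code: the function names `build_pair_set` and `unseen_pair_ratio`, the `<s>` / `</s>` boundary markers, and the scoring rule (share of pairs never seen in training) are all assumptions made for this sketch. Pairs are generated the same way as in `slice_lookahead_pairs` above, i.e. every element is paired with every later element.

```python
# Hypothetical boundary markers; the class above uses self._start_ / self._end_.
SOS, EOS = "<s>", "</s>"


def slice_lookahead_pairs(seq):
    """Return every ordered pair 'a b' where a occurs before b in the padded sequence."""
    padded = [SOS] + list(seq) + [EOS]
    return [
        f"{padded[i]} {padded[j]}"
        for i in range(len(padded))
        for j in range(i + 1, len(padded))
    ]


def build_pair_set(train_data):
    """Collect the set of all pairs observed in the training sequences."""
    seen = set()
    for seq in train_data:
        seen.update(slice_lookahead_pairs(seq))
    return seen


def unseen_pair_ratio(seq, seen):
    """Assumed scoring rule: fraction of the sequence's pairs absent from training."""
    pairs = slice_lookahead_pairs(seq)  # never empty, padding adds two markers
    unseen = sum(1 for p in pairs if p not in seen)
    return unseen / len(pairs)


train = [["e1", "e2", "e3"], ["e1", "e3", "e2"]]
seen = build_pair_set(train)

normal_score = unseen_pair_ratio(["e1", "e2", "e3"], seen)  # all pairs known
odd_score = unseen_pair_ratio(["e3", "e1", "e2"], seen)     # "e3 e1" is new
print(normal_score, odd_score)
```

A set only records whether a pair was ever observed, so the frequency information kept by the `Counter` version is lost; which of the two fits better probably depends on how noisy the training logs are.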