` # Lookahead / lookbehind pairs
# Requires: from collections import Counter
def create_lookab_model(self, train_data):
    pairs = list()
    for seq in train_data:
        p = self.slice_lookahead_pairs(seq)
        pairs.extend(p)
    self.n_gram_counter += Counter(pairs)
    # TODO: want to use sets here, not just store the winner as the n-gram model does

# Produce the required lookahead pairs. E.g. for the sequence [e1 e2 e3 e4 e5],
# e1 yields [e1 e2], [e1 e3], [e1 e4], [e1 e5]; each later element is paired the same way.
def slice_lookahead_pairs(self, seq):
    # Add SoS and EoS markers as for n-grams, e.g. SoS e1 e2 e3 EoS
    seq = [self._start_] + seq + [self._end_]
    lookahead_pairs = []
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            pair = [seq[i], seq[j]]
            # Convert into a line
            line = ' '.join(pair)
            # Store
            lookahead_pairs.append(line)
    return lookahead_pairs`
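A minimal standalone sketch of the set-based variant mentioned in the note, assuming pairs are kept as space-joined strings as in the snippet. The function names `build_pair_set` and `unseen_pair_ratio`, the marker tokens, and the idea of scoring a sequence by its share of unseen pairs are all hypothetical choices for illustration, not part of the existing n-gram class.

```python
from itertools import combinations

# Hypothetical marker tokens standing in for self._start_ / self._end_.
START, END = "<s>", "</s>"


def slice_lookahead_pairs(seq):
    """Return every pair (seq[i], seq[j]) with i < j as an 'a b' string."""
    padded = [START] + list(seq) + [END]
    # combinations() yields pairs in the same order as the nested i/j loops.
    return [f"{a} {b}" for a, b in combinations(padded, 2)]


def build_pair_set(train_data):
    """Collect the set of all lookahead pairs seen in the training sequences."""
    seen = set()
    for seq in train_data:
        seen.update(slice_lookahead_pairs(seq))
    return seen


def unseen_pair_ratio(seq, seen):
    """Fraction of a sequence's pairs that never occurred in training."""
    pairs = slice_lookahead_pairs(seq)
    return sum(p not in seen for p in pairs) / len(pairs)


train = [["e1", "e2", "e3"], ["e1", "e3", "e2"]]
seen = build_pair_set(train)
normal = unseen_pair_ratio(["e1", "e2", "e3"], seen)  # sequence from training
odd = unseen_pair_ratio(["e3", "e1", "e4"], seen)     # contains unseen event e4
```

A set keeps every observed pair regardless of frequency, so a rare but legitimate pair is never discarded; the trade-off is that the frequency information held by the `Counter` is lost.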
Inoue H, Somayaji A. Lookahead pairs and full sequences: a tale of two anomaly detection methods. In Proceedings of the 2nd Annual Symposium on Information Assurance, 2007 Jun 6 (pp. 9-19). https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=58e0f19f8662c5750a2c41ea94855987fdd387a0#page=18
Hubballi N. Pairgram: Modeling frequency information of lookahead pairs for system call based anomaly detection. In 2012 Fourth International Conference on Communication Systems and Networks (COMSNETS 2012), 2012 Jan 3 (pp. 1-10). IEEE. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=9caacf4865215c4031dc96e09ea4b7667784b0f1
Something like this, if implemented in the n-gram class:
` # Lookahead / lookbehind pairs
def create_lookab_model(self, train_data):
    pairs = list()
    for seq in train_data:
        p = self.slice_lookahead_pairs(seq)
        pairs.extend(p)
    self.n_gram_counter += Counter(pairs)
    # TODO: want to use sets here, not just store the winner as the n-gram model does