cisco / joy

A package for capturing and analyzing network flow data and intraflow data, for network research, forensics, and security monitoring.

bug with sleuth? #216

Open wuxb09 opened 5 years ago

wuxb09 commented 5 years ago

Hi,

I am facing an unknown issue with sleuth.

First, I use joy to generate the output file with "joy bidir=1 dist=1 classify=1 ../benign/*.pcap > ./benign_classify.gz", which contains 3929 lines.

Second, I run "sleuth benign_classify.gz --select "p_malware" --where "p_malware > 0.49" > temp.txt", and temp.txt has 57 lines. If I instead run "sleuth benign_classify.gz --select "p_malware" --where "p_malware < 0.5" > temp.txt", temp.txt has 514 lines. Together the two filters cover every possible value of p_malware (they even overlap between 0.49 and 0.5), yet 57 + 514 = 571 is far less than 3929. Does this mean sleuth has a bug?

By the way, I have written a very simple Python program to verify this, shown below, and its results match the 3929 lines.

Best wishes, Xiaoban

import gzip
import json
import operator
import sys

# Map each comparison mode accepted on the command line to its operator.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

if __name__ == "__main__":
    # Expected arguments: <joy output .gz> <mode> <threshold>
    if len(sys.argv) != 4 or sys.argv[2] not in COMPARATORS:
        print("Error using this program")
        sys.exit(1)
    json_file = sys.argv[1]
    compare = COMPARATORS[sys.argv[2]]
    target = float(sys.argv[3])
    count = 0
    # The joy output is gzip-compressed, with one JSON object per line.
    with gzip.open(json_file, "rt") as fp:
        for line in fp:
            tmp = json.loads(line)
            # Skip the record that carries the 'version' key.
            if "version" in tmp:
                continue
            if compare(float(tmp["p_malware"]), target):
                count += 1
    print(count)
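
A tally along the following lines might help narrow down where the counts diverge. It is only a sketch, not part of joy or sleuth. The function names `tally_p_malware` and `tally_file` and the bucket labels are made up for illustration. It assumes, as the script above does, that the file holds one JSON object per line and that records carrying a `version` key should be skipped. Each record goes into exactly one bucket, so the overlap between the two filters and any records without `p_malware` become visible.

```python
import gzip
import json
from collections import Counter


def tally_p_malware(lines, low=0.49, high=0.5):
    """Sort JSON records into disjoint buckets by their p_malware value.

    `low` and `high` mirror the two filters from the report:
    p_malware > low and p_malware < high.
    """
    buckets = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        if "version" in record:
            buckets["skipped"] += 1      # same records the script above skips
        elif "p_malware" not in record:
            buckets["missing"] += 1      # no classifier score at all
        else:
            p = float(record["p_malware"])
            if p <= low:
                buckets["below_only"] += 1   # matches only p_malware < high
            elif p < high:
                buckets["both"] += 1         # matches both filters
            else:
                buckets["above_only"] += 1   # matches only p_malware > low
    return buckets


def tally_file(path):
    """Run the tally over a gzip-compressed file such as benign_classify.gz."""
    with gzip.open(path, "rt") as fp:
        return tally_p_malware(fp)
```

If the two tools agree, `above_only + both` should equal the 57 lines from the first sleuth run and `below_only + both` should equal the 514 lines from the second. A large `missing` count would point at records that have no `p_malware` field.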
bhudson33 commented 5 years ago

Is this still an issue?