cisco / joy

A package for capturing and analyzing network flow data and intraflow data, for network research, forensics, and security monitoring.

Cannot run model.py #214

Closed wuxb09 closed 5 years ago

wuxb09 commented 5 years ago

Hi,

I am trying to run model.py to generate a new set of params, but I am encountering the following issue. I have double-checked that there is indeed a malware.gz under malware_train and a benign.gz under benign_train, both generated by Joy. The numbers of positive and negative samples are both reported as zero. Is there something wrong with how the generated files are opened?

Thanks very much for your help. Xiaoban

/joy/analysis$ python model.py -m -l -t -p ../benign_train/ -n ../malware_train/ -o params.txt
Num Positive: 0
Num Negative: 0

Features Used:
    Metadata (7)
    Packet Lengths (100)
    Packet Times (100)
Total Features: 207

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py:433: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
  FutureWarning)
Traceback (most recent call last):
  File "model.py", line 150, in <module>
    main()
  File "model.py", line 146, in main
    learn_param(data, labels, args.output)
  File "model.py", line 49, in learn_param
    logreg.train(data, labels)
  File "/home/acanets/joy/analysis/classifier.py", line 58, in train
    self.logreg.fit(data,labels)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py", line 1285, in fit
    accept_large_sparse=solver != 'liblinear')
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 756, in check_X_y
    estimator=estimator)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 552, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
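The ValueError above comes from passing an empty array to fit, which matches the zero counts printed earlier. A minimal sketch of a guard that could run before training to make this failure easier to read is given here; require_samples is a hypothetical helper, not part of Joy or scikit-learn, and it assumes data and labels are plain sequences with one entry per flow.

```python
def require_samples(data, labels):
    """Fail early with a readable message if the training set is unusable.

    Hypothetical helper for illustration; not part of Joy's model.py.
    """
    if len(data) == 0:
        # Matches the "Num Positive: 0 / Num Negative: 0" situation.
        raise ValueError(
            "No flows were loaded; check that the input files can be "
            "opened and parsed before training."
        )
    if len(data) != len(labels):
        raise ValueError(
            "Got %d samples but %d labels." % (len(data), len(labels))
        )
    if len(set(labels)) < 2:
        # A binary classifier needs examples from both classes.
        raise ValueError("Need both positive and negative samples.")
    return data, labels


if __name__ == "__main__":
    try:
        require_samples([], [])
    except ValueError as err:
        print("Refusing to train:", err)
```

Calling a check like this just before the fit step would report the empty input directly instead of the reshape hint from scikit-learn.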

wuxb09 commented 5 years ago

I think the file generated by the command "./bin/joy bidir=1 dist=1 ../malware/*.pcap > ../malware_train/malware.gz" is not actually a gzip-compressed file, so reading it in "data_parser.py" fails, even though the "try" block there does not report any error.

After I changed "with gzip.open(json_file,'r') as fp:" to "with open(json_file,'r') as fp:", it works fine.
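Instead of hard-coding one of the two open calls, a reader could check whether the file really is gzip data and choose accordingly. The sketch below is written for Python 3 and is not the code in data_parser.py. The helper names open_text_maybe_gzip and read_json_lines are hypothetical, and it assumes the Joy output holds one JSON object per line.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of any gzip stream


def open_text_maybe_gzip(path):
    """Open path for text reading, using gzip only if the content is gzip."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_json_lines(path):
    """Return the parsed JSON objects found on the non-empty lines of path."""
    records = []
    with open_text_maybe_gzip(path) as fp:
        for line in fp:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

This decides by content rather than by the ".gz" extension, so a plain-text file that was merely named malware.gz would still be read.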

bhudson33 commented 5 years ago

If this is being done with the latest code in the repo, make sure to add the "--enable-gzip" flag when issuing the configure command so that gzip is turned on:

./configure --enable-gzip