honeynet / cuckooml

CuckooML: Machine Learning for Cuckoo Sandbox
https://honeynet.github.io/cuckooml/

Make CuckooML plotting dependent on library imports #15

Open So-Cool opened 7 years ago

So-Cool commented 7 years ago

In the try: import... block, create a global variable that records whether all the libraries necessary for plotting were imported, and make CuckooML plotting conditional on it. The result: no need to install plotting packages if you're only interested in malware analysis with textual output.
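A minimal sketch of the idea above, not CuckooML's actual code: the helper name plotting_available and the flag name PLOTTING are hypothetical, and matplotlib is used as the default only because it is the plotting library named later in this thread.

```python
import importlib
import warnings


def plotting_available(modules=("matplotlib",)):
    """Return True only if every optional plotting module can be imported.

    The module list is an assumption made for illustration.
    """
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            # Warn instead of failing, so text-only analysis keeps working.
            warnings.warn("Plotting disabled: cannot import %s" % name)
            return False
    return True


# Module-level flag that plotting code can check before drawing anything.
PLOTTING = plotting_available()
```

Plotting code would then test PLOTTING before touching any plotting library, so a missing package disables figures rather than the whole module.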

greninja commented 7 years ago

Can we add a raw_input() prompt asking whether the user wants plotting, and import the required libraries only on a 'yes'?

So-Cool commented 7 years ago

Not really: we want to analyse large malware datasets automatically, so an interactive prompt could be inconvenient. A better approach would be a single plotting switch in conf/cuckooml.conf, while still checking the imports in case someone runs an analysis having forgotten to install the plotting packages. Without that check all the computation time is wasted, because the code crashes while attempting to plot something before reporting any useful results.
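The combination of a configuration switch and an import check could look roughly like the sketch below. It uses Python 3's standard configparser rather than Cuckoo's own Config class, and the section name, option name, sample file contents and function names are all assumptions made for illustration.

```python
import configparser

# Hypothetical contents of conf/cuckooml.conf; the real file may differ.
SAMPLE_CONF = """
[cuckooml]
plotting = yes
"""


def plotting_requested(conf_text):
    """Read the plotting switch, defaulting to False when it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean("cuckooml", "plotting", fallback=False)


def effective_plotting(conf_text, libraries_available):
    """Plot only if the user asked for it AND the libraries imported."""
    return plotting_requested(conf_text) and libraries_available
```

Keeping the two conditions separate means a forgotten installation downgrades the run to text output instead of crashing it at the plotting step.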

greninja commented 7 years ago

The check for the plotting library imports is already taken care of by the try/except block.

As for the plotting switch, don't the "figures" argument of detect_abnormal_behavior() and the "plot" argument of clustering_label_distribution() do exactly that?

So-Cool commented 7 years ago

At the moment all the libraries are imported in one block. For instance, pandas is necessary for the module to work at all, but matplotlib is only needed if you want to plot something. Separating the imports into blocks, each responsible for a particular piece of CuckooML functionality, is probably what we want to do.

So-Cool commented 7 years ago

@greninja, this is a good beginning, but there are a couple of issues with your contribution.

First of all, the plotting variable is missing from conf/cuckooml.conf.

In detect_abnormal_behaviour, figures is set to True by default, and in clustering_label_distribution, plot is set to False by default.
Both functions need a safety check for plotting. If somebody sets either of these variables to True but Config("cuckooml").cuckooml.plotting is set to False, then the function should overwrite its plotting variable to False and possibly print some sort of warning.
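The safety check described above could be factored into one small helper, sketched here under assumptions: resolve_plot_flag is a hypothetical name, and plotting_enabled stands in for the value of Config("cuckooml").cuckooml.plotting.

```python
import warnings


def resolve_plot_flag(requested, plotting_enabled, name="plot"):
    """Return the flag a function should actually use for plotting.

    requested        -- the caller's figures/plot argument
    plotting_enabled -- the global switch (assumed to come from the config)
    name             -- argument name, used only in the warning text
    """
    if requested and not plotting_enabled:
        # Override rather than abort, so the analysis still completes.
        warnings.warn(
            "%s=True ignored: plotting is disabled in the configuration" % name
        )
        return False
    return bool(requested)
```

Inside detect_abnormal_behaviour this would be a single line such as figures = resolve_plot_flag(figures, plotting_enabled, "figures"), and likewise for plot in clustering_label_distribution.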

Are you willing to fix these?

greninja commented 7 years ago

Hey @So-Cool,

I have actually added the plotting variable, in commit 80148b4.

For the other issue:

Absolutely correct. So if a user sets Config("cuckooml").cuckooml.plotting to False and either of the variables (figures and plot) to True, the libraries won't be imported and the plotting can't be done. My question is: when this happens, is it a good idea to terminate the program with a warning, or would it be more appropriate to import the modules there, inside the function?

So-Cool commented 7 years ago

Sorry @greninja, I've missed that commit.

Terminating is not a particularly good idea; I guess people would be annoyed if it takes a lot of time to crunch the data and then they are left with nothing because they forgot to install the plotting libraries.
On the other hand, if just a warning is printed, the computation will finish and they can produce the plots later from the classification outcome that has been saved to a file. Therefore, in such a case I would opt for overwriting these variables to False and printing a warning message.

greninja commented 7 years ago

I have made the changes you suggested.

However, I made a mistake while pushing the commits, so I had to close the PR and open it again. I am really sorry.

Also, when running cuckooml.py I get 'ImportError: No module named lib.cuckoo.common.config'. How do I fix it?

So-Cool commented 7 years ago

I haven't come across the lib.cuckoo.common.config ImportError, @greninja. How do you run it? What's your PYTHONPATH?

greninja commented 7 years ago

I run it normally: python cuckooml.py. I did add the project's path to the .bashrc file, like: export PYTHONPATH="$PYTHONPATH:/home/shadab/cuckooml/", but it doesn't seem to work.

So-Cool commented 7 years ago

Alright, you shouldn't run cuckooml.py directly from the modules/processing directory. The correct way is to be in the cuckooml root directory and, in the Python interpreter, do import modules.processing.cuckooml.
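Some background that may explain the advice: when Python runs a script, the first entry of sys.path is the script's own directory, whereas an interpreter started in the root directory searches the current directory first. The sketch below is a generic illustration of that effect, not a diagnosis of the setup above; it builds a throwaway package tree (cuckooml_demo_lib is a made-up name) and shows that a dotted import only resolves once the tree's root is on sys.path.

```python
import importlib
import os
import sys
import tempfile


def build_fake_root():
    """Create <root>/cuckooml_demo_lib/common/config.py in a temp directory."""
    root = tempfile.mkdtemp()
    top = os.path.join(root, "cuckooml_demo_lib")
    package = os.path.join(top, "common")
    os.makedirs(package)
    for folder in (top, package):
        # Empty __init__.py files mark the folders as packages.
        open(os.path.join(folder, "__init__.py"), "w").close()
    with open(os.path.join(package, "config.py"), "w") as handle:
        handle.write("VALUE = 'loaded'\n")
    return root


def can_import(name):
    """Report whether a dotted module name is importable right now."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


root = build_fake_root()
before = can_import("cuckooml_demo_lib.common.config")  # root not searched yet
sys.path.insert(0, root)  # comparable to starting the interpreter in root
importlib.invalidate_caches()
after = can_import("cuckooml_demo_lib.common.config")
print(before, after)
```

Printing sys.path at the top of the failing script is a quick way to check whether the project root is actually being searched.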