jaketae / deep-malware-detection

A neural approach to malware detection in portable executables
MIT License

Further analysis #11

Open AidenRheem opened 9 months ago

AidenRheem commented 9 months ago

Hello, I have gone through the steps of training the model with the benign and malware datasets. What would be the further steps I could take to analyze this data into a graph or a chart? I saw some images and files of code that presumably can do this, but the ReadMe file does not have any information on how to do so.

jaketae commented 9 months ago

Hi @AidenRheem, thank you for opening this issue.

It has been a while since I've worked on this project, and unfortunately I do not have access to the server that I used to run the experiments and scripts.

If I recall correctly, I generated some of the figures in the repository in a Jupyter notebook, mostly using the visualization code in utils.py. It appears that I never committed the notebook, however (likely because it was messy and unorganized).

I know this isn't too helpful, but let me know if you have any other questions. Thanks!

AidenRheem commented 9 months ago

@jaketae Thank you for the reply. After running the code in train.py, is running python utils.py in the terminal all I need to do to generate such figures? Specifically, I want to plot a ROC curve.

jaketae commented 9 months ago

Hi @AidenRheem, apologies for the late reply.

To plot the ROC curve, you would have to import the plot_roc_curve function in utils.py, so some code writing is required (you can't just run python utils.py).

The function signature is

def plot_roc_curve(models, test_loader, save_title, device):

so you would have to supply the right arguments to the function. The main ones are the trained model (nn.Module) and the test data loader. save_title can be whatever you'd like, e.g., "roc.png", and device would either be "cpu" or "cuda".

Code-with-u-know-who commented 5 months ago

@jaketae Hi. I would like to know whether the code is working right now. Also, instead of running it locally, would you suggest any platform to execute it elsewhere?

jaketae commented 5 months ago

Hi @Code-with-u-know-who, I last tested my code early last year, and there is a chance some of it may not work. Specifically, the scrapers are likely broken, as websites tend to change every now and then. However, the PyTorch modeling and training code should work, as long as you have the right dataset. Another user reported in #12 that the code was working as of last month and that they were able to train a model successfully.

I was developing and training the models on an Ubuntu machine, so I'd recommend that (or something like Colab if you don't have a server).