jaketae / deep-malware-detection

A neural approach to malware detection in portable executables
MIT License
75 stars 16 forks

Testing this model? #12

Open RohithYogi opened 7 months ago

RohithYogi commented 7 months ago

How do I test the model by loading one of the saved checkpoints? For example, if I have a PE file, can I feed it to the model and get a result (true/false)?

RohithYogi commented 7 months ago
python -m src.bin.malshare (not working)
python train.py --benign_dir="../../raw/dll/" --malware_dir="../../raw/dasmalwerk/" (fails with an [unpickling error](https://github.com/jaketae/deep-malware-detection/issues/2))

python -m src.bin.dll
python extract_header.py --input_dir="../../raw/dll/" --output_dir="../../data/dll/"

python -m src.bin.dasmalwerk
python extract_header.py --input_dir="../../raw/dasmalwerk/" --output_dir="../../data/dasmalwerk/"

python train.py --benign_dir="/workspaces/deep-malware-detection/data/dll/" --malware_dir="/workspaces/deep-malware-detection/data/dasmalwerk/"

After resolving these issues I am now able to train the model, but that's as far as I get. If I give it a new sample, malware or benign, it should predict the correct label (a boolean).

How can I test this?

jaketae commented 6 months ago

Hi, I think what you're looking for is an evaluation/inference pipeline. I don't have a script specifically for that purpose, but you should be able to write one pretty quickly by importing these functions:

https://github.com/jaketae/deep-malware-detection/blob/8c45fc0e967f60de4caa40b8dc144528c0eefb01/src/deep_malware_detection/utils.py#L76-L100

Specifically, you can load your model and create a data loader for your dataset to measure model accuracy.
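
As a rough, framework-agnostic sketch of what measuring accuracy could look like: `score_fn` and `batches` below are hypothetical stand-ins (not names from this repo) for the loaded model and the data loader, and the sketch assumes the model returns one raw score per sample that is thresholded at zero.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical batch shape: (inputs, integer labels in {0, 1}).
Batch = Tuple[Sequence[float], Sequence[int]]


def accuracy(score_fn: Callable[[float], float], batches: Iterable[Batch]) -> float:
    """Return the fraction of samples whose thresholded score matches the label.

    Assumption: score_fn returns a raw (pre-sigmoid) score, and a negative
    score means label 0 while anything else means label 1.
    """
    correct = 0
    total = 0
    for inputs, labels in batches:
        for x, y in zip(inputs, labels):
            predicted = 0 if score_fn(x) < 0 else 1
            correct += int(predicted == y)
            total += 1
    if total == 0:
        raise ValueError("no samples to evaluate")
    return correct / total


if __name__ == "__main__":
    # Toy stand-ins: the "model" is the identity and the inputs are already scores.
    toy_batches = [([-1.0, 2.0], [0, 1]), ([3.0, -0.5], [0, 0])]
    print(accuracy(lambda x: x, toy_batches))
```

With the real model you would replace `score_fn` with a forward pass and `batches` with whatever the data loader yields.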

To get a prediction for a single file, you can take a look at the predict function to see how you can get a boolean label given an input. Basically, if the model's raw output is negative, the predicted label is 0; otherwise, it is 1. If you prefer a probabilistic interpretation, you can also apply a sigmoid to the output.
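
To make the sign rule and the sigmoid concrete, here is a minimal plain-Python sketch. The function names are hypothetical (they are not the repo's `predict`), and it assumes the model emits a single raw score per file.

```python
import math


def logit_to_label(logit: float) -> int:
    """Negative raw output maps to 0; zero or positive maps to 1."""
    return 0 if logit < 0 else 1


def logit_to_probability(logit: float) -> float:
    """Sigmoid of the raw output, written to avoid overflow for large |logit|."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


if __name__ == "__main__":
    for raw in (-2.3, 0.0, 1.7):
        print(raw, logit_to_label(raw), round(logit_to_probability(raw), 3))
```

Thresholding the raw output at 0 is the same as thresholding the sigmoid probability at 0.5, so the two views always agree on the label. In PyTorch the sigmoid step would typically be `torch.sigmoid` applied to the output tensor.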

Hope this helps!