arnaudstiegler opened this issue 4 years ago
Just to make sure the aim is clear: the main point of doing this is to see whether our models can do better than a regex search. If they can't, either the approach or the data (or both) is wrong.
Besides, it raises an interesting question: in which cases does a regex search fail? (those are exactly the cases we are aiming at)
Will do - it probably won't be as "quick" as one might think, given the need to get around their APIs with our wonderful file structure etc. haha, but worth the time!
Wind River's crypto detector is not packaged in any way, so the only way to add it to the repo is to copy-paste the whole thing into our repo... (and I would like to have it in the repo, for the sake of reproducibility at least)
Should I add a `benchmark` folder and put it in there? I would also add a disclaimer in the README that it is mostly not our code - I will have to make a couple of changes to simplify the API so it does only what we need. The tool is under the Apache License 2.0, so I think we can basically do whatever we want with it.
Would love your opinion on this @arnaudstiegler
We decided to run Wind-River as is and keep only the information extracted from its outputs in the repo, rather than tweak their code or incorporate any of it here.
Would it be easy to reuse your code on another set of data (I'm thinking the full wolfssl package)?
Yes :) You would need to run Wind-River on it separately, add the output to the folder in `models`, change the `sources` array to include it as a new source, and from line 95, read the JSON generated from wolfssl instead of `full_data.json`.
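For concreteness, here is a minimal sketch of those changes (the `sources` contents, the output path, and the wolfssl filename below are assumptions for illustration, not the actual names used in the notebook):

```python
import json

# Hypothetical source list: add wolfssl as a new source alongside the existing ones
sources = ["crypto-library", "crypto-competitions", "code-jam", "others", "wolfssl"]

# Around line 95, instead of reading full_data.json, load the JSON produced
# by running Wind-River's crypto-detector on the wolfssl package
# (the filename below is assumed; use whatever the run actually produced)
with open("models/benchmark/wolfssl_data.json") as f:
    data = json.load(f)

# The rest of the notebook (comparing hits against labels, computing
# false positives / false negatives) can then run unchanged on `data`
```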
You'll find the findings and an exploratory analysis of the benchmark's outputs in `models/benchmark/explore_benchmark_results.ipynb`.
Some key findings:
- there are nearly twice as many false positives as false negatives
- false positives include mislabeled files from `others`
- false positives include short headers from `others` that don't implement anything
- false positives include headers from `others` that only declare variables later used in some kind of cryptographic protocol
- false positives include key-authentication programs from `others`
- false positives include OS code from `others`
- false negatives include files from `crypto-competitions` that implement bitwise shifts and operations used for cryptographic purposes
- false negatives include files from `crypto-library` that contain nothing but lists of digits (not even hexadecimal) - essentially headers
- false negatives include files from `crypto-library` that contain algorithms for cryptographic operations rooted in mathematical structures - taken on their own, the functions and operations defined in those files have no reason to be called crypto
- a lot of the matching is done on very generic terms like `crypt` or `cipher` (see the sketch after this list)
- only two `code-jam` files were misclassified, both because of a treacherous variable name
- `crypto-library` files were matched primarily on known crypto-library patterns, plus some protocols and algorithms
- `others` files were matched mostly on generic strings, but also a lot on OpenSSL
- `crypto-competitions` files were overwhelmingly matched on generic clues, then on a variety of algorithms (hardly any protocols)
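To make the point about generic terms concrete, here is a tiny illustration (not the Wind River tool's actual patterns, just a sketch of why substring-level matches on `crypt`/`cipher` over- and under-match):

```python
import re

# Deliberately generic patterns, similar in spirit to the matches seen in the benchmark
generic = re.compile(r"crypt|cipher", re.IGNORECASE)

samples = [
    "void aes_encrypt_block(uint8_t *out, const uint8_t *in);",          # genuine crypto code
    "// a cryptic comment about the Cipher boss in a text-adventure game",  # matches: false positive
    "y = (x << 3) ^ rotl(y, 7);  /* crypto-style bit twiddling */",       # no match: false negative
]

for s in samples:
    print(bool(generic.search(s)), "|", s)
```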
We have our first results with our models, and as we are investigating what they actually learned, having a regex benchmark would be a good way of assessing whether those models actually improve on the performance you could get from a regex.
If someone has the time to run a quick experiment to see what performance we get from WindRiver, that would be awesome!