Cisco-Talos / binary_function_similarity

Binary Function Similarity

This repository contains the code, the dataset and additional technical information for our USENIX Security '22 paper:

Andrea Marcelli, Mariano Graziano, Xabier Ugarte-Pedrero, Yanick Fratantonio, Mohamad Mansouri, Davide Balzarotti. How Machine Learning Is Solving the Binary Function Similarity Problem. USENIX Security '22.

The paper is available at this link.

Additional technical information

The technical report, with additional information on the dataset and the selected approaches, is available at this link.

Artifacts

The repository is structured in the following way:

What to do next?

The following is a list of the main steps to follow based on the most common use cases:

How to cite our work

Please use the following BibTeX:

@inproceedings {280046,
author = {Andrea Marcelli and Mariano Graziano and Xabier Ugarte-Pedrero and Yanick Fratantonio and Mohamad Mansouri and Davide Balzarotti},
title = {How Machine Learning Is Solving the Binary Function Similarity Problem},
booktitle = {31st USENIX Security Symposium (USENIX Security 22)},
year = {2022},
isbn = {978-1-939133-31-1},
address = {Boston, MA},
pages = {2099--2116},
url = {https://www.usenix.org/conference/usenixsecurity22/presentation/marcelli},
publisher = {USENIX Association},
month = aug,
}

Errata

Our corrections to the published paper:

License

The code in this repository is licensed under the MIT License; however, some models and scripts depend on or pull in code that is covered by different licenses.

Bugs and feedback

For help or to report a bug, please open a GitHub issue.