Vulnerability locations index

Hello, my name is Ben Steenhoek and I am a PhD student at Iowa State University studying deep learning-based vulnerability detection. Thank you for making this dataset available and easy to use.

I want to use your corpus of DARPA Cyber Grand Challenge programs to train a neural network model to detect buggy code, such as null-pointer dereferences or buffer overflows. To do this, I provide the model with the source code of the program and the location of the vulnerability. For example, if the vulnerability causes a crash, such as a segmentation fault from a null-pointer dereference, I mark the statement that triggers the crash. Because I need to collect a large dataset of vulnerable programs, I can only use vulnerability locations that are available in a machine-readable format such as XML or CSV.
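To make the request concrete, here is a minimal sketch of the kind of index I have in mind and how I would read it. Everything in it is an assumption for illustration: the column names (`challenge`, `file`, `line`, `bug_type`), the challenge names, the paths, and the line numbers are hypothetical and do not come from this repo.

```python
import csv
import io
from collections import defaultdict

# Hypothetical index contents. The challenge names, file paths, and line
# numbers below are invented for illustration only.
SAMPLE_INDEX = """\
challenge,file,line,bug_type
Example_Challenge_A,src/service.c,120,null-pointer dereference
Example_Challenge_A,src/parser.c,48,buffer overflow
Example_Challenge_B,src/main.c,77,buffer overflow
"""


def load_index(text):
    """Group (file, line, bug_type) records by challenge name."""
    locations = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        locations[row["challenge"]].append(
            (row["file"], int(row["line"]), row["bug_type"])
        )
    return dict(locations)


index = load_index(SAMPLE_INDEX)
for challenge, bugs in index.items():
    for path, line, bug_type in bugs:
        print(f"{challenge}: {path}:{line} ({bug_type})")
```

One row per vulnerable statement would be enough for my purposes; any equivalent XML or JSON layout would work just as well.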

Since the Cyber Grand Challenge evaluated several systems, I would expect that there was some level of automated checking. However, I do not see a machine-readable index of vulnerability locations. This repo only includes a natural-language description of each vulnerability in README.md. How can I access a machine-readable index of the vulnerability locations? I would be grateful for your help in making use of this wonderful dataset.

trailofbits / cb-multios, issue #94