Automated Vulnerability Scoring and Categorisation Toolset for Vulnerability Reports.
Vulnerability severity scoring and categorisation using machine-learning tools. VulnerabilityClassifier is an open-source toolkit that employs machine-learning techniques to learn vulnerability labels assigned by NVD, vendors, cvedetails, and other repositories, in order to predict the labels for new vulnerability reports. Here, "labels" refers to CVSS-metric labels, threat types provided by cvedetails, weakness types provided by CWE, and attack types provided by CAPEC. The purpose is to support a higher level of automation in vulnerability assessment.
Datasets for training the CWE, CAPEC, CVSS, and threat classifiers are generated in a separate repository: NVD Data Feature Analysis.
The recommended environment is Python 3. The tutorials require Jupyter Notebook (available, for example, through Anaconda Navigator).
The goal is to automatically assign a severity score to any vulnerability instance that has a descriptive report, following the CVSS Version 3 standard. Two examples are shown below: the TestingSamples start with placeholder labels (CVSS score = 0 and all other values set to "l"), and the labels of the PredictedSamples are those predicted by the trained machine-learning models.
A severity computation pipeline that streamlines the process of machine-learning model training, testing, and validation is illustrated in the CVSS V3 Notebook, in a step-by-step manner.
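For orientation, the sketch below shows how a set of CVSS v3.1 base metric labels maps to a numeric base score and a severity rating, using the weights and round-up rule from the public CVSS v3.1 specification. It assumes the metric labels have already been predicted; whether this toolkit derives the score this way or predicts it directly is not stated here. The names `parse_vector`, `base_score`, and `severity` are illustrative and are not part of the toolkit.

```python
import math

# Metric weights from the CVSS v3.1 specification.
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": CIA,
    "I": CIA,
    "A": CIA,
}
# Privileges Required depends on whether Scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def parse_vector(vector):
    """Turn 'CVSS:3.1/AV:N/AC:L/...' into {'AV': 'N', 'AC': 'L', ...}."""
    parts = [p for p in vector.split("/") if not p.startswith("CVSS")]
    return dict(p.split(":") for p in parts)


def roundup(value):
    """Round up to one decimal place (CVSS v3.1 Appendix A)."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(m):
    """Compute the CVSS v3.1 base score from a dict of base metric labels."""
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    c, i, a = (WEIGHTS[k][m[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
        * pr * WEIGHTS["UI"][m["UI"]]
    )
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


def severity(score):
    """Map a base score to the CVSS v3.1 qualitative rating."""
    if score == 0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


metrics = parse_vector("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")
score = base_score(metrics)
print(score, severity(score))  # 9.8 Critical
```

Because the score is a deterministic function of the eight base metrics, one possible design is to train one classifier per metric and compute the score afterwards, so that every predicted score corresponds to a valid metric vector.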
Step 1: Clone the repo using the following command:
git clone https://github.com/Yuni0217/VulnerabilityClassifier.git
Step 2: Create a virtual environment.
Step 3: Install requirements using pip:
pip install -r requirements.txt
Step 4: Download datasets from NVD feeds:
python ./CVSSV3prediction/updateDB.py
Step 5: Train machine-learning models for different CVSS V3 mechanisms and store them:
python ./CVSSV3prediction/trainScoreCVSSV3.py
Step 6: Use the trained machine-learning models to predict CVSS V3 scores for any vulnerability document:
python ./CVSSV3prediction/predictScoreCVSSV3.py -p './CVSSV3prediction/testData' -s -v
Similarly, the vulnerability severity score under CVSS Version 2 can be predicted using a trained machine-learning model.
The model training, testing, and validation process is illustrated step by step in the CVSS V2 Notebook.
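As a point of reference, the following sketch computes a CVSS v2 base score from base metric labels, using the equations and weights from the public CVSS v2 specification. It is not taken from this repository; the function name `base_score_v2` is illustrative, and the README does not state whether the toolkit predicts individual metrics or the score itself.

```python
# Metric weights from the CVSS v2 specification.
CIA_V2 = {"N": 0.0, "P": 0.275, "C": 0.660}
V2_WEIGHTS = {
    "AV": {"L": 0.395, "A": 0.646, "N": 1.0},
    "AC": {"H": 0.35, "M": 0.61, "L": 0.71},
    "Au": {"M": 0.45, "S": 0.56, "N": 0.704},
    "C": CIA_V2,
    "I": CIA_V2,
    "A": CIA_V2,
}


def base_score_v2(vector):
    """Compute the CVSS v2 base score from a vector such as
    'AV:N/AC:L/Au:N/C:P/I:P/A:P'."""
    labels = dict(part.split(":") for part in vector.split("/"))
    w = {name: V2_WEIGHTS[name][label] for name, label in labels.items()}
    impact = 10.41 * (1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"]))
    if impact == 0:
        # f(Impact) is 0 when there is no impact, so the score is 0.
        return 0.0
    exploitability = 20 * w["AV"] * w["AC"] * w["Au"]
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * 1.176, 1)


print(base_score_v2("AV:N/AC:L/Au:N/C:P/I:P/A:P"))  # 7.5
```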
The threat categories that a vulnerability might be exposed to can be predicted using a trained machine-learning model. The accuracy is shown below (without any optimisation yet).
The model training, testing, and validation process is illustrated in the Threat Prediction Notebook.
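To make the idea of learning threat labels from report text concrete, here is a minimal sketch of a bag-of-words multinomial naive Bayes classifier written with the standard library only. It is an assumption-laden illustration, not the model used in the notebook: the class `NaiveBayes`, the helper `tokenize`, the training sentences, and the label names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep alphabetic runs as tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data; real training uses NVD descriptions.
texts = [
    "SQL injection in login form allows remote attackers to execute arbitrary SQL commands",
    "SQL injection via the id parameter allows attackers to read database contents",
    "Cross-site scripting in comment field allows remote attackers to inject arbitrary web script",
    "Stored cross-site scripting allows attackers to inject script via the profile page",
    "Buffer overflow in image parser allows attackers to cause a crash",
    "Heap buffer overflow in the font parser causes memory corruption",
]
labels = ["SQL Injection", "SQL Injection", "XSS", "XSS", "Overflow", "Overflow"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("SQL injection in the search parameter"))
```

A vulnerability can carry several threat types at once, so a fuller treatment would train one binary classifier per threat type rather than the single-label model sketched here.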
Before using the Threat Prediction Notebook tutorial, you can also synchronise the data with the latest vulnerability data feeds and create mappings between CVEs and the threat types in cvedetails with the following scripts:
python ./threatPrediction/updateDB.py
python ./threatPrediction/cveIDcrawler_in_cveDetails.py
python ./threatPrediction/generateThreatTrainingData.py
If you use this tool in your academic work, you can cite it using:
@article{jiang2022towards,
title={Towards automatic discovery and assessment of vulnerability severity in cyber--physical systems},
author={Jiang, Yuning and Atif, Yacine},
journal={Array},
volume={15},
pages={100209},
year={2022},
publisher={Elsevier}
}