TomasTomecek / cve-data

Research project to process CVE data

CVE Research

This is a research project on CVE fixes.

Files in this repo

In its current state, the project covers only Python projects and CVE fixes confined to a single code file. More complex fixes are skipped for now.

How does this work?

We clone high-profile Python projects and search their commit messages for CVE identifier strings.
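The search step can be illustrated with a minimal sketch. It is not the code used in this repo: the function name `find_cve_ids` and the regular expression are assumptions, based only on the public `CVE-<year>-<number>` identifier format, and the IDs in the usage note are made-up placeholders.

```python
import re

# Assumed pattern: "CVE-", a four-digit year, then a sequence number
# of at least four digits. Matching is case-insensitive because commit
# messages are not always consistent about capitalization.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def find_cve_ids(commit_message: str) -> list[str]:
    """Return distinct CVE IDs from a commit message, in order of first appearance.

    Hypothetical helper for illustration; IDs are normalized to upper case.
    """
    found = []
    for match in CVE_PATTERN.findall(commit_message):
        cve_id = match.upper()
        if cve_id not in found:
            found.append(cve_id)
    return found
```

For example, `find_cve_ids("Fixed cve-2019-0001 and CVE-2019-0001")` yields a single normalized entry. The commit messages themselves could come from something like `git log --format=%B` run inside a cloned repo.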

JSON structure (cve-data)

Upstream project metadata

Downstream project metadata

Not available yet.

JSON structure (django and cpython backport data)

Running the script with get-django-backports or get-cpython-backports produces JSON output describing how Django or CPython developers backported some of their fixes. Both projects are excellent sources for this data because they maintain multiple parallel streams. The data covers regular bug fixes as well as CVE backports.

Example run

You need to have python3 and git available on your system to run it.

$ ./script-upstream.py
[
  {
      "cve_id": "CVE-2023-...

This prints the JSON-formatted metadata described above. The script clones selected high-profile Python projects and analyzes them in a directory called "workspace", which it creates. Be aware that the git repos take up several gigabytes of disk space.
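Since the output is a JSON list of records, it can be consumed with the standard `json` module. The sketch below assumes only what the example run shows, namely a top-level list whose records carry a `cve_id` key. The helper name `cve_ids_from_output` is hypothetical, and no other fields are assumed.

```python
import json


def cve_ids_from_output(output: str) -> list[str]:
    """Return the sorted, de-duplicated CVE IDs found in the script's JSON output.

    Hypothetical helper: assumes a top-level JSON list of objects and
    skips any record that has no "cve_id" key.
    """
    records = json.loads(output)
    return sorted({record["cve_id"] for record in records if "cve_id" in record})
```

One way to use it is to redirect the script's output to a file, read the file into a string, and pass that string to the helper.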