Open pombredanne opened 3 years ago
Somehow I completely missed this. The data is collected as a set of breaking commits and fixing commits for each vuln from various sources that provide that information (Google, Red Hat, Debian, Ubuntu, etc.). Then for each vuln the first vulnerable version is determined, and through a series of git manipulations the breaking commits are translated for each stream that might be vulnerable, which provides the first vulnerable version for each stream. The same is done for the fixing commits, which gives you a range for each stream (i.e. 4.15.3 up to 4.15.92). There are various nuances that are slightly more complicated, for instance vulnerabilities that are caused by improper backporting where the mainline isn't vulnerable, or vulnerabilities that have different fixes in one stream than in another. But in general that's the process.
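For readers following along, here is a minimal sketch of the last step described above: turning "releases that contain the breaking commit" and "releases that contain the fixing commit" into one range per stream. It is not the project's code, which has not been published in this thread. The function names are hypothetical, the inputs are assumed to be plain release tags such as `v4.15.3` (no `-rc` tags), and the translation of commits across streams is assumed to have happened already.

```python
def parse_version(tag):
    """Turn a release tag such as 'v4.15.3' into a comparable tuple (4, 15, 3)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def stream_of(version):
    """Name the stream by its first two components, e.g. (4, 15, 3) -> '4.15'."""
    return ".".join(str(n) for n in version[:2])


def first_release_per_stream(tags):
    """Return {stream: earliest version} for the given release tags."""
    earliest = {}
    for tag in tags:
        version = parse_version(tag)
        stream = stream_of(version)
        if stream not in earliest or version < earliest[stream]:
            earliest[stream] = version
    return earliest


def affected_ranges(tags_with_break, tags_with_fix):
    """Map each vulnerable stream to (first vulnerable, first fixed).

    Assumption: both arguments list the release tags that contain the
    breaking or fixing commit for that stream. A stream with no fix yet
    gets None as its upper bound.
    """
    first_bad = first_release_per_stream(tags_with_break)
    first_fixed = first_release_per_stream(tags_with_fix)
    ranges = {}
    for stream, bad in first_bad.items():
        fixed = first_fixed.get(stream)
        ranges[stream] = (
            ".".join(map(str, bad)),
            ".".join(map(str, fixed)) if fixed else None,
        )
    return ranges


if __name__ == "__main__":
    # Illustrative tags only, reusing the 4.15.3 / 4.15.92 example above.
    print(affected_ranges(
        ["v4.15.3", "v4.15.4", "v4.16"],
        ["v4.15.92", "v4.15.93"],
    ))
```

Here the upper bound is the first release that carries the fix, so the stream is vulnerable from the first value up to, but not including, the second. The nuances mentioned above (bad backports, stream-specific fixes) would need per-stream commit lists rather than a single pair.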
As far as making the code public goes, that is on my to-do list for when the day job slows down a bit.
@nluedtke This seems awesome! Are you using and abusing any git bisect for this? And is your code in shell and Python?
@nluedtke gentle ping .... it has been a few years ;) any update?
https://github.com/nluedtke/linux_kernel_cves has a very nice set of correlated data where the upstream Linux kernel versions are handled, likely inferred from distro advisories.
@nluedtke I am curious about how you create the data in the first place? You wrote:
It would be awesome to see the code too.