openshiftio / openshift.io

Red Hat OpenShift.io is an end-to-end development environment for planning, building and deploying modern applications.
https://openshift.io

[3] Document steps how to map package identifiers to CPEs (used in NVD data) #1264

Closed jpopelka closed 6 years ago

jpopelka commented 6 years ago

Description

In #1052 we agreed that if we want to build an open data source of vulnerabilities, we should start by creating a mapping of package identifiers (i.e. the strings we use to identify a package) to CPEs (i.e. what NVD uses to identify a product), so that we can reliably search the NVD database for vulnerabilities in the packages we analyse.

Now we want to take the most viable idea and turn it into concrete steps. Once we have these steps, we will continue in #1260.

Acceptance criteria:

jpopelka commented 6 years ago

There are several possible sources for these mappings. A few can be taken from 'downstreams', some from existing vulnerability databases like VictimsDB, which licenses its data under CC-BY-SA. I'll leave these out for now and focus on getting the mappings from CVE texts. As I described in (C), a CVE usually contains a CPE and several references (URLs). If we follow the references, we can (with the help of vendor:product from the CPE) eventually find out which package_manager:package_name (if any) the CVE affects. Once we have the package_manager:package_name to vendor:product mapping, we can reliably search NVD data for vulnerabilities in the analysed package.
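
A minimal sketch of how such a mapping could be used, assuming the CPE is available either as a CPE 2.2 URI (`cpe:/a:vendor:product:...`) or as a CPE 2.3 formatted string (`cpe:2.3:a:vendor:product:...`). The function names, the `MAPPINGS` table, and its single entry are hypothetical placeholders, not real data, and the parser deliberately ignores escaped colons:

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping table: (vendor, product) -> (package_manager, package_name).
# The entry below is a made-up placeholder, not real mapping data.
MAPPINGS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("examplevendor", "exampleproduct"): ("pypi", "example-package"),
}


def vendor_product(cpe: str) -> Optional[Tuple[str, str]]:
    """Extract (vendor, product) from a CPE 2.2 URI or CPE 2.3 formatted string.

    Simplification: splits on ':' and does not handle escaped colons.
    """
    if cpe.startswith("cpe:2.3:"):
        parts = cpe.split(":")[2:]  # part, vendor, product, ...
    elif cpe.startswith("cpe:/"):
        parts = cpe[len("cpe:/"):].split(":")  # part, vendor, product, ...
    else:
        return None
    if len(parts) < 3:
        return None
    return parts[1], parts[2]


def package_for_cpe(
    cpe: str,
    mappings: Dict[Tuple[str, str], Tuple[str, str]] = MAPPINGS,
) -> Optional[Tuple[str, str]]:
    """Return (package_manager, package_name) for a CPE, or None if unmapped."""
    key = vendor_product(cpe)
    return mappings.get(key) if key else None
```

Keying the table on vendor:product only (without the version) matches the mapping described above; version matching would be a separate step.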

For some references it's easier to find the package_manager:package_name than for others. One usually first tries to find out what language the product is written in and then searches the related package manager portal (like https://www.npmjs.com for JavaScript). If the reference leads to a source code repository (like GitHub), it's usually quite easy to find out the language. The readme may also contain install instructions like [npm/pip] install <package>, in which case we're done.

Sometimes the references point to various mailing lists or bug/issue trackers. In these cases there is no clear rule for what to look for or how to decide whether the component is shipped by any package manager, so the only option is to search for the vendor:product on the internet.
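
The two easy cases above (a reference that points straight at a package portal, and a readme with an install command) could be checked with something like the following. This is only a heuristic sketch: the `PORTAL_HOSTS` table, the regular expression, and the function names are assumptions made for illustration, and a command such as `pip install -r requirements.txt` would yield a false positive.

```python
import re
from typing import List, Optional, Tuple
from urllib.parse import urlparse

# Assumed host -> package manager table; extend as needed.
PORTAL_HOSTS = {
    "www.npmjs.com": "npm",
    "npmjs.com": "npm",
    "pypi.org": "pypi",
    "pypi.python.org": "pypi",
    "www.nuget.org": "nuget",
    "mvnrepository.com": "maven",
}

# Matches "npm install <package>" or "pip install <package>",
# skipping option flags such as "--save" or "--upgrade".
INSTALL_RE = re.compile(
    r"\b(npm|pip)\s+install\s+(?:-\S+\s+)*([A-Za-z0-9@][A-Za-z0-9@/._-]*)"
)

# Install command -> package manager name used in the mapping.
COMMAND_TO_MANAGER = {"npm": "npm", "pip": "pypi"}


def ecosystem_from_url(url: str) -> Optional[str]:
    """Guess the package manager from the host of a CVE reference URL."""
    host = urlparse(url).netloc.lower()
    return PORTAL_HOSTS.get(host)


def packages_from_readme(text: str) -> List[Tuple[str, str]]:
    """Find (package_manager, package_name) candidates in readme text."""
    return [
        (COMMAND_TO_MANAGER[m.group(1)], m.group(2))
        for m in INSTALL_RE.finditer(text)
    ]
```

Anything these helpers return is only a candidate and would still need a manual check against the CVE's vendor:product.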

For manual 'scanning' of CVE texts we can use www.cvedetails.com, which has a nice UI. For (semi-)automatic scanning we'd probably use data from NVD.

A manual step-by-step procedure for finding a mapping from a CVE would be:
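
For the (semi-)automatic variant, one possible first pass is to group all reference URLs by vendor:product, so that each product only has to be investigated once. The sketch below assumes the NVD data has already been reduced to simple records with `cpes` (CPE 2.3 formatted strings) and `references` (URLs); that record shape and the function name are assumptions for illustration, not the actual NVD feed schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def references_by_product(records: Iterable[dict]) -> Dict[Tuple[str, str], Set[str]]:
    """Group reference URLs by (vendor, product) across simplified CVE records."""
    grouped: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for record in records:
        for cpe in record.get("cpes", []):
            fields = cpe.split(":")
            # CPE 2.3 formatted string: cpe:2.3:<part>:<vendor>:<product>:...
            # Simplification: escaped colons are not handled.
            if len(fields) < 5 or fields[:2] != ["cpe", "2.3"]:
                continue
            grouped[(fields[3], fields[4])].update(record.get("references", []))
    return dict(grouped)
```

The resulting URL sets can then be reviewed by hand or passed to further heuristics.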

Examples:

PyPI (Python)

npm (Node.js)

NuGet (.NET)

Maven (Java)

More Maven (Java)

Trying to find a package_manager:package_name from the CVE references can be quite difficult, and hence time-consuming, in some cases (like the Java examples, where it's hard to find out that the product is actually written in Java and then to find the correct components in mvnrepository). Also, in most cases the affected vendor:product isn't shipped by any package manager and therefore does not lead to any useful mapping.

msrb commented 6 years ago

Nice, thanks @jpopelka 😉