cisagov / crossfeed

External monitoring for organization assets
https://docs.crossfeed.cyber.dhs.gov
Creative Commons Zero v1.0 Universal

Normalize CPEs #420

Open epicfaace opened 3 years ago

epicfaace commented 3 years ago

CPEs from different scanners come in different forms. For example, Intrigue generates CPEs of the form "cpe:2.3:a:apache:http_server::"

Wappalyzer's CPEs look like this: "cpe:/a:apache:tomcat"

Besides the "cpe:/" vs. "cpe:2.3:" prefix difference, CPEs without version numbers sometimes have a trailing "::" and sometimes don't.

Normalizing them is important so that we don't generate duplicate services or vulnerabilities.
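
The trailing-separator half of the problem could be handled on its own, as in the minimal sketch below. `stripTrailingColons` is a hypothetical helper, not something in the Crossfeed codebase, and it assumes a trailing colon is never an escaped literal character.

```typescript
// Hypothetical helper: drop trailing empty components ("::") from a CPE string.
// Assumption: a colon at the end of the string is always a separator.
function stripTrailingColons(cpe: string): string {
  return cpe.replace(/:+$/, "");
}

const withColons = stripTrailingColons("cpe:2.3:a:apache:http_server::");
const withoutColons = stripTrailingColons("cpe:2.3:a:apache:http_server");

// Both spellings collapse to the same string, so they no longer look like
// two different services.
console.log(withColons === withoutColons); // true
```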

epicfaace commented 3 years ago

In #336, I use a basic regex replacement to change "cpe:2.3:" to "cpe:/" in CPEs generated from Intrigue Ident scans.
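
A prefix replacement of this kind might look roughly like the sketch below. The function name `toUriPrefix` is made up for illustration, and the actual code in #336 may differ.

```typescript
// Hypothetical helper: rewrite only the "cpe:2.3:" prefix to "cpe:/".
// The components after the prefix are left untouched.
function toUriPrefix(cpe: string): string {
  return cpe.replace(/^cpe:2\.3:/, "cpe:/");
}

console.log(toUriPrefix("cpe:2.3:a:apache:http_server::"));
// "cpe:/a:apache:http_server::"
```

Only the prefix changes here, so trailing "::" separators survive the replacement.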

epicfaace commented 3 years ago

Okay, perhaps part of the issue is that CPEs can be represented in different forms (see https://cpe.mitre.org/specification/#dictionary):

WFN bound to a URI:
cpe:/o:microsoft:windows_vista:6.0:sp1:~-~home_premium~-~x64~-
WFN bound to a formatted string:
cpe:2.3:o:microsoft:windows_vista:6.0:sp1:-:-:home_premium:-:x64:-
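
To compare the two bindings attribute by attribute, one option is to parse each into the same set of named fields. The sketch below is only an illustration under stated assumptions: `Wfn`, `toWfn`, `parseUri`, and `parseFormattedString` are hypothetical names, missing or empty components are treated as "*" (ANY), and neither percent-encoding in the URI form nor backslash escapes in the formatted-string form are decoded.

```typescript
type Wfn = { [attribute: string]: string };

// Attribute order of the formatted-string binding, after the "cpe:2.3:" prefix.
const FS_FIELDS = ["part", "vendor", "product", "version", "update", "edition",
  "language", "sw_edition", "target_sw", "target_hw", "other"];

// Build a Wfn from values given in FS_FIELDS order; missing or empty means "*".
function toWfn(values: (string | undefined)[]): Wfn {
  const wfn: Wfn = {};
  FS_FIELDS.forEach((name, i) => {
    wfn[name] = values[i] || "*";
  });
  return wfn;
}

// Assumption: no escaped colons ("\:") appear inside component values.
function parseFormattedString(cpe: string): Wfn {
  return toWfn(cpe.replace(/^cpe:2\.3:/, "").split(":"));
}

function parseUri(cpe: string): Wfn {
  const c = cpe.replace(/^cpe:\//, "").split(":");
  const edition = c[5] || "";
  // A packed edition starts with "~" and holds five "~"-separated values:
  // edition, sw_edition, target_sw, target_hw, other.
  const packed = edition.charAt(0) === "~" ? edition.slice(1).split("~") : [edition];
  return toWfn([c[0], c[1], c[2], c[3], c[4], packed[0], c[6],
    packed[1], packed[2], packed[3], packed[4]]);
}

const uri = parseUri("cpe:/o:microsoft:windows_vista:6.0:sp1:~-~home_premium~-~x64~-");
const fs = parseFormattedString(
  "cpe:2.3:o:microsoft:windows_vista:6.0:sp1:-:-:home_premium:-:x64:-");

// List the attributes on which the two parsed forms disagree.
console.log(FS_FIELDS.filter((name) => uri[name] !== fs[name])); // ["language"]
```

With these two example strings, the only mismatch is `language`: the URI as written has no language component, so this sketch reads it as "*", while the formatted string gives "-" explicitly.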