nyph-infosec / daggerboard

MIT License

Use LLM to determine whether a version is affected? #22

Closed wayward710 closed 11 months ago

wayward710 commented 12 months ago

I wondered whether there would be any merit in using a Large Language Model (LLM) to parse CVEs and determine which versions of software are affected. For example, the version information for the 2021 Log4j vulnerability (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44832) looks like this: "Apache Log4j2 versions 2.0-beta7 through 2.17.0 (excluding security fix releases 2.3.2 and 2.12.4) are vulnerable to a remote code execution (RCE) attack...." Free text like that might be hard to parse automatically, but I've experimented with ChatGPT and Google Vertex, and they seem to handle it pretty well. If there's any interest in pursuing this, I'd be happy to try to help. Thanks!
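To make the parsing target concrete, here is a minimal sketch of what the quoted Log4j sentence could look like once it has been turned into structured data, whether by an LLM or by hand. The names `version_key`, `AffectedRange`, and `LOG4J_2021_44832` are hypothetical and are not part of Daggerboard; the version ordering is a simplified assumption that only handles dotted numbers with an optional `-label` pre-release suffix.

```python
import re
from dataclasses import dataclass


def version_key(version: str) -> tuple:
    """Turn a version such as '2.0-beta7' into a sortable tuple.

    Simplified assumption: dotted integers, optionally followed by
    '-<label><number>'. Pre-releases sort before the final release.
    """
    release, _, pre = version.lower().partition("-")
    nums = tuple(int(part) for part in release.split("."))
    # Pad to three components so '2.0' and '2.0.0' compare equal.
    nums = nums + (0,) * (3 - len(nums))
    if not pre:
        return nums + (1, "", 0)  # 1 marks a final release
    match = re.fullmatch(r"([a-z]+)(\d*)", pre)
    if match is None:
        raise ValueError(f"unsupported pre-release suffix: {version!r}")
    label, number = match.group(1), int(match.group(2) or 0)
    return nums + (0, label, number)  # 0 marks a pre-release


@dataclass(frozen=True)
class AffectedRange:
    """Inclusive version range with explicitly excluded fix releases."""
    first: str
    last: str
    excluded: frozenset = frozenset()

    def contains(self, version: str) -> bool:
        key = version_key(version)
        if key in {version_key(v) for v in self.excluded}:
            return False
        return version_key(self.first) <= key <= version_key(self.last)


# The range exactly as stated in the quoted CVE description.
LOG4J_2021_44832 = AffectedRange(
    first="2.0-beta7",
    last="2.17.0",
    excluded=frozenset({"2.3.2", "2.12.4"}),
)

if __name__ == "__main__":
    for candidate in ("2.0-beta6", "2.14.1", "2.12.4", "2.17.1"):
        print(candidate, LOG4J_2021_44832.contains(candidate))
```

The hard part is getting from the sentence to the three fields (`first`, `last`, `excluded`); once those exist, the membership check itself is ordinary code.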

namtarb commented 11 months ago

Hi, @wayward710, thank you for your suggestion. We haven't considered this since many LLMs require a paid license and Daggerboard is an open-source project. Currently, our CVE feeds come from standardized sources, like the NIST NVD API. We are open to new ideas and collaboration and welcome any attempts to integrate LLM enrichment as an optional feature if you would like to contribute.
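One way an optional enrichment step could be shaped, so that no paid service is required by default, is to accept the LLM call as an injected function and do nothing when none is supplied. This is only a sketch under stated assumptions: `enrich_description`, `build_prompt`, `parse_reply`, and the JSON keys are hypothetical names, and nothing here reflects Daggerboard's actual code or any particular vendor's API.

```python
import json
from typing import Callable, Optional

# Hypothetical prompt; the JSON shape requested here is an assumption.
PROMPT_PREFIX = (
    "Extract the affected version range from this CVE description. "
    'Reply with JSON only, using the keys "first", "last", and "excluded" '
    "(a list of versions).\n\nDescription: "
)


def build_prompt(description: str) -> str:
    """Combine the fixed instructions with the CVE description text."""
    return PROMPT_PREFIX + description


def parse_reply(reply: str) -> dict:
    """Validate the model's reply; raise ValueError if it is not usable."""
    data = json.loads(reply)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    for key in ("first", "last", "excluded"):
        if key not in data:
            raise ValueError(f"missing key: {key}")
    if not isinstance(data["excluded"], list):
        raise ValueError("'excluded' must be a list")
    return data


def enrich_description(
    description: str,
    complete: Optional[Callable[[str], str]] = None,
) -> Optional[dict]:
    """Return extracted version data, or None when enrichment is off or fails.

    `complete` is whatever function sends a prompt to an LLM and returns
    its text reply; with the default of None the feature is disabled.
    """
    if complete is None:
        return None
    try:
        return parse_reply(complete(build_prompt(description)))
    except ValueError:
        return None  # never let a malformed reply break the main pipeline


if __name__ == "__main__":
    # Stand-in for a real LLM client, used only to exercise the code path.
    def fake_complete(prompt: str) -> str:
        return '{"first": "2.0-beta7", "last": "2.17.0", "excluded": ["2.3.2", "2.12.4"]}'

    text = "Apache Log4j2 versions 2.0-beta7 through 2.17.0 ..."
    print(enrich_description(text))                 # feature off
    print(enrich_description(text, fake_complete))  # feature on
```

Keeping the LLM behind a plain callable means the standardized feeds remain the source of truth and any backend, hosted or local, can be plugged in by a contributor without changing the rest of the pipeline.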