intel / cve-bin-tool

The CVE Binary Tool helps you determine if your system includes known vulnerabilities. You can scan binaries for over 200 common, vulnerable components (openssl, libpng, libxml2, expat and others), or if you know the components used, you can get a list of known vulnerabilities associated with an SBOM or a list of components and versions.
https://cve-bin-tool.readthedocs.io/en/latest/
GNU General Public License v3.0

Failing to download CVEs #1116

Closed Adley-Nastri closed 3 years ago

Adley-Nastri commented 3 years ago

When running cve-bin-tool, a SHAMismatch error occurs, causing the rest of the process to halt completely.

 INFO     cve_bin_tool.CVEDB - Downloading CVE data...                                           cvedb.py:199
[17:33:10] ERROR    cve_bin_tool.CVEDB - SHAMismatch:                                              error_handler.py:136
                    https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2017.json.gz (have:
                    9A41A405D604C16E8EE7B04D7DB9812E49F5A93167335F0A954654FF844BE8CC, want:
                    13EA5C65187399E1F91836316952F496F76108810BD475B4C47A0BF91D8C993B)
Downloading CVEs... ------------------- --------------------  48% 0:00:03
┌─────────────────────────────── Traceback (most recent call last) ────────────────────────────────┐
│ C:\Program Files\Python38\lib\site-packages\cve_bin_tool-2.1-py3.8.egg\cve_bin_tool\cvedb.py:149 │
│ in cache_update                                                                                  │
│                                                                                                  │
│   146 │   │   │   # exit(100)                                                                    │
│   147 │   │   │   os.unlink(filepath)                                                            │
│   148 │   │   │   with ErrorHandler(mode=self.error_mode, logger=self.LOGGER):                   │
│ > 149 │   │   │   │   raise SHAMismatch(f"{url} (have: {gotsha}, want: {sha})")                  │
│   150 │                                                                                          │
│   151 │   @staticmethod                                                                          │
│   152 │   async def get_curl_versions(session):                                                  │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
SHAMismatch: https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2017.json.gz (have:
9A41A405D604C16E8EE7B04D7DB9812E49F5A93167335F0A954654FF844BE8CC, want:
13EA5C65187399E1F91836316952F496F76108810BD475B4C47A0BF91D8C993B)
Downloading CVEs... ------------------- --------------------  48% 0:00:03

The screen stays like this and produces no further output.
The tool was working well last week. I have tried the tool on both Windows and Ubuntu under WSL.

imsahil007 commented 3 years ago

After a little research, I found out that something is wrong on the NVD side. Refreshing the meta information at one-second intervals gave me this output:

lastModifiedDate:2021-03-29T01:00:22-04:00
size:1087082
zipSize:76433
gzSize:76293
sha256:4951BD34DC65248100D328F5637E1B019152DBED1A2374BE37D5DA559CEF9267
lastModifiedDate:2021-03-26T10:00:19-04:00
size:1144916
zipSize:87642
gzSize:87502
sha256:C2312FCF11C5FBDB0E3D832590ABCD243627CA5BD8ED89CA102B9238D429F4BA

Clearly, the first one is right, and NVD is intermittently serving meta information for older data. This is the same reason we have been getting SHA mismatch warnings for quite some time.
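For anyone who wants to reproduce the check by hand, here is a minimal sketch of the comparison that produces the have/want values in the error. It is not the code from cvedb.py: the helper names `parse_meta` and `feed_matches_meta` are made up for illustration, and it assumes the sha256 field in the .meta file covers the decompressed JSON rather than the .gz archive.

```python
import gzip
import hashlib


def parse_meta(meta_text):
    """Parse the 'key:value' lines of a .meta file into a dict."""
    meta = {}
    for line in meta_text.splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def feed_matches_meta(gz_bytes, meta_text):
    """Return (ok, have, want) for a downloaded feed and its .meta text."""
    want = parse_meta(meta_text).get("sha256", "").upper()
    # Assumption: the published hash is of the decompressed JSON payload.
    have = hashlib.sha256(gzip.decompress(gz_bytes)).hexdigest().upper()
    return have == want, have, want
```

If the .meta file and the .json.gz file are fetched a moment apart while the server alternates between an old and a new copy, `have` and `want` will disagree even though the download itself is intact.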

param211 commented 3 years ago

Yes. The meta files and the actual data files on the data feed aren't in sync. @Adley-Nastri, until NVD fixes the issue, you can run the tool with the -u never flag to skip the database update.

keesj-exset commented 3 years ago

Isn't the tool also using outdated calls?

https://nvd.nist.gov/General/News/New-NVD-CVE-CPE-API-and-SOAP-Retirement

param211 commented 3 years ago

Whoa! An NVD API definitely changes things!

param211 commented 3 years ago

NVD data feed is back to normal. SHAMismatch errors aren't showing up now.

param211 commented 3 years ago

@terriko we should probably explore switching to the API. From here

The legacy SOAP services will be retired three months after the CVE and CPE APIs are released. The current retirement date is scheduled for June 1st, 2020.

I'm not sure if this covers the JSON files on the data feeds webpage (SOAP == XML?). But in the API docs they have already started referring to these feeds as legacy feeds:

Legacy Data Feeds
Historically the same information has been available programmatically via data feeds found at https://nvd.nist.gov/vuln/data-feeds. Consumers of these legacy feeds are encouraged to migrate away from the feeds and use the new services instead. Despite their popularity, the feeds required downloading and processing large files even when a subset of data would suffice. This was time consuming and a burden to networks. In contrast, the new services allow callers to specify query parameters to filter the response. For instance, you may be interested only in vulnerabilities for a certain time period, for specific products, and so on.

param211 commented 3 years ago

I explored the API a bit. The way it's structured, we could download all the CVEs on the first run of the tool and, on subsequent runs, fetch only the new or updated CVEs.

Retrieving All CVE
Presently NVD contains more than 120,000 vulnerabilities relating to thousands of vendor products. If your goal is to fetch all records, then multiple consecutive requests are required. For example, 120+ requests for 1,000 results per page. However, NIST firewall rules in place to prevent denial of service attacks on NVD can thwart your application. To avoid this, it is recommended that your application sleeps for several seconds between requests in order that your legitimate requests are not denied. In addition, applications are discouraged from repeatedly requesting all the records every day. Rather, download all of them initially, then use date range parameters to retrieve new and recently modified records since your last request. See CVE by Date Range.

This might help with the rate-limiting issue too. #1081
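A rough sketch of the paging pattern the docs describe, to make the idea concrete. Everything here is an assumption for illustration: `fetch_page` is a hypothetical callable standing in for the real HTTP request, the `totalResults` and `items` keys are placeholders for whatever the API actually returns, and the six-second delay is just a guess at "several seconds".

```python
import time


def fetch_all_cves(fetch_page, page_size=1000, delay=6.0, sleep=time.sleep, **filters):
    """Yield every CVE record page by page, pausing between requests.

    fetch_page(start_index, page_size, **filters) is a hypothetical callable
    returning a dict shaped like {"totalResults": int, "items": [...]}.
    """
    start = 0
    while True:
        page = fetch_page(start, page_size, **filters)
        items = page["items"]
        yield from items
        start += len(items)
        # Stop on an empty page or once every record has been seen.
        if not items or start >= page["totalResults"]:
            break
        # Pause between requests so the firewall does not reject us.
        sleep(delay)
```

On the first run the generator would be called with no filters to pull everything. On later runs the same loop could be given a date-range filter (whatever the API's real parameter names turn out to be) covering the time since the last update, so only new and modified records are fetched.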

terriko commented 3 years ago

Switching to the new setup sounds like a good plan. @param211 do you mind opening a separate GitHub issue with your research on that? Does this look like a few days of work or a whole GSoC project's worth of work to you?

param211 commented 3 years ago

@terriko sure, I'll open a new issue. At first glance, it looks like only some parts of cvedb.py will need to be rewritten, and the JSON structure is unchanged. I can probably get it done this week.

terriko commented 3 years ago

I think the issue with NVD that caused this bug report is fixed, and we're tracking the plan to use NVD's new API in #1125, so I'm going to go ahead and close this particular issue.