fkie-cad / nvd-json-data-feeds

Community reconstruction of the legacy JSON NVD Data Feeds. This project uses and redistributes data from the NVD API but is neither endorsed nor certified by the NVD.

Data does not validate anymore (2.1 vs 2.0) #15

Closed ostefano closed 7 months ago

ostefano commented 7 months ago

Something odd must have happened a few days ago: new records seem to have been updated to a new format.

See for example: https://github.com/fkie-cad/nvd-json-data-feeds/blob/main/CVE-2021/CVE-2021-30xx/CVE-2021-3007.json

There is now a new cveTags field, which is not part of the NVD 2.0 data schema.

Even more interestingly, if you now download the 2.0 NVD schema from https://csrc.nist.gov/schema/nvd/api/2.0/cve_api_json_2.0.schema, what you get back is actually version 2.1, cve_api_json_2.1.schema.json (for comparison, here is the old one: cve_api_json_2.0.schema.json).

Format 2.1 does indeed include a new field, but it is named cveTag.
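To make the naming mismatch concrete, here is a rough sketch. The property sets are simplified, hypothetical stand-ins rather than the real NVD schema files, and `unknown_fields` is a helper name invented for illustration; it only compares top-level field names.

```python
# Hypothetical, heavily simplified property lists; the real schemas contain far more.
SCHEMA_2_0_PROPS = {"id", "published", "lastModified", "descriptions", "references"}
# Per the observation above, 2.1 adds a field named "cveTag" (singular).
SCHEMA_2_1_PROPS = SCHEMA_2_0_PROPS | {"cveTag"}


def unknown_fields(record: dict, allowed: set) -> set:
    """Return the top-level keys of `record` that `allowed` does not list."""
    return set(record) - allowed


# Minimal stand-in for an affected record: only the keys matter here.
record = {"id": "CVE-2021-3007", "cveTags": []}

rejected_by_2_0 = unknown_fields(record, SCHEMA_2_0_PROPS)
rejected_by_2_1 = unknown_fields(record, SCHEMA_2_1_PROPS)
print(rejected_by_2_0, rejected_by_2_1)
```

Under these assumptions the plural `cveTags` key is unknown to both property lists, which is why such a record would fail a strict check against either version.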

In other words there are three issues:

rhelmke commented 7 months ago

Hey Stefano,

thanks for letting us know :-). Do you plan to contact the NVD regarding this issue?

We planned some things on our side that would also be affected by your observations on API response inconsistency:

rhelmke commented 7 months ago

Hope you don't mind that I pinned this issue

ostefano commented 7 months ago

Hi @rhelmke,

Not yet, especially because https://services.nvd.nist.gov/rest/json/cves/2.0?cveId=CVE-2021-3007 still returns 2.0 data (no cveTag(s)) and there is no API 2.1 endpoint. In other words I am not entirely sure how the problematic records percolated down and what APIs are affected (and if problematic records are still generated).

What APIs do your automations rely on?

rhelmke commented 7 months ago

Oh, I see. We use the same 2.0 API that you mentioned. So it appears that the endpoint emitted this additional data field at some point in time. Our automations do not update such a record because they fetch deltas by modification timestamp, and the affected CVEs evidently didn't receive a timestamp update, as only the API response got fixed.

Yeah this basically means that our planned weekly full-syncs with the NVD would fix these records.

Then there's probably also little use in contacting the NVD as they already noticed and fixed the issue. The invalid data in our cache is basically a relic of this fault 🤔
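A toy model of the situation described above, to show why a timestamp-based delta sync leaves the stale field in place while a full sync removes it. This is only a sketch: `delta_sync`, `full_sync`, the in-memory dicts, and the timestamps are all made up for illustration and are not the project's actual automation.

```python
def delta_sync(cache: dict, upstream: dict, last_sync: str) -> None:
    """Refresh only records modified after `last_sync` (ISO 8601 strings compare lexically)."""
    for cve_id, rec in upstream.items():
        if rec["lastModified"] > last_sync:
            cache[cve_id] = dict(rec)


def full_sync(cache: dict, upstream: dict) -> None:
    """Replace the whole cache with the current upstream state."""
    cache.clear()
    cache.update({cve_id: dict(rec) for cve_id, rec in upstream.items()})


# Hypothetical timestamps: the cached copy was fetched while the API emitted cveTags.
cache = {"CVE-2021-3007": {"lastModified": "2024-01-01T00:00:00", "cveTags": []}}
# Upstream was fixed, but lastModified did not change.
upstream = {"CVE-2021-3007": {"lastModified": "2024-01-01T00:00:00"}}

delta_sync(cache, upstream, last_sync="2024-01-02T00:00:00")
stale_after_delta = "cveTags" in cache["CVE-2021-3007"]

full_sync(cache, upstream)
stale_after_full = "cveTags" in cache["CVE-2021-3007"]

print(stale_after_delta, stale_after_full)
```

In this model the delta sync skips the record because its timestamp is older than the last sync, so only the full sync clears the leftover field.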

rhelmke commented 7 months ago

FYI: 3390fbf introduces schema validation on push. Example Run

ostefano commented 7 months ago

Nice, I can see that it catches the records with the rogue cveTags field
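For anyone who wants to spot such records locally, a minimal sketch follows. It is not the validation introduced in the commit above; it assumes each `CVE-*.json` file holds a single JSON object with the CVE fields at the top level, and `find_records_with_field` plus the demo files are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def find_records_with_field(root, field="cveTags"):
    """Return paths of CVE-*.json files under `root` whose top-level object has `field`."""
    hits = []
    for path in sorted(Path(root).rglob("CVE-*.json")):
        with path.open(encoding="utf-8") as fh:
            record = json.load(fh)
        if isinstance(record, dict) and field in record:
            hits.append(path)
    return hits


# Demo on a throwaway directory that mimics the repository layout.
with tempfile.TemporaryDirectory() as tmp:
    sub = Path(tmp) / "CVE-2021" / "CVE-2021-30xx"
    sub.mkdir(parents=True)
    # Stand-in for an affected record (only the keys matter).
    (sub / "CVE-2021-3007.json").write_text(
        json.dumps({"id": "CVE-2021-3007", "cveTags": []}), encoding="utf-8"
    )
    # Placeholder ID for a clean record; not a real CVE.
    (sub / "CVE-0000-0001.json").write_text(
        json.dumps({"id": "CVE-0000-0001"}), encoding="utf-8"
    )
    flagged = [p.name for p in find_records_with_field(tmp)]

print(flagged)
```

A plain key check like this only finds the one known rogue field; full schema validation, as done on push, would also catch other deviations.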

rhelmke commented 7 months ago

Next week I'll have more time to implement the required functionality that fixes these records. 😃

rhelmke commented 7 months ago

Fixed, see #16 :-)