CVEProject / cvelistV5

CVE cache of the official CVE List in CVE JSON 5 format

Published date changed on thousands of CVEs #41

Closed jayjacobs closed 2 weeks ago

jayjacobs commented 7 months ago

Plotting the published dates in the JSON 5.0 data shows two weird jumps.

image

mprpic commented 5 months ago

This is just a guess, but Oct 3, 2022 is the date when CVE Services 2.1 was released and upgraded the entire CVE data set from CVE JSON schema v4 to v5. If you look at the v4 schema, there aren't really any enforced fields for publish dates:

https://github.com/CVEProject/cve-schema/blob/master/schema/v4.0/CVE_JSON_4.0_min_public.schema

So it's likely that, during the migration, any CVE that did not have the field set had it auto-set to the current date. Obviously those dates do not necessarily reflect when the CVEs were actually published.

jayjacobs commented 5 months ago

@mprpic It's undoubtedly from the change to v5.

But I just noticed something else strange. I went back to data I had collected on October 20th, 2023 from the legacy (v4) data feed, which leveraged the CSV and XML data sources, and compared it with data from April 19th, 2024 (v5 JSON): 45,363 published dates have been modified. Of those, 10,105 are now set to "2022-10-03" and 434 to "2018-04-26", and the counts go down from there across 3,201 unique destination published dates. If I keep only those that changed by more than 7 days, I get this plot:

image

Notice at the bottom there are some CVEs where the published date shifted an enormous amount to the left. Take a look at the (clearly incorrect) dates on the following sampled CVEs: in the new data they now show up as published years before the year in the CVE ID.
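For anyone wanting to reproduce this kind of comparison, here is a minimal Python sketch. It assumes each snapshot has already been loaded into a dict mapping CVE ID to a `datetime.date`. The function names and the sample entries are hypothetical and are not taken from either data feed.

```python
from collections import Counter
from datetime import date


def shifted_dates(old, new, min_days=7):
    """Return {cve_id: (old_date, new_date)} for IDs present in both
    snapshots whose published date moved by more than `min_days`."""
    changed = {}
    for cve_id, old_date in old.items():
        new_date = new.get(cve_id)
        if new_date is None:
            continue  # ID is missing from the newer snapshot
        if abs((new_date - old_date).days) > min_days:
            changed[cve_id] = (old_date, new_date)
    return changed


def top_destinations(changed, n=3):
    """Return the `n` most common new dates as (date, count) pairs."""
    return Counter(new_date for _, new_date in changed.values()).most_common(n)


def published_before_id_year(dates):
    """Return sorted IDs whose published year is earlier than the year in the ID."""
    return sorted(
        cve_id
        for cve_id, published in dates.items()
        if published.year < int(cve_id.split("-")[1])
    )


# Hypothetical sample data, for illustration only
old = {
    "CVE-2099-0001": date(2099, 3, 26),
    "CVE-2099-0002": date(2099, 5, 1),
    "CVE-2099-0003": date(2099, 6, 1),
}
new = {
    "CVE-2099-0001": date(2022, 10, 3),
    "CVE-2099-0002": date(2099, 5, 3),
    "CVE-2099-0003": date(2022, 10, 3),
}

changed = shifted_dates(old, new)
print(len(changed), "dates shifted by more than 7 days")
print(top_destinations(changed))
print(published_before_id_year(new))
```

Passing `min_days=0` counts every modified date rather than only the large shifts.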

hkong-mitre commented 5 months ago

As @mprpic noted, this is indeed from the transition from V4 to V5. At the weekly Automation Working Group meeting today, @jayjacobs and others also noted the importance of rectifying the affected CVEs. The good news is that this is being worked on, and we expect to deploy the fixed CVEs in the near future.

jayjacobs commented 5 months ago

I imagine this is related, but there are a handful of published CVEs with no published date at all:

   cve            cveMetadata.datePublished cveMetadata.state
 1 CVE-2018-10631 NA                        PUBLISHED        
 2 CVE-2021-21045 NA                        PUBLISHED        
 3 CVE-2021-21084 NA                        PUBLISHED        
 4 CVE-2021-25741 NA                        PUBLISHED        
 5 CVE-2021-36004 NA                        PUBLISHED        
 6 CVE-2021-36063 NA                        PUBLISHED        
 7 CVE-2021-39862 NA                        PUBLISHED        
 8 CVE-2021-39865 NA                        PUBLISHED        
 9 CVE-2021-40700 NA                        PUBLISHED        
10 CVE-2021-40701 NA                        PUBLISHED        
11 CVE-2021-40702 NA                        PUBLISHED        
12 CVE-2021-40703 NA                        PUBLISHED        
13 CVE-2022-28837 NA                        PUBLISHED        
14 CVE-2022-28838 NA                        PUBLISHED        
15 CVE-2022-28851 NA                        PUBLISHED        
16 CVE-2022-30660 NA                        PUBLISHED        
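
A check like the one above can be sketched in Python roughly as follows. This assumes a local clone of the repository in which each record is stored as a `CVE-*.json` file somewhere under the given directory; the function name is hypothetical, and the `cveMetadata` keys are the ones shown in the table header plus `cveId`.

```python
import json
from pathlib import Path


def published_without_date(root):
    """Return sorted CVE IDs under `root` whose state is PUBLISHED
    but whose cveMetadata.datePublished is missing or empty."""
    missing = []
    # Assumption: every record is a file named CVE-<year>-<number>.json
    for path in Path(root).rglob("CVE-*.json"):
        with open(path, encoding="utf-8") as fh:
            meta = json.load(fh).get("cveMetadata", {})
        if meta.get("state") == "PUBLISHED" and not meta.get("datePublished"):
            # Fall back to the file name if the record carries no cveId
            missing.append(meta.get("cveId", path.stem))
    return sorted(missing)
```

Calling `published_without_date("cves")` from the root of the clone would list the affected IDs, assuming the records live under a `cves` directory.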

MrGallow88 commented 5 months ago

Plotting the published dates in the JSON 5.0 data shows two weird jumps.

image

  • There are 10,151 CVEs with the "datePublished" value on 2022-10-03. They have different times in the data, but all within that day.

  • 9,921 of those also have the same date in the "last modified" field.

  • For example, when I look at CVE-2010-1124 on the old website, it has "date record created" as "20100326", but the new website claims it was both published and updated on 2022-10-03.

  • Random sample of CVEs published on Oct 3, 2022: CVE-2015-8758, CVE-2010-2983, CVE-2008-7279, CVE-2002-2386, CVE-2009-2610, CVE-2018-1999030, CVE-2018-1000198, CVE-2018-20371, CVE-2005-4691, CVE-2003-1567, CVE-2011-4391, CVE-2018-14492, CVE-2013-4487, CVE-2013-0317, CVE-2009-3256, CVE-2004-2712, CVE-2018-8710, CVE-2005-1652, CVE-2018-16978, CVE-2011-4771

  • When I compared an earlier version of the plot (below, using the legacy data), I also noticed a bump of 3,122 CVEs on 2017-05-11. Haven't dug into it, but I imagine it's similar?

image
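The counts in the bullets above could be reproduced with something like the Python sketch below. It assumes the `cveMetadata` object of each record has already been loaded into a list of dicts and that the "last modified" field corresponds to `dateUpdated`; the helper names and the sample records are hypothetical.

```python
from collections import Counter


def day(timestamp):
    """Return the YYYY-MM-DD part of an ISO 8601 timestamp, or None if missing."""
    return timestamp[:10] if timestamp else None


def publish_day_spikes(metas, n=3):
    """Return the `n` most common publication days as (day, count) pairs."""
    days = (day(m.get("datePublished")) for m in metas)
    return Counter(d for d in days if d is not None).most_common(n)


def same_day_updates(metas, target_day):
    """Count records published on `target_day` whose dateUpdated is on the same day."""
    return sum(
        1
        for m in metas
        if day(m.get("datePublished")) == target_day
        and day(m.get("dateUpdated")) == target_day
    )


# Hypothetical records, for illustration only
metas = [
    {"datePublished": "2022-10-03T14:01:00", "dateUpdated": "2022-10-03T16:20:00"},
    {"datePublished": "2022-10-03T09:30:00", "dateUpdated": "2023-01-15T08:00:00"},
    {"datePublished": "2017-05-11T00:00:00", "dateUpdated": "2017-05-11T00:00:00"},
    {"dateUpdated": "2021-02-02T00:00:00"},
]

print(publish_day_spikes(metas, n=1))
print(same_day_updates(metas, "2022-10-03"))
```

Comparing only the first ten characters groups records by day, which matches the observation that the times differ but all fall within the same date.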

Need advice

M-nj commented 2 weeks ago

Thank you all for the in-depth analysis of this issue.

Corrections to 27,373 affected CVE IDs were rolled out between 2024-09-16T16:12:10.2938791Z and 2024-09-17T04:33:32.9203758Z.

Corrected fields include:

The first commit SHA containing changes from this set of corrections is b1e7ed0725666c91745a8366baf542fb57dd7a50, and the last commit SHA containing changes from this set of corrections is c63570fcda98dfab2659eae007da81512473c6c0. Please note that some CVE ID record updates during this window are not from correcting these records, but rather regular new/update activity.
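To see which record files were touched between those two commits, one option is to ask git for the file names in that range. The Python sketch below is only an illustration: it assumes a local clone with full history and that the first commit is an ancestor of the last one, and the function names are hypothetical. As noted above, the result will also include files changed by regular new/update activity.

```python
import subprocess

FIRST = "b1e7ed0725666c91745a8366baf542fb57dd7a50"
LAST = "c63570fcda98dfab2659eae007da81512473c6c0"


def changed_files_command(first, last):
    """Build a git command that lists the files touched by the commits
    from `first` through `last`, inclusive."""
    # "first^..last" starts at the parent of `first`, so `first` itself is included
    return ["git", "log", "--name-only", "--pretty=format:", f"{first}^..{last}"]


def changed_files(repo_dir, first=FIRST, last=LAST):
    """Run the command inside `repo_dir` and return the sorted unique file paths."""
    result = subprocess.run(
        changed_files_command(first, last),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    # The empty pretty format leaves blank separator lines between commits
    return sorted({line for line in result.stdout.splitlines() if line.strip()})
```

Calling `changed_files(".")` from inside the clone would return the touched paths.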

Additional issues such as missing fields (e.g. cveMetadata.datePublished and cveMetadata.dateReserved) may still persist at this time, but a fix will be forthcoming.

Please reply to this message if any additional issues are found relating to the corrections that have already been made.