CVEProject / automation-working-group

CVE Automation Working Group
https://www.cve.org/ProgramOrganization/WorkingGroups#AutomationWorkingGroupAWG

Standardize on a single format of the date/time values with a time zone specified. #135

Open jayjacobs opened 1 week ago

jayjacobs commented 1 week ago

Currently, date/time values appear in 8 different formats in the data, and 66.3% of all specified date/time fields do not specify a timezone. Many of the variations come from older CVEs, but even in the last few years date/time fields have used a variety of formats:

  normalized            2020   2021   2022   2023   2024
1 :ss                  57691  62807  73107  26121   7537
2 :ssZ                     0      0      0     23     28
3 :ss[+-]hh:mm           510    871    933   1288   1022
4 :ss.sssZ             19197  21304  29982  89387  97757
5 :ss.sss[+-]hh:mm         0      0      0      8   1890
6 :ss.ssssss             131    271    431   2895   3848
7 :ss.ssssssZ              0      1    156    336      8
8 :ss.ssssss[+-]hh:mm      0      0      0      0      3
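
For anyone who wants to reproduce a breakdown like the one above, here is a minimal sketch of how a value could be bucketed by its seconds/timezone shape. The function name `normalize_shape` and the regular expression are hypothetical and are not necessarily how the table was produced. The pattern only covers `YYYY-MM-DDThh:mm:ss` style values with an optional fraction and an optional `Z` or `[+-]hh:mm` suffix; anything else is reported as `other`.

```python
import re
from collections import Counter

# Hypothetical pattern: date and time, optional fractional seconds,
# optional zone suffix ("Z" or a numeric offset such as +02:00).
_TS = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"
    r"(?P<frac>\.\d+)?"
    r"(?P<zone>Z|[+-]\d{2}:\d{2})?$"
)


def normalize_shape(value: str) -> str:
    """Return a label such as ':ss', ':ss.sssZ' or ':ss.ssssss[+-]hh:mm'."""
    match = _TS.match(value)
    if match is None:
        return "other"
    shape = ":ss"
    frac = match.group("frac")
    if frac:
        # One "s" per fractional digit, so .123 -> .sss and .123456 -> .ssssss
        shape += "." + "s" * (len(frac) - 1)
    zone = match.group("zone")
    if zone == "Z":
        shape += "Z"
    elif zone:
        shape += "[+-]hh:mm"
    return shape


def count_shapes(values):
    """Tally how many values fall into each shape."""
    return Counter(normalize_shape(v) for v in values)
```

Calling `count_shapes` on all date/time strings extracted from the records for a given year would give one column of a table like the one above.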

The schema currently specifies two different date/time objects, "datestamp" and "timestamp", but "timestamp" appears to be the only one referenced and used. It states, "If timezone offset is not given, GMT (+00:00) is assumed." It's fine not to expect the CNA to always specify a timezone, but the default should be UTC (the "Z" in the current formats represents UTC, not GMT), and the timezone should be set on the CNA's behalf if it is not supplied.

I propose that new date/time values be corrected to always have a timezone, and that older stored date/time values be updated to a single format that always includes a timezone.
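
As a rough illustration of what such a normalization could look like, the sketch below converts any of the shapes in the table to one UTC form. The target format `YYYY-MM-DDThh:mm:ss.sssZ` is an assumption chosen for the example (no target format has been agreed on here), and `to_utc_string` is a hypothetical helper name. Values without a timezone are treated as UTC, matching the default described above, and fractions finer than milliseconds are truncated.

```python
from datetime import datetime, timezone


def to_utc_string(value: str) -> str:
    """Convert an ISO 8601 style timestamp to an assumed single UTC format."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit zero offset first.
    text = value[:-1] + "+00:00" if value.endswith("Z") else value
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        # Assumption: a missing timezone means UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    dt = dt.astimezone(timezone.utc)
    # Assumed target format: millisecond precision with a "Z" suffix.
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"
```

For example, `to_utc_string("2023-05-01T12:30:00+02:00")` gives `2023-05-01T10:30:00.000Z`. Whether stored values should keep their original offset or be converted to UTC like this is a design choice that would need to be decided separately.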

jayjacobs commented 1 week ago

Additionally, the "datePublished" field, which should be set internally, has 3 different formats.

It looks like the difference is that YYYY-MM-DDThh:mm:ss is used when the time is set to 00:00:00 and the format of YYYY-MM-DDThh:mm:ss.sssZ is used when there is a time value present.